Over the weekend, Chinese AI company DeepSeek released an AI chat app including a “reasoning” AI model comparable to OpenAI’s o1, causing a stir among American AI companies as DeepSeek rose to the top of Apple’s App Store.
DeepSeek is a Hangzhou, China-based company providing generative AI models and AI integration. Its first products to make waves in the American market are the GPT-4-like DeepSeek-V3 and R1, an advanced “reasoning model.” Like ChatGPT, DeepSeek-V3 and R1 quickly answer natural-language prompts.
NVIDIA and Microsoft stock fell on Monday after the buzzy debut. Overall, the stock market reflected a sudden dip in confidence in U.S. AI makers. DeepSeek’s success sparked conversation about whether U.S. restrictions on Chinese access to AI chips limited or encouraged competition.
For tech professionals, DeepSeek offers another option for writing code or improving efficiency around day-to-day tasks. Along with DeepSeek’s R1 model being able to explain its reasoning, it is based on an open-source family of models that can be accessed on GitHub.
What is remarkable about DeepSeek?
Like OpenAI’s o1 (formerly known as Strawberry), the reasoning model slows down its prediction capabilities to “reason through” its work, which helps it provide more accurate answers. In particular, reasoning models have scored well on benchmarks for math and coding.
DeepSeek said DeepSeek-V3 scored higher than GPT-4o on the MMLU and HumanEval tests, two of a battery of evaluations comparing the AI responses.
DeepSeek said one of its models cost $5.6 million to train, a fraction of the money often spent on similar projects in Silicon Valley.
DeepSeek-V3 and R1 can be accessed through the App Store or on a browser. Visitors to the DeepSeek site can select the R1 model for slower answers to more complex questions. When selected, the R1 model creates lengthy answers that explain in a conversational style how it arrived at its conclusions.
As of Monday morning, the DeepSeek chat site warned service may be disrupted, though the chatbot was functioning normally.
DeepSeek also offers an APII, which operates through the OpenAI SDK or software compatible with the OpenAI SDK.
SEE: OpenAI announced Operator, an AI agent that can take multi–step actions in a web browser, such as choosing flights.
What does DeepSeek’s V3 and R1 launch mean for the AI industry?
“We can fully expect an ecosystem of applications will be built on R1 as well as several global cloud providers offering its models as a consumable API,” said Gartner Distinguished VP Analyst Arun Chandrasekaran in an email to TechRepublic. “Deepseek’s future success is predicated on its ability to continuously innovate (rather than being a one-off success), build a developer ecosystem on its products and overcome cultural barriers, given its country of origin.”
Chandrasekaran said DeepSeek’s low cost, efficiency, benchmark results, and open weights make it remarkable.
DeepSeek-V3 was trained on 2,048 NVIDIA H800 GPUs. U.S. manufacturers are not, under export rules established by the Biden administration, permitted to sell high-performance AI training chips to companies based in China.
“The potential power and low-cost development of DeepSeek is calling into question the hundreds of billions of dollars committed in the U.S,” said Ivan Feinseth, a market analyst at Tigress Financial, according to a note to clients acquired by ABC News.
DeepSeek further differentiates itself by being an open source, research-driven project, while OpenAI increasingly focuses on commercial efforts.
“Deepseek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen — and as open source, a profound gift to the world.,” Silicon Valley insider and venture capitalist Marc Andreessen posted on X on Friday.
Gartner said the global AI semiconductor industry will reach $114,048 in 2025. Gartner predicted the power required for data centers to run newly-added AI servers will reach 500 terawatt-hours by 2027.
DeepSeek introduces multimodal models
On Monday, DeepSeek followed up its success with another surprise: the Janus-Pro family of multimodal models, which can analyze and generate images.
OpenAI alleges DeepSeek ‘distilled’ existing models
On Jan. 29, Microsoft announced an investigation into whether DeepSeek might have piggybacked on OpenAI’s AI models, as reported by Bloomberg. Microsoft security researchers found large amounts of data passing through the OpenAI API through developer accounts in late 2024. OpenAI said it has “evidence” related to distillation, a technique of training smaller models using larger ones. Distillation violates OpenAI’s terms of service. OpenAI has not detailed the nature of the alleged evidence.
Security concerns raised about DeepSeek’s models
Since DeepSeek’s debut rocked the AI world, several security concerns about its models have swirled in the industry. Some concerns – input data feeding the model, copyright concerns, and possible disinformation or misinformation – apply to generative AI broadly; others caution U.S. users from potentially giving information to or opening a backdoor for a Chinese company.
“The technology sector needs frameworks that ensure all AI systems protect user privacy and intellectual property rights according to international standards, while recognizing the different data access and governance requirements that exist across jurisdictions,” said Cliff Steinhauer, director of information security and engagement at U.S. nonprofit The National Cybersecurity Alliance, in an email to TechRepublic. “The path forward requires balancing innovation with robust data protection and security measures, while acknowledging the varying regulatory landscapes in which AI systems operate.”
Alibaba Cloud debuts new model in the advanced AI race
On Jan. 28, Alibaba Cloud revealed Qwen2.5-Max, a generative AI model that outperforms DeepSeek’s R1 on some key benchmark tests. Like its rivals, Qwen is available in a browser called Qwen Chat and is OpenAI-API compatible. Alibaba Cloud is based in Singapore.