DeepSeek has taken the Internet by storm.
This week, DeepSeek, a Chinese AI research lab, captured public attention as its chatbot application climbed to the top of the charts in both the Apple App Store and Google Play Store. The unusually compute-efficient training methods behind DeepSeek's AI models have sparked concern among Wall Street analysts and tech experts about the U.S.'s ability to retain its dominance in the AI sector, along with questions about the sustainability of AI chip demand.
But what is the story behind DeepSeek, and how has it achieved global recognition in such a short span?
The Trading Roots of DeepSeek
DeepSeek is supported by High-Flyer Capital Management, a quantitative hedge fund from China that leverages AI to make trading decisions.
Liang Wenfeng, an AI enthusiast, began exploring trading as a student at Zhejiang University and co-founded High-Flyer in 2015. In 2019, he officially established High-Flyer Capital Management as a hedge fund focused on developing and applying AI algorithms.
In 2023, High-Flyer set up DeepSeek as a dedicated lab for AI tool research, distinct from its trading operations. With High-Flyer as a primary investor, the lab evolved into its own entity, also named DeepSeek.
From the outset, DeepSeek established its own data center clusters for model training. However, like many AI firms in China, it has faced challenges due to U.S. hardware export restrictions. For one of its latest models, DeepSeek had to utilize Nvidia H800 chips, which are a less powerful alternative to the H100 chips available to U.S. companies.
DeepSeek’s technical workforce is predominantly young, and the company actively recruits PhD-level AI researchers from prestigious Chinese universities. It also employs people without formal computer science training to broaden its technological perspective across other fields, as reported by The New York Times.
DeepSeek’s Robust Models
In November 2023, DeepSeek launched its initial models — DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat. However, it was the release of the next-gen DeepSeek-V2 models last spring that truly caught the industry’s attention.
The DeepSeek-V2 series, capable of analyzing both text and images, excelled in numerous AI benchmarks while being considerably cheaper to operate compared to existing models. This prompted DeepSeek’s rivals in China, such as ByteDance and Alibaba, to reduce their pricing for certain models and offer others free of charge.
The introduction of DeepSeek-V3 in December 2024 further heightened the company’s profile.
According to internal benchmarks, DeepSeek-V3 is said to outperform both publicly available models, such as Meta’s Llama, and so-called “closed” models accessible solely via API, such as OpenAI’s GPT-4.
Notably, DeepSeek’s R1 model, introduced in January, has been claimed to match OpenAI’s o1 model on significant benchmarks.
As a reasoning model, R1 effectively self-verifies its answers, enabling it to sidestep common errors that trip up other models. While reasoning models generally take longer to respond — typically seconds to minutes more than non-reasoning models — they tend to be more reliable in domains such as physics, mathematics, and the sciences.
Nonetheless, DeepSeek’s models, including R1 and DeepSeek-V3, have limitations. As they originate from China, they must comply with regulatory requirements set by China’s internet authorities to ensure their responses align with “core socialist values.” For instance, R1 is programmed to refuse questions about Tiananmen Square or Taiwan’s political status.
A Game-Changing Strategy
DeepSeek’s exact business model remains somewhat of a mystery. It prices its offerings significantly lower than market standards, with some even available for free.
DeepSeek attributes its competitive pricing to breakthroughs in efficiency, although some analysts question the accuracy of the company’s reported metrics.
Regardless, developers have embraced DeepSeek’s models, which, while not open-source in the conventional sense, are provided under flexible licenses that permit commercial use. Clem Delangue, CEO of Hugging Face — a platform that hosts DeepSeek’s models — noted that developers have already created over 500 “derivative” models of R1, accumulating a total of 2.5 million downloads.
DeepSeek’s ability to hold its own against larger, established competitors has been characterized by some as “disrupting the AI landscape” and dismissed by others as “over-hyped.” The company’s rise contributed significantly to an 18% drop in Nvidia’s stock price on Monday and drew a public acknowledgment from OpenAI’s CEO, Sam Altman.
Additionally, Microsoft has integrated DeepSeek into its Azure AI Foundry service, which unifies various AI offerings for enterprise use. During an earnings call to discuss Q1 results, Meta’s CEO Mark Zuckerberg remarked on how AI infrastructure spending would continue to provide a “strategic advantage” for the company.
As for what lies ahead for DeepSeek, it remains uncertain. Improved models are expected, but there are signs that the U.S. government is growing cautious about perceived foreign threats.
This article was first published on January 28 and will be continuously updated with further information.