DeepSeek: A Comprehensive Guide to the AI Chatbot Application

by admin

DeepSeek has captured the spotlight.

This week, the Chinese AI lab DeepSeek made headlines as its chatbot app climbed to the top of both the Apple App Store and Google Play Store charts. The company’s AI models, trained using compute-efficient techniques, have led Wall Street analysts and technologists to question whether the U.S. can sustain its dominance in the AI sector and whether demand for AI hardware will hold.

But what are the origins of DeepSeek, and how did it achieve such rapid global recognition?

DeepSeek’s Trading Origins

DeepSeek is supported by High-Flyer Capital Management, a Chinese quantitative hedge fund leveraging AI for its trading strategies.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who began trading as a student at Zhejiang University, formally established High-Flyer Capital Management as a hedge fund in 2019, focusing on developing and applying AI algorithms.

In 2023, High-Flyer launched DeepSeek as a research lab dedicated to exploring AI tools separate from its financial business. With High-Flyer as a primary investor, the lab was spun off into its own company, also named DeepSeek.

From its inception, DeepSeek built its own data center clusters for training models. However, like many AI companies in China, it has been constrained by U.S. export restrictions on hardware. To train one of its more recent models, the company had to use Nvidia H800 chips, which are less powerful than the H100 chips available to U.S. firms.

DeepSeek’s technical team is notably young, and the company actively recruits Ph.D. AI researchers from leading Chinese universities. It also hires people without a formal computer science background to broaden the range of subjects its technology understands, according to The New York Times.

DeepSeek’s Powerful Models

DeepSeek introduced its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. However, it was the launch of the next-generation DeepSeek-V2 models in spring 2024 that truly captured the AI industry’s attention.

DeepSeek-V2, a versatile system for text and image analysis, excelled in various AI benchmarks while being significantly more cost-effective than similar models at the time. This prompted DeepSeek’s competitors, including ByteDance and Alibaba, to reduce the pricing for some of their models and even offer others at no cost.

The debut of DeepSeek-V3 in December 2024 further solidified the company’s growing reputation.

According to DeepSeek’s internal benchmarks, DeepSeek-V3 outperforms both freely downloadable models like Meta’s Llama and so-called “closed” models available only through an API, such as OpenAI’s GPT-4o.

Moreover, the DeepSeek R1 “reasoning” model, launched in January 2025, reportedly matches OpenAI’s o1 model on key benchmarks.

As a reasoning model, R1 effectively fact-checks its own answers, which helps it avoid some of the pitfalls that trip up other models. Reasoning models typically take seconds to minutes longer to reach an answer than non-reasoning models, but they are generally more reliable in fields such as physics, other sciences, and mathematics.

However, R1, DeepSeek-V3, and DeepSeek’s other models have limitations. As Chinese-developed AI, they are subject to benchmarking by China’s internet regulator to ensure that their responses align with “core socialist values.” In DeepSeek’s chatbot app, for instance, R1 will not answer questions about Tiananmen Square or Taiwan’s independence.

An Innovative Strategy

DeepSeek’s business model remains somewhat ambiguous. The company offers its products and services at prices significantly lower than market rates and even provides some for free.

According to DeepSeek, innovations in efficiency have allowed it to achieve unparalleled cost competitiveness. However, some experts challenge the accuracy of the company’s reported figures.

Regardless of the specifics, developers have embraced DeepSeek’s models, which, while not open-source in the traditional sense, are provided under permissive licenses that facilitate commercial use. Clem Delangue, CEO of Hugging Face, a platform hosting DeepSeek’s models, states that more than 500 “derivative” models of R1 have been created by developers on Hugging Face, collectively amassing 2.5 million downloads.

DeepSeek’s unexpected success against larger, more established competitors has been variously described as “disrupting AI” and “over-hyped.” That success contributed to an 18% drop in Nvidia’s stock price on Monday, January 27, and prompted a public response from OpenAI CEO Sam Altman.

Microsoft has announced that DeepSeek will be available on its Azure AI Foundry service, a hub for enterprise-level AI solutions. Asked during Meta’s first-quarter earnings call about DeepSeek’s influence on the company’s AI investments, CEO Mark Zuckerberg said that spending on AI infrastructure will continue to be a “strategic advantage” for Meta.

At the same time, several companies and even entire countries have banned DeepSeek. New York state has also prohibited the use of DeepSeek on government devices.

The future of DeepSeek remains uncertain. Improved models are expected, but the U.S. government appears to be growing wary of what it perceives as harmful foreign influence.

This article was initially published on January 28, 2025, and will be updated regularly with additional insights.

Compiled by Techarena.au.