DeepSeek: A Comprehensive Guide to the AI Chatbot Application

by admin

DeepSeek has surged to popularity.

This week, the Chinese AI company DeepSeek captured mainstream attention when its chatbot application climbed to the top of the Apple App Store and Google Play Store charts. The AI models developed by DeepSeek, created through compute-efficient methods, have sparked discussions among Wall Street analysts and technologists about whether the United States can retain its competitive edge in the AI sector and whether the demand for AI chips will persist.

But what is the origin of DeepSeek, and how did it achieve such rapid international fame?

The Trading Origins of DeepSeek

DeepSeek is funded by High-Flyer Capital Management, a Chinese hedge fund that employs AI to guide its trading strategies.

Liang Wenfeng, an AI enthusiast, co-founded High-Flyer in 2015. Liang, who had ventured into trading while still a student at Zhejiang University, established High-Flyer Capital Management in 2019 as a hedge fund focused on developing and deploying AI algorithms.

In 2023, High-Flyer launched DeepSeek as a dedicated AI research lab, separating it from its financial operations. With High-Flyer among its investors, the lab evolved into its own entity, also named DeepSeek.

From the outset, DeepSeek invested in building its own data center clusters for training models. However, like many AI firms in China, DeepSeek has been affected by U.S. export restrictions on hardware. To train one of its latest models, the company had to use Nvidia H800 chips, a less powerful variant of the H100 chip available to U.S. firms.

DeepSeek’s technical team is reportedly quite young. The company actively recruits AI researchers with doctoral degrees from leading Chinese universities. It also hires people without computer science backgrounds to help its technology understand a wider range of subjects, as reported by The New York Times.

DeepSeek’s Robust Models

DeepSeek introduced its initial models — DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat — in November 2023. However, it was the subsequent launch of the next-generation DeepSeek-V2 models in the following spring that garnered significant attention from the AI sector.

DeepSeek-V2, designed for analyzing text and images, performed well on several AI benchmarks and proved considerably cheaper to run than comparable models at the time. This forced DeepSeek’s domestic competitors, such as ByteDance and Alibaba, to cut prices for some of their services and make others entirely free.

The December 2024 introduction of DeepSeek-V3 further enhanced DeepSeek’s reputation.

According to internal benchmarks conducted by DeepSeek, the V3 model outshines both downloadable, publicly available models like Meta’s Llama and proprietary models requiring API access, such as OpenAI’s GPT-4o.

DeepSeek’s R1 reasoning model, launched in January, is also noteworthy. The company asserts that R1 performs on par with OpenAI’s o1 model on key benchmarks.

As a reasoning model, R1 effectively fact-checks its outputs, helping it sidestep typical errors that often afflict conventional models. These reasoning models tend to take a little longer to produce results — often seconds to minutes more — compared to standard models, but they generally demonstrate heightened reliability in complex fields like physics, science, and mathematics.

However, R1, DeepSeek V3, and DeepSeek’s other models have drawbacks. Because they are developed in China, they are subject to monitoring by China’s internet regulator to ensure compliance with “core socialist values.” For instance, in the chatbot app, R1 declines to answer questions about Tiananmen Square or Taiwan’s independence.

A Disruptive Strategy

If DeepSeek has a business model, its specifics remain unclear. The company prices its products and services well below market rates and gives some away for free.

DeepSeek attributes its aggressive pricing to breakthroughs in efficiency, though some experts challenge the accuracy of the data the company has released.

Regardless, developers are embracing DeepSeek’s models, which, while not open-source in the traditional sense, are available under permissive licenses that allow commercial use. Clem Delangue, CEO of Hugging Face, noted that developers on the Hugging Face platform have produced over 500 “derivative” models of R1, which have collectively been downloaded 2.5 million times.

DeepSeek’s ability to compete with larger and more established players in the market has been described as everything from transformative to “over-hyped.” The company’s success contributed to an 18% decline in Nvidia’s stock price on Monday and prompted a public comment from OpenAI CEO Sam Altman.

Microsoft has announced that DeepSeek’s technology will be accessible via its Azure AI Foundry service, which consolidates AI services for enterprises. At a recent earnings call, when asked about the impact of DeepSeek on Meta’s AI investments, CEO Mark Zuckerberg emphasized that AI infrastructure spending will remain a “strategic advantage” for Meta.

At the same time, some companies are prohibiting the use of DeepSeek, and so are entire countries, including South Korea. New York state has also banned the use of DeepSeek on government devices.

DeepSeek’s future remains uncertain. Improved models are expected, but the U.S. government appears increasingly wary of what it perceives as harmful foreign influence.

This report was initially published on January 28, 2025, and will be updated with further developments.

Compiled by Techarena.au.