A small Chinese AI start-up, DeepSeek, has made headlines by unveiling its R1 model and publishing details of how it built, on a limited budget, a large language model that can improve its reasoning autonomously. Founded by hedge fund manager Liang Wenfeng, DeepSeek has ignited discussions in Silicon Valley about the competitive edge of US AI firms such as OpenAI and Google DeepMind, which have kept their methodologies secret.
Liang has gained national recognition for his achievements, recently attending a high-profile meeting with China's second-most powerful leader, Li Qiang, at which entrepreneurs were urged to advance core technologies. Liang's journey began in 2021, when he started acquiring Nvidia GPUs for an AI project while running his trading fund, High-Flyer. Initially dismissed as eccentric, his vision has since materialized into a significant AI venture.
DeepSeek's strength lies in its innovative approach to maximizing the potential of limited computing resources, especially after the US imposed restrictions on Nvidia chip exports to China. The company focuses on research rather than commercial gains, operating similarly to early DeepMind, and has not sought external funding. Liang compensates his team with top salaries, attracting talent from prestigious Chinese universities.
DeepSeek claims to have trained its 671-billion-parameter model using only 2,048 Nvidia H800 GPUs at a cost of $5.6 million, a fraction of what US companies have spent on comparable systems. Experts note that while DeepSeek has demonstrated impressive capabilities, its future competitiveness remains uncertain as the AI landscape evolves. US rivals are investing heavily in advanced computing infrastructure, potentially widening the performance gap.
Despite its current success, DeepSeek may struggle to sustain its edge as industry dynamics shift, with competitors such as OpenAI and Elon Musk's xAI expanding their resources significantly.