Chinese AI developer DeepSeek has revealed that it spent just US$294,000 training its R1 reasoning-focused model, a figure far below estimates for comparable US rivals, according to a paper published in the academic journal Nature on Wednesday (September 17). Analysts note the disclosure could reignite debate over China’s standing in the global race to develop artificial intelligence.
The Hangzhou-based company said the R1 model was trained on a cluster of 512 Nvidia H800 chips over 80 hours, following preparatory work using A100 chips. This marks the first time DeepSeek has provided a cost estimate for R1, a model whose lower-cost approach previously spurred global investor concern that it could challenge established AI leaders such as Nvidia.
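As a back-of-the-envelope check on the scale involved, the run described above can be converted to GPU-hours. The sketch below uses only the figures quoted in this report. The article does not break down what the US$294,000 covers, so the per-hour rate is simply what falls out if the whole sum were attributed to this one run. That is an assumption made here for illustration, not something DeepSeek has stated.

```python
# Figures as quoted in the report.
REPORTED_COST_USD = 294_000
NUM_CHIPS = 512   # Nvidia H800 chips in the cluster
RUN_HOURS = 80    # stated duration of the R1 training run

# Total accelerator time consumed by the run.
gpu_hours = NUM_CHIPS * RUN_HOURS

# ASSUMPTION: the entire reported cost is attributed to this single run.
# The report does not confirm this, so treat the rate as illustrative only.
naive_rate_usd = REPORTED_COST_USD / gpu_hours

print(f"{gpu_hours:,} GPU-hours")
print(f"US${naive_rate_usd:.2f} per GPU-hour under the stated assumption")
```

The run works out to 40,960 GPU-hours. The derived hourly rate should not be read as a rental price.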
DeepSeek’s founder Liang Wenfeng co-authored the Nature article, which clarified earlier ambiguities surrounding the technology. US officials have previously confirmed that DeepSeek lawfully obtained H800 chips for development, and the company acknowledged in supplementary material that it also used A100 GPUs in early stages. Training costs for large language models typically cover the running of powerful chip clusters over weeks or months to process vast amounts of data.
The company has also defended its use of “model distillation,” a technique in which one AI system learns from another, enabling lower-cost training while maintaining performance. DeepSeek said its V3 model incorporated publicly crawled web pages, including answers generated by OpenAI models, which may have indirectly influenced the base model’s knowledge. DeepSeek stressed that this was incidental rather than intentional.
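For readers unfamiliar with the term, the idea of one model learning from another can be illustrated with the classic soft-label form of distillation, in which a "student" model is penalised for diverging from a "teacher" model's output probabilities. The following is a minimal sketch of that general technique only. The function names, the temperature value and the example scores are hypothetical, and nothing here is claimed to reflect DeepSeek's actual training pipeline.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)  # teacher probabilities
    q = softmax(student_logits, temperature)  # student probabilities
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores over three candidate answers.
teacher = [4.0, 1.0, 0.5]
matching_student = [4.0, 1.0, 0.5]
diverging_student = [0.5, 1.0, 4.0]

loss_match = distillation_loss(teacher, matching_student)
loss_diverge = distillation_loss(teacher, diverging_student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. Training the student to minimise it is what lets a smaller or cheaper model inherit behaviour from a larger one.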
The disclosure comes after DeepSeek largely retreated from public attention following a January announcement of its lower-cost AI systems, which had rattled technology investors worldwide. Since then, the company has released only a few product updates.
OpenAI did not immediately comment. DeepSeek’s reported training cost is minuscule compared with the sums cited by US AI executives: OpenAI CEO Sam Altman said in 2023 that developing foundational models cost “much more” than US$100 million. The figure also feeds the broader debate over China’s potential to produce competitive AI at lower cost, using both domestic and lawfully acquired foreign hardware.
DeepSeek’s publication offers rare insight into a company that has attracted top Chinese AI talent partly because of its access to A100 supercomputing clusters, a capability few domestic firms possess. Observers note that the firm’s ability to deliver sophisticated models at significantly lower training costs could influence investment and policy decisions in the AI sector globally.
Reuters




