A peer-reviewed paper published in Nature describes how DeepSeek’s AI model R1 was trained and, contrary to initial reports, states that the model did not rely on training data from its rivals. The Chinese firm makes this claim in documents released alongside the peer-reviewed version of the R1 paper. The model itself is available for download.
R1 excels at “reasoning” tasks such as mathematics and coding, which its developers attribute to a training method based on pure reinforcement learning. This approach rewards the model for reaching correct answers, allowing it to learn its own reasoning-like strategies rather than imitating human-selected examples.
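To make the idea of rewarding correct answers concrete, here is a minimal sketch of an outcome-based reward in Python. It is an illustration only, not DeepSeek’s implementation: the function names, the `Answer:` marker convention, and the sample completions are all assumptions made for this example.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent.

    The 'Answer:' convention is hypothetical, chosen only for this sketch.
    """
    matches = re.findall(r"Answer:\s*(.+)", completion)
    return matches[-1].strip() if matches else None


def correctness_reward(completion: str, reference: str) -> float:
    """Score only the outcome: 1.0 for a matching final answer, else 0.0.

    The reasoning text before the answer is never graded, so the model is
    free to develop whatever intermediate steps lead to correct answers.
    """
    answer = extract_final_answer(completion)
    if answer is not None and answer == reference.strip():
        return 1.0
    return 0.0


# Hypothetical model outputs for the prompt "What is 17 + 25?"
samples = [
    "First add 17 and 25 to get 42. Answer: 42",
    "17 plus 25 is roughly 40. Answer: 40",
    "I am not sure how to proceed.",
]
rewards = [correctness_reward(s, "42") for s in samples]
```

In a reinforcement learning loop, scores like these would be used to make high-reward outputs more likely. The update step itself is omitted here.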
The model’s developers say this technique is cheaper than the methods used by rival firms, reporting a training cost of $294,000 for R1, compared with the tens of millions of dollars thought to have been spent on other models. R1 has been downloaded 10.9 million times on the Hugging Face platform, making it one of the most popular open-weight AI models.
Peer review has helped to verify the validity and usefulness of R1, and experts have praised its efficiency and its competitiveness in scientific tasks. The model’s influence is being felt across the AI community, with researchers now applying its methods to improve existing large language models (LLMs) and extend them to new domains.
Source: https://www.scientificamerican.com/article/secrets-of-chinese-ai-model-deepseek-revealed-in-landmark-paper