How DeepSeek Cracked the Code on Cheap AI

A Chinese startup called DeepSeek has built one of the world’s most powerful artificial intelligence systems using far fewer computer chips than experts thought possible. By combining a technique called “mixture of experts” with lower-precision arithmetic, DeepSeek cut its computing costs by about 90%. The achievement challenges the conventional wisdom that building advanced AI requires massive amounts of raw computing power.

Today’s AI systems are built on neural networks, which learn by finding patterns in vast amounts of data. That approach is expensive because the analysis requires specialized computer chips and large amounts of electricity. To cut those costs, DeepSeek used a multi-step strategy.

First, the company divided its system into smaller “expert” neural networks, each focused on a specific domain such as poetry or physics. It paired these experts with a generalist system, which let the experts exchange information while keeping the overall system efficient.
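The idea can be made concrete with a toy sketch. This is not DeepSeek’s code: the class name ToyMoELayer, the random weights, the softmax router and the top_k setting are all assumptions chosen for illustration. It shows only the general pattern, in which a router picks a few experts per input and a generalist runs every time.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class ToyMoELayer:
    """Hypothetical layer: several small experts plus one always-on generalist."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        self.router = random_matrix(num_experts, dim, rng)
        self.experts = [random_matrix(dim, dim, rng) for _ in range(num_experts)]
        self.generalist = random_matrix(dim, dim, rng)

    def forward(self, x):
        # The router scores every expert, but only the top_k best are run.
        probs = softmax(linear(self.router, x))
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        chosen = ranked[: self.top_k]
        norm = sum(probs[i] for i in chosen)

        # The generalist processes every input, whichever experts are chosen.
        out = linear(self.generalist, x)
        for i in chosen:
            weight = probs[i] / norm
            expert_out = linear(self.experts[i], x)
            out = [o + weight * e for o, e in zip(out, expert_out)]
        return out, chosen


layer = ToyMoELayer(dim=4, num_experts=8, top_k=2)
output, chosen = layer.forward([0.5, -1.0, 0.25, 2.0])
print("experts used:", chosen)
```

With these made-up settings, only two of the eight experts do any work for a given input. That is where the savings come from: most of the system stays idle on each calculation.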

DeepSeek also applied a mathematical trick involving decimals, much as pi is rounded to 3.14 in elementary school math. The company squeezed each number in its calculations into 8 bits of memory, which reduced the computing power required, and then stretched the results of those calculations across 32 bits to preserve precision.
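The trade-off can be sketched in a few lines of code. The article does not say which 8-bit number format DeepSeek used, so this example assumes signed 8-bit integers with one shared scale factor per list, purely for illustration. The function names quantize and dot_low_precision are hypothetical. Python integers have no fixed width, so the 32-bit limit on the running total is checked with an assertion rather than enforced by hardware.

```python
def quantize(values):
    """Round floats to signed 8-bit integers (-127..127) with one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dot_low_precision(a, b):
    """Multiply 8-bit values, but keep the running total in a 32-bit range."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    total = 0
    for x, y in zip(qa, qb):
        total += x * y  # a single product is at most 127 * 127 = 16129
        assert -2**31 <= total < 2**31, "running total must fit in 32 bits"
    # Undo the scaling to get back to the original units.
    return total * scale_a * scale_b


a = [0.12, -1.5, 3.25, 0.7]
b = [2.0, 0.33, -0.9, 1.1]
exact = sum(x * y for x, y in zip(a, b))
approx = dot_low_precision(a, b)
print("exact:", exact, "low precision:", approx)
```

Each input loses a little accuracy when it is rounded, just as 3.14 is not quite pi. Keeping the running total in a wider space prevents those small errors from being compounded by further rounding as the products are added up.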

This approach allowed DeepSeek to train its AI system with only $6 million in raw computing power, a fraction of what Meta spent on similar technology. The startup is now sharing its methods with other AI researchers, which could significantly reduce the cost of building advanced AI systems.

Source: https://www.nytimes.com/2025/02/12/technology/deepseek-ai-chip-costs.html