How DeepSeek’s AI Training in Chinese May Have Given It an Edge

The artificial intelligence (AI) model DeepSeek has made headlines for its capabilities, and some experts say a key to its success may lie in training data drawn from Chinese language and culture. Their hypothesis is that this training set improved DeepSeek’s logical reasoning, allowing it to handle complex concepts more easily.

According to Xiang Ligang, a telecommunications industry analyst, Chinese characters convey meaning very efficiently, which he argues makes them well suited to AI models. “Chinese characters achieve maximum information transmission with minimal cost,” he said. “This has greatly improved efficiency and reduced costs in the processing of artificial intelligence.”
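One rough way to see what "cost" could mean here is to compare how many characters and UTF-8 bytes the same phrase takes in each language. The sketch below is only an illustration: the helper name `text_cost` and the sample phrases are hypothetical choices, not anything taken from DeepSeek or from Xiang's remarks. Character and byte counts are also not the tokens a model actually processes, which depend on the tokenizer, so this does not confirm or refute the efficiency claim.

```python
def text_cost(text: str) -> dict:
    """Return simple size measures for a string (hypothetical helper).

    These are proxies only: a model's real cost depends on its tokenizer,
    which is not described in the article.
    """
    return {
        "chars": len(text),                      # Unicode code points
        "utf8_bytes": len(text.encode("utf-8")),  # storage size in UTF-8
    }


# Illustrative sample phrases with the same meaning (an assumption for this sketch).
english = "Artificial intelligence"
chinese = "人工智能"

for label, phrase in (("English", english), ("Chinese", chinese)):
    cost = text_cost(phrase)
    print(f"{label}: {cost['chars']} characters, {cost['utf8_bytes']} UTF-8 bytes")
```

In this example the Chinese phrase uses far fewer characters, but each character takes three bytes in UTF-8, so the gap narrows when measured in bytes. Which measure matters in practice depends on how a given model tokenizes text.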

Others suggest that DeepSeek’s training data may also include multimedia content, such as traditional poetry paired with paintings or music. Such multimodal learning material would, they argue, have given DeepSeek rich and diverse knowledge.

While the specifics of DeepSeek’s training data remain undisclosed, experts agree that high-quality data is crucial for AI models. Yale University assistant professor Yang Zhuoran emphasized the importance of data quality in a report from DeepTech. “Data quality impacts not only a model’s ability to acquire and express knowledge but also its style and accuracy,” he said.

DeepSeek’s training on Chinese language and culture may be one reason for its strong performance. If so, it would be a notable example of AI entering the “era of Chinese.”

Source: https://www.scmp.com/news/china/science/article/3298555/strokes-genius-why-deepseeks-ai-edge-may-come-its-chinese-lessons