DeepSeek-R1 Training Method Released

By a science and technology reporter.

Liang Wenfeng and his colleagues on the DeepSeek-AI team published the large-scale reasoning-model training method behind the open-source artificial intelligence (AI) model DeepSeek-R1 in the journal Nature on the 17th. The research shows that the reasoning capabilities of a large language model (LLM) can be improved through reinforcement learning, reducing the amount of human input needed to raise performance. The resulting model outperforms conventionally trained LLMs on tasks such as mathematics, programming competitions, and graduate-level problems in STEM fields.

DeepSeek-R1 includes a further training stage under human supervision to optimize its reasoning process. Liang Wenfeng's team reported that the model uses reinforcement learning rather than human-written examples to develop its reasoning steps, which reduces training cost and complexity. After being shown high-quality problem-solving examples, DeepSeek-R1 is given a template for producing its reasoning process; the model is then rewarded for solving problems, which reinforces what it has learned. The team concluded that future research could focus on optimizing the reward process to ensure more reliable reasoning and task results.
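To make the idea of being "rewarded for solving problems" concrete, here is a minimal sketch of an outcome-based reward in Python. It is an illustration under stated assumptions, not the team's implementation: the `<think>`/`<answer>` tag template, the function name `outcome_reward`, the exact-match check, and the 0.1 format bonus are all hypothetical choices made for this example.

```python
import re

# Assumed template: reasoning inside <think>, final answer inside <answer>.
TEMPLATE = re.compile(
    r"^<think>(.+?)</think>\s*<answer>(.+?)</answer>$", re.DOTALL
)


def outcome_reward(completion: str, reference: str) -> float:
    """Score one model completion against a known reference answer.

    Hypothetical weights: 0.1 for following the template,
    plus 1.0 if the final answer matches the reference exactly.
    """
    match = TEMPLATE.match(completion.strip())
    if match is None:
        return 0.0  # template not followed, so no reward at all
    answer = match.group(2).strip()
    format_bonus = 0.1  # assumed small bonus for a well-formed response
    accuracy = 1.0 if answer == reference.strip() else 0.0
    return format_bonus + accuracy


if __name__ == "__main__":
    good = "<think>2 + 2 = 4</think><answer>4</answer>"
    print(outcome_reward(good, "4"))
```

Because the score depends only on whether the final answer can be checked, no human has to write out or grade the intermediate reasoning steps, which is the cost saving the article describes.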

On a mathematics benchmark used to evaluate AI performance, DeepSeek-R1-Zero and DeepSeek-R1 scored 77.9% and 79.8%, respectively. Both also performed well in programming competitions and on graduate-level biology, physics, and chemistry problems.

Source: https://world.huanqiu.com/article/4ON8nkngj9T

17WorldNews [2025.09.18-12:06]