Tuesday, 02 January 2024 12:17 GMT

DeepSeek reveals novel AI technique claiming superior reasoning for language models


(MENAFN) Chinese AI start-up DeepSeek has unveiled an innovative approach aimed at enhancing the reasoning capabilities of large language models (LLMs), reportedly exceeding existing methods.

In collaboration with researchers from Tsinghua University, DeepSeek has created a dual technique that integrates generative reward modeling (GRM) with self-principled critique tuning, as reported by local media on Sunday.

This new approach is intended to enable LLMs to deliver more precise and quicker responses to general inquiries, according to a paper released on Friday.

The researchers noted that the DeepSeek-GRM models outperformed current techniques, reaching "competitive performance" with established public reward models. Reward modeling is a method used to align the behavior of LLMs with human preferences.
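To illustrate the idea of reward modeling in general terms, the sketch below shows a common pairwise-preference formulation (a Bradley-Terry style loss). This is a minimal illustrative example, not DeepSeek's GRM method; the feature names and weights are hypothetical.

```python
import math

def reward(features, weights):
    """Scalar reward: a linear score over hand-crafted response features.

    Real reward models use a neural network over the full response text;
    a linear score is used here only to keep the sketch self-contained.
    """
    return sum(f * w for f, w in zip(features, weights))

def preference_loss(chosen, rejected, weights):
    """Bradley-Terry loss: -log sigmoid(r_chosen - r_rejected).

    Minimizing this pushes the reward of the human-preferred response
    above the reward of the rejected one.
    """
    margin = reward(chosen, weights) - reward(rejected, weights)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical features (e.g. helpfulness, factuality) for two responses.
weights = [0.8, 0.5]
chosen = [0.9, 0.7]    # features of the human-preferred response
rejected = [0.2, 0.1]  # features of the rejected response

loss = preference_loss(chosen, rejected, weights)
```

Because the preferred response scores higher than the rejected one here, the loss is small; training a real reward model adjusts the model's parameters to make such margins large across a dataset of human preference pairs.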

DeepSeek intends to release its GRM models as open source, although a specific timeline for this initiative has not been disclosed.

The paper, which was published on the online scientific repository arXiv, has sparked increased interest in the firm's future projects, particularly following the global attention garnered by its V3 foundation model and R1 reasoning model.



Legal Disclaimer:
MENAFN provides the information “as is” without warranty of any kind. We do not accept any responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you have any complaints or copyright issues related to this article, kindly contact the provider above.
