DeepSeek Reveals Technique to Enhance Large Language Models' Reasoning
(MENAFN) Chinese AI start-up DeepSeek has unveiled a new technique for improving the reasoning capabilities of large language models (LLMs) that reportedly outperforms existing methods.
DeepSeek, in collaboration with researchers from Tsinghua University, developed a dual-method approach that merges generative reward modeling (GRM) with self-principled critique tuning, as reported by a Chinese news agency on Sunday.
The combination is designed to enable LLMs to give faster and more accurate responses to general queries, according to a paper published on Friday.
According to the researchers, the new DeepSeek-GRM models outperformed existing methods and achieved “competitive performance” against robust public reward models. Reward modeling is a technique used to align an LLM’s behavior with human preferences.
DeepSeek plans to release its GRM models as open source, although it has not disclosed a release date.
The research paper, published on the arXiv online scientific repository, has heightened interest in the company's future innovations, particularly after its V3 foundation model and R1 reasoning model drew global attention.
