OpenAI Tackles “Hallucinations” in Advanced Models
(MENAFN) The company behind ChatGPT has addressed the ongoing challenge of artificial intelligence (AI) models producing believable but false statements, which it refers to as “hallucinations.”
In a statement released on Friday, OpenAI explained that AI models are usually incentivized to make a guess, no matter how unlikely, instead of admitting when they cannot provide an answer.
The problem stems from the fundamental principles underlying “standard training and evaluation procedures,” the statement added.
OpenAI acknowledged that situations in which language models “confidently generate an answer that isn’t true” continue to affect newer, more sophisticated versions, including its latest flagship system, GPT-5.
A recent study highlighted that the issue originates from the conventional methods used to assess the performance of language models, which favor models that guess over those that cautiously indicate uncertainty.
Under these standard protocols, AI systems learn that failing to produce an answer results in zero points on a test, whereas an unverified guess could occasionally be rewarded.
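To make that incentive concrete, here is a minimal illustrative sketch (not OpenAI’s evaluation code, and the probabilities are assumed purely for illustration): under a binary-scored benchmark where a skipped question earns zero, a model that always guesses ends up with a higher expected score than one that admits uncertainty, even when most of its guesses are wrong.

```python
# Illustrative sketch only: expected benchmark score of "always guess"
# vs. "abstain when unsure" under binary scoring (skip = 0 points,
# guess = 1 point only if it happens to be correct).

p_correct_when_unsure = 0.2    # assumed chance a blind guess is right
share_unsure = 0.3             # assumed share of questions the model is unsure about
share_sure = 1 - share_unsure  # questions it answers correctly with confidence

# Policy A: always produce an answer, even when unsure.
score_always_guess = share_sure * 1.0 + share_unsure * p_correct_when_unsure

# Policy B: say "I don't know" when unsure -> 0 points on those items.
score_abstain = share_sure * 1.0 + share_unsure * 0.0

print(f"always guess: {score_always_guess:.2f}")  # 0.76
print(f"abstain:      {score_abstain:.2f}")       # 0.70
```

In this toy setup the guessing policy comes out ahead whenever a blind guess has any chance of being right, which is the distortion the company’s statement says revised “scoreboards” should correct.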
“Fixing scoreboards can broaden adoption of hallucination-reduction techniques,” the statement concluded, while also noting that “accuracy will never reach 100% because, regardless of model size, search and reasoning capabilities, some real-world questions are inherently unanswerable.”
