Tuesday, 02 January 2024 12:17 GMT

Human-Tutored AI Learning Physical Sense


(MENAFN- The Arabian Post)

Nvidia has acknowledged that its AI models do not instinctively grasp common sense, such as knowing that birds cannot fly backwards or that ice melts into water, and has adopted an unusual remedy: assembling human “teachers” to devise pop-quiz questions that train the AI to understand physical reality more reliably.

At the forefront of this initiative is Cosmos Reason, Nvidia's vision-language model engineered to reason using physical common sense. Developed to improve models' understanding of the tangible world, it is aimed at fields such as robotics, autonomous vehicles and smart environments. Presently, Cosmos Reason leads the physical reasoning leaderboard on Hugging Face, underlining its advanced capabilities.

Central to this progress is Nvidia's data factory team, comprising analysts from disciplines including bioengineering, business and linguistics. They painstakingly craft thousands of question-and-answer pairs derived from real-world video footage, whether it's a person cutting spaghetti or cars navigating roads, and use these as tools to teach AI about everyday physical dynamics. Each question resembles an exam-style multiple-choice query, for instance “Which hand is being used to cut the spaghetti?”, with four possible answers provided for the model to choose from.
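As a rough illustration of the format described above, one question-and-answer pair could be represented as a small record. This is a minimal sketch, not Nvidia's actual schema: the class name, the field names, the clip identifier, the answer options and the marked answer are all hypothetical, since the article gives only the question text and the fact that there are four choices.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MultipleChoiceItem:
    """One exam-style question about a video clip (hypothetical schema)."""

    video_id: str                        # assumed identifier for the source clip
    question: str
    options: Tuple[str, str, str, str]   # exactly four candidate answers
    answer_index: int                    # position of the correct option, 0 to 3

    def __post_init__(self) -> None:
        # Basic checks a reviewer might want enforced before training.
        if len(self.options) != 4:
            raise ValueError("expected exactly four options")
        if len(set(self.options)) != 4:
            raise ValueError("options must be distinct")
        if not 0 <= self.answer_index < 4:
            raise ValueError("answer_index must be between 0 and 3")

    def is_correct(self, choice: int) -> bool:
        """Return True when the chosen option is the labelled answer."""
        return choice == self.answer_index


item = MultipleChoiceItem(
    video_id="clip-0001",  # placeholder, not a real clip
    question="Which hand is being used to cut the spaghetti?",
    options=("Left hand", "Right hand", "Both hands", "Neither hand"),
    answer_index=1,  # placeholder: the article does not state the answer
)
```

Rejecting malformed items when they are created is one simple way to catch errors before a pair reaches human review.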

These Q&A pairs undergo rigorous quality assurance. Michelle Li, an analyst with a background in public health and data analytics, scrutinises each pair to ensure alignment with the project's physical‐AI objectives. After validation by team leads, the curated data is passed to the Cosmos Reason research scientists for model training using reinforcement learning.
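The article does not say how the reinforcement-learning reward is computed. One common pattern for multiple-choice data is a reward that simply checks the model's final answer against the validated label. The sketch below assumes, purely for illustration, that the model finishes its response with a line such as "Answer: B"; the function name and that output convention are hypothetical.

```python
import re


def multiple_choice_reward(model_output: str, correct_letter: str) -> float:
    """Return 1.0 if the last 'Answer: X' in the output matches the label.

    Assumes (hypothetically) that options are lettered A to D and that the
    model states its final choice as 'Answer: <letter>'.
    """
    found = re.findall(r"answer:\s*([A-D])\b", model_output, flags=re.IGNORECASE)
    if not found:
        return 0.0  # no parseable answer earns no reward
    return 1.0 if found[-1].upper() == correct_letter.upper() else 0.0


reward = multiple_choice_reward(
    "The knife is held in the right hand. Answer: B", "B"
)
```

Because the reward depends only on the validated label, any error in a question-and-answer pair would feed straight into training, which is consistent with the emphasis on quality assurance described above.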

The reasoning models trained through this method are capable of temporally grounded inference, for example assessing what would occur if two cars were to drive toward each other in the same lane and predicting the most probable outcome. According to Yin Cui, a research scientist involved with Cosmos Reason, embedding basic physical awareness is essential: “Without basic knowledge about the physical world, a robot may fall down or accidentally break something, causing danger to the surrounding people and environment.”

By combining human‐curated training with reinforcement learning, Nvidia is enabling AI models to demonstrate reasoning more akin to human thought, particularly when interpreting dynamic, physical scenarios. Tsung‐Yi Lin, principal research scientist on Cosmos Reason, observes that the success of the data factory team's data production is pivotal for the development of autonomous systems capable of safe and intelligent interaction with the real world.
