(MENAFN - The Rio Times, May 14, 2024) On Monday, OpenAI unveiled its latest model, GPT-4o, featuring significant advancements in voice and image interaction.



This model allows real-time voice and video chats, enhancing user experiences.



GPT-4o builds on the existing GPT-4 framework, handling voice conversations with an average response time of 320 milliseconds.



The system uses Whisper, OpenAI's open-source speech recognition technology, to convert spoken words into text accurately.



This capability supports seamless conversations, real-time translation, and even requests for bedtime stories.



Beyond voice interactions, GPT-4o understands and analyzes images. Users can present multiple images to the chatbot to troubleshoot gadgets, suggest meals, or evaluate complex graphs.



The multimodal capabilities combine language reasoning with visual data processing, leveraging the advanced GPT-3.5 and GPT-4 models.



This release comes amid fierce competition in the AI tech race. Google plans to announce a new virtual voice assistant soon, and Apple will debut a more conversational Siri in June.



These developments highlight the evolving landscape of AI-driven voice technology.

Advancing AI Capabilities Safely

OpenAI's phased rollout aims to refine the technology and ensure safety. This approach addresses risks like synthetic voice misuse for impersonation or fraud.



Spotify is piloting OpenAI's voice translation to translate podcasts into other languages while retaining the original voices.



Despite significant advancements, OpenAI remains transparent about the model's limitations.



While GPT-4o excels at transcribing English, it struggles with languages written in non-Roman scripts. OpenAI advises against using it for high-risk applications without proper verification.



OpenAI will extend voice and image functionalities to Plus and Enterprise users within weeks, with broader availability to follow.



This rollout aims to prepare users for more advanced AI systems, ensuring safe and effective use.



Overall, GPT-4o represents a significant leap in AI capabilities, offering enhanced interactions through voice and image understanding.



As the AI landscape evolves, these innovations will shape the future of human-computer interactions.
