OpenAI has unveiled its latest generative AI model, GPT-4o. The new model builds on its predecessor, GPT-4, by responding in real time across audio, text, and vision. It can also detect and convey emotions, similar to offerings from the AI startup Hume.
GPT-4o is 2x faster and 50% cheaper than GPT-4 Turbo, with a 5x higher rate limit. It will be accessible to free users as well as through the API. OpenAI CTO Mira Murati said that GPT-4o offers "GPT-4-level" intelligence but with improved performance across text, vision, and audio.
Unlike GPT-4, which was trained on a combination of images and text, GPT-4o also incorporates speech. It can extract text from images, describe image content, and even generate speech in a range of emotive voices.
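For developers, these multimodal capabilities are exposed through OpenAI's existing Chat Completions API. Below is a minimal sketch of what a request mixing text and an image might look like using the OpenAI Python SDK; the prompt and image URL are illustrative placeholders, not examples from OpenAI's announcement.

```python
# A minimal sketch of calling GPT-4o via the OpenAI Python SDK.
# The prompt and image URL below are placeholders for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                # A text instruction and an image can be sent in one message
                {"type": "text", "text": "Describe this image and transcribe any text in it."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```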
In addition to GPT-4o, OpenAI is launching a desktop version of ChatGPT and a refreshed UI. The aim is to make interactions feel more natural and effortless, letting users focus on collaborating with the GPT models.
OpenAI's ChatGPT is already being used by more than 100 million people, and over 1 million custom GPTs have been created by users in the GPT Store. In the coming weeks, the new GPT-4o model will be rolled out iteratively across OpenAI's products.