OpenAI Releases Faster, Multimodal GPT-4o, Free for All ChatGPT Users

Asif Iqbal
3 min read · May 14, 2024


OpenAI has announced the launch of GPT-4o, a new iteration of the GPT-4 model that powers its flagship product, ChatGPT. In a livestream announcement on Monday, OpenAI CTO Mira Murati highlighted that GPT-4o is “much faster” and improves capabilities across text, vision, and audio. The new model will be available for free to all ChatGPT users, while paid users will continue to enjoy up to five times the usage limits of free users.

Enhanced Multimodal Capabilities

GPT-4o stands out for its ability to accept and generate combinations of text, audio, and images, enabling more natural and versatile human-computer interactions. It can respond to audio inputs in real time, with latencies as low as 232 milliseconds, comparable to human response times in conversation. This iteration represents a significant step forward in multimodal AI, with all input and output formats processed end-to-end by a single neural network.

Performance and Efficiency Improvements

OpenAI claims that GPT-4o matches the performance of GPT-4 Turbo in text and coding tasks, with significant enhancements in non-English language processing, vision, and audio tasks. The new model is also twice as fast and half the price of GPT-4 Turbo in the API, with 5x higher rate limits, making it more accessible for developers and businesses.

New Features in ChatGPT

GPT-4o introduces new features to ChatGPT’s voice mode, transforming it into a more interactive and context-aware voice assistant. Unlike the current voice mode, which handles one prompt at a time, the updated version will be capable of real-time responses and a better understanding of ongoing interactions. This aligns with the model’s multimodal capabilities, allowing it to observe and interact with the world around users more effectively.

Developer Access and API

Developers interested in exploring GPT-4o will have access to it through the API, where it is both cheaper and faster than its predecessor. OpenAI CEO Sam Altman announced on X that GPT-4o is available in the API at half the price and twice the speed of GPT-4 Turbo, opening more opportunities for innovation and application development.
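For developers curious what this looks like in practice, the sketch below shows a minimal call to GPT-4o through the OpenAI Python SDK (v1+). The model name "gpt-4o" comes from the announcement; the system prompt and helper function names here are illustrative, and the script assumes an OPENAI_API_KEY environment variable is set before it makes a live request.

```python
# Minimal sketch of calling GPT-4o via the OpenAI Python SDK (v1+).
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
import os

MODEL = "gpt-4o"


def build_messages(prompt: str) -> list[dict]:
    """Construct the chat-completions message list for a single user prompt."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ]


def ask_gpt4o(prompt: str) -> str:
    """Send one prompt to GPT-4o and return the text of the first reply."""
    from openai import OpenAI  # imported lazily so the sketch loads without the SDK

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=MODEL,
        messages=build_messages(prompt),
    )
    return response.choices[0].message.content


if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(ask_gpt4o("In one sentence, what is GPT-4o?"))
```

Because GPT-4o uses the same chat-completions endpoint as earlier models, switching an existing GPT-4 Turbo integration over is, in the simplest case, a one-line model-name change.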

Commitment to Safety and Accessibility

OpenAI has emphasized the safety and ethical considerations in deploying GPT-4o. The model has undergone extensive internal and external evaluations to identify and mitigate risks, especially those associated with its audio capabilities. Initially, only text and image functionalities are being rolled out, with audio capabilities to follow after further safety and usability enhancements.

Future Prospects

In his blog post, Altman reflected on OpenAI’s evolving vision, shifting from creating open-source AI models to providing robust, accessible tools for developers to build upon. This strategic shift aims to empower third parties to leverage AI for creating beneficial applications, thus amplifying the positive impact of AI on society.

Launch Timing and Industry Context

The announcement of GPT-4o comes just ahead of Google I/O, where the tech giant is expected to unveil its latest AI advancements. The timing underscores the competitive pressure in the AI space and OpenAI’s push to stay ahead of it.

Conclusion

GPT-4o represents a significant advancement in AI technology, combining speed, multimodal capabilities, and cost-efficiency. By making this model available for free to all ChatGPT users and providing robust API access for developers, OpenAI is setting new standards for AI accessibility and usability.

For more information and to start using GPT-4o, visit the OpenAI website and explore the new capabilities in ChatGPT: https://openai.com/index/hello-gpt-4o/
