OpenAI is releasing a new flagship generative AI model called GPT-4o, which will be "iteratively" deployed into the company's developer and consumer products over the next few weeks.
OpenAI CTO Mira Murati said GPT-4o provides "GPT-4-level" intelligence but improves on GPT-4's capabilities across text, vision and audio.
"GPT-4o is voice-, text- and vision-aware," Murati said at a keynote presentation at OpenAI's office.
GPT-4, OpenAI's previous flagship model, was trained on a combination of images and text and could analyze both, performing tasks such as extracting text from images or describing their content. GPT-4o adds speech to the mix.
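For developers, this kind of mixed text-and-image prompting is exposed through OpenAI's Chat Completions API. Here is a minimal sketch of an image-analysis request, assuming the official openai Python SDK with an API key in the environment; the prompt and image URL are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Send a text question and an image URL in a single user message;
# GPT-4o analyzes both modalities together.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What text appears in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/receipt.png"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```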
GPT-4o greatly improves the experience in ChatGPT, OpenAI's viral AI-powered chatbot. ChatGPT has long offered a voice mode that reads the chatbot's responses aloud using a text-to-speech model, and GPT-4o supercharges it, letting users interact with ChatGPT more like an assistant.
For example, users can ask the GPT-4o-powered ChatGPT a question and interrupt it mid-answer. According to OpenAI, the model responds in "real time" and can even pick up on emotion in a user's voice, generating speech in "different emotional styles" in turn.
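Under the hood, the earlier voice mode chained several separate models together. Below is a rough sketch of that transcribe, respond, speak pipeline, using OpenAI's publicly documented Whisper transcription and text-to-speech endpoints; the file names, voice, and model choices are illustrative assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# 1. Transcribe the user's spoken question to text with Whisper.
with open("question.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Generate a text answer to the transcribed question.
answer = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": transcript.text}],
)

# 3. Convert the answer back to audio with a text-to-speech model.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer.choices[0].message.content,
)
speech.stream_to_file("reply.mp3")
```

GPT-4o, by contrast, is described as handling audio natively rather than through a chain like this, which is what makes real-time interruption and emotion-aware responses possible.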
In related news, OpenAI is also releasing a desktop version of ChatGPT with an updated user interface.
"We know that these models [are] getting more and more complex, but we want the interaction experience to be more natural, easier and you don't have to pay attention to the UI at all and just focus on collaborating with [GPT]," Murati says.