Sam Altman-run OpenAI has introduced new speech-to-text and text-to-speech models in its API to help developers build more capable and customisable voice agents. The two speech-to-text models, gpt-4o-transcribe and gpt-4o-mini-transcribe, improve transcription accuracy and language recognition over the earlier Whisper models. The new gpt-4o-mini-tts text-to-speech model adds steerability: developers can instruct the model on how to speak, tailoring the voice to a specific use case.
OpenAI Audio Models in API
Three new state-of-the-art audio models in the API:
🗣️ Two speech-to-text models—outperforming Whisper
💬 A new TTS model—you can instruct it *how* to speak
🤖 And the Agents SDK now supports audio, making it easy to build voice agents.
Try TTS now at https://t.co/MbTOlNYyca.
— OpenAI Developers (@OpenAIDevs) March 20, 2025
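To make the steerability idea concrete, here is a minimal Python sketch of what a request to the new TTS model might look like. The model name comes from the announcement; the endpoint URL, the `voice` value, the `instructions` field, and the helper name `build_speech_request` are assumptions based on OpenAI's public API reference, not details given in the post. The sketch only builds the request and does not send it.

```python
import json
import os
import urllib.request

# Assumed endpoint from OpenAI's public API reference; the post does not state it.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, instructions, voice="alloy", api_key=None):
    """Build (but do not send) a text-to-speech request for gpt-4o-mini-tts.

    build_speech_request is a hypothetical helper written for illustration.
    """
    payload = {
        "model": "gpt-4o-mini-tts",    # model name taken from the announcement
        "input": text,                 # the words to be spoken
        "voice": voice,                # assumed voice name
        "instructions": instructions,  # assumed field: *how* the model should speak
    }
    # Fall back to the conventional environment variable if no key is passed.
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With a valid API key, passing the returned object to `urllib.request.urlopen` should return the audio bytes. Changing only the `instructions` string (for example, "Speak like a calm customer service agent") is what would alter the delivery without changing the spoken text.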