GPT Realtime API Enhanced

31 August 2025

OpenAI has enhanced its voice AI capabilities with the upgraded Realtime API and the introduction of gpt-realtime. The Realtime API is now generally available, featuring support for remote Model Context Protocol (MCP) servers, image inputs, and Session Initiation Protocol (SIP) for phone calls. These updates enable developers and enterprises to create more reliable and versatile voice agents.

The new gpt-realtime model is better at following complex instructions, calling tools precisely, and producing natural-sounding speech. It interprets system messages and developer prompts more reliably, switches between languages smoothly, and detects alphanumeric sequences accurately. Image input support lets the model ground a conversation in visual context, so it can answer questions about images or screenshots the user shares. Because the Realtime API processes audio directly rather than converting it to text first, it reduces latency and preserves the nuances of speech.

gpt-realtime is priced at $32 per 1 million audio input tokens and $64 per 1 million audio output tokens, a 20% reduction compared with its predecessor. New features also give developers fine-grained control over conversation context, letting them set token limits and reduce costs in extended sessions.
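To make the pricing concrete, here is a minimal Python sketch that estimates the audio cost of a session from the two rates quoted above. The function names and the sample token counts are hypothetical, chosen only for illustration, and the sketch covers audio tokens only. The "implied previous price" assumes the 20% reduction applies uniformly to both rates, which the figures above suggest but do not state outright.

```python
# Prices quoted in the article, in US dollars per 1 million audio tokens.
AUDIO_INPUT_USD_PER_MILLION = 32.0
AUDIO_OUTPUT_USD_PER_MILLION = 64.0

# Assumption: the 20% reduction applies equally to input and output rates.
PRICE_REDUCTION = 0.20


def estimate_audio_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated audio cost in USD for the given token counts."""
    input_cost = input_tokens / 1_000_000 * AUDIO_INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * AUDIO_OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


def implied_previous_price(current_price: float) -> float:
    """Back out the pre-reduction price under the uniform-cut assumption."""
    return current_price / (1.0 - PRICE_REDUCTION)


if __name__ == "__main__":
    # Hypothetical session: 50,000 audio input and 25,000 audio output tokens.
    cost = estimate_audio_cost(50_000, 25_000)
    print(f"Estimated session cost: ${cost:.2f}")

    previous_input = implied_previous_price(AUDIO_INPUT_USD_PER_MILLION)
    previous_output = implied_previous_price(AUDIO_OUTPUT_USD_PER_MILLION)
    print(f"Implied previous input rate: ${previous_input:.2f} per 1M tokens")
    print(f"Implied previous output rate: ${previous_output:.2f} per 1M tokens")
```

For the hypothetical session above, the input and output portions each come to about $1.60, for a total of roughly $3.20. Actual bills depend on how many tokens a given stretch of audio produces, which this sketch does not model.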

AI generated content may differ from the original.

Published on 31 August 2025
  • OpenAI's GPT-Realtime Debuts
  • AI-Powered Ransomware Emerges
  • DeepSeek releases V3.1 model
  • Altman Acknowledges AI Market Bubble