WebSocket Streaming for TTS API

We are excited to launch WebSocket streaming for our Text-to-Speech API. This feature lets developers build real-time, low-latency voice experiences by streaming text and receiving synthesized audio over a single persistent, bidirectional connection. It is well suited to applications such as conversational AI, live chat support, and dynamic content narration, where immediate audio feedback is crucial.

Key features include:

  • Low-Latency Communication: Stream text and receive audio with minimal delay.
  • Bidirectional Streaming: Send text and receive audio on the same connection.
  • Efficient Transport: Avoids the overhead of opening a new HTTP request for each piece of text during continuous audio synthesis.
  • Real-Time Control: Adjust voice style, speed, and pitch during the session.
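
To make the bidirectional pattern concrete, here is a minimal Python sketch of a client that sends text and collects audio concurrently on one connection. It is an illustration under assumptions, not the API's actual protocol: the JSON message shapes ("text", "end", "done"), the "settings" field, and the use of binary frames for audio are all hypothetical, and the names text_frame, stream_tts, and FakeConnection are invented for this example. The in-memory FakeConnection stands in for a server so that the sketch runs offline; consult the WebSocket Streaming documentation for the real endpoint and message format.

```python
import asyncio
import json


def text_frame(text, **settings):
    """Build a JSON text frame (hypothetical shape).

    Optional settings such as speed or pitch ride along with the text,
    which is one way mid-session adjustments could be expressed.
    """
    frame = {"type": "text", "text": text}
    if settings:
        frame["settings"] = settings
    return json.dumps(frame)


# Hypothetical marker telling the server that no more text will follow.
END_FRAME = json.dumps({"type": "end"})


async def stream_tts(conn, chunks):
    """Send text chunks and receive audio concurrently on one connection.

    `conn` is any object with async `send(msg)` and `recv()` methods.
    `chunks` is a list of (text, settings_dict) pairs.
    """
    audio = bytearray()

    async def sender():
        for text, settings in chunks:
            await conn.send(text_frame(text, **settings))
        await conn.send(END_FRAME)

    async def receiver():
        while True:
            msg = await conn.recv()
            if isinstance(msg, bytes):
                # Assumption: audio arrives as binary frames.
                audio.extend(msg)
            elif json.loads(msg).get("type") == "done":
                break

    # Both directions run at once, so audio can arrive while text is
    # still being sent.
    await asyncio.gather(sender(), receiver())
    return bytes(audio)


class FakeConnection:
    """In-memory stand-in for a server so the sketch runs without a network."""

    def __init__(self):
        self._outbox = asyncio.Queue()

    async def send(self, msg):
        frame = json.loads(msg)
        if frame["type"] == "text":
            # Pretend the UTF-8 bytes of the text are the synthesized audio.
            await self._outbox.put(frame["text"].encode("utf-8"))
        else:
            await self._outbox.put(json.dumps({"type": "done"}))

    async def recv(self):
        return await self._outbox.get()


async def demo():
    # The fake connection is created inside the running event loop.
    conn = FakeConnection()
    chunks = [("Hello, ", {}), ("world.", {"speed": 1.2})]
    return await stream_tts(conn, chunks)


if __name__ == "__main__":
    print(len(asyncio.run(demo())), "bytes of fake audio received")
```

Because stream_tts only relies on async send and recv methods, a real WebSocket client connection that exposes those two methods could be passed in place of FakeConnection once the message shapes are adjusted to match the documented protocol.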

Learn more in our WebSocket Streaming documentation.