Voice Changer
Headers
Request
Duration of the generated audio, in seconds. A value of 0 means this parameter is ignored. Available only for the Gen2 model.
Channel layout of the generated audio. Valid values: STEREO, MONO
Format of the generated audio file. Valid values: MP3, WAV, FLAC, ALAW, ULAW
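The enumerated values above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name check_audio_options and the argument names channel and audio_format are hypothetical, and only the value lists come from this reference.

```python
# Value lists copied from this reference; everything else is illustrative.
VALID_CHANNELS = {"STEREO", "MONO"}
VALID_FORMATS = {"MP3", "WAV", "FLAC", "ALAW", "ULAW"}


def check_audio_options(channel, audio_format):
    """Return a list of human-readable problems (empty if both values are valid).

    The argument names are hypothetical; map them to the actual request
    fields used by the API.
    """
    problems = []
    if channel not in VALID_CHANNELS:
        problems.append(
            f"channel {channel!r} is not one of {sorted(VALID_CHANNELS)}"
        )
    if audio_format not in VALID_FORMATS:
        problems.append(
            f"format {audio_format!r} is not one of {sorted(VALID_FORMATS)}"
        )
    return problems
```

Returning a list rather than raising on the first error lets a caller report every invalid option at once.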
Language of the generated audio, which lets a voice speak multiple languages natively. Available only for the Gen2 model. Example values: "en-US", "en-UK", "es-ES".
Use the GET /v1/speech/voices endpoint to retrieve the list of available voices and languages.
A JSON string that defines custom pronunciations for specific words or phrases. Each key is a word or phrase, and its value is an object with "type" and "pronunciation" fields.
Example 1: '{"live": {"type": "IPA", "pronunciation": "laɪv"}}'
Example 2: '{"2022": {"type": "SAY_AS", "pronunciation": "twenty twenty two"}}'
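Because the pronunciation dictionary is passed as a JSON string, it is usually safer to serialize it from a native structure than to write the string by hand. The sketch below assumes nothing about the API beyond the key and field layout shown in the two examples; the helper name build_pronunciation_dictionary is hypothetical.

```python
import json


def build_pronunciation_dictionary(entries):
    """Serialize pronunciation overrides to a JSON string.

    `entries` maps a word or phrase to a (type, pronunciation) pair,
    e.g. {"live": ("IPA", "laɪv")}. The helper itself is illustrative;
    only the resulting JSON layout follows the examples above.
    """
    payload = {
        phrase: {"type": kind, "pronunciation": pronunciation}
        for phrase, (kind, pronunciation) in entries.items()
    }
    # ensure_ascii=False keeps IPA symbols readable instead of \u escapes.
    return json.dumps(payload, ensure_ascii=False)


pronunciations = build_pronunciation_dictionary({
    "live": ("IPA", "laɪv"),
    "2022": ("SAY_AS", "twenty twenty two"),
})
```

The resulting string can then be supplied as the value of the pronunciation dictionary parameter.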
Indicates whether to retain the original prosody (intonation, rhythm, and stress) of the input voice in the generated output.
Use the GET /v1/speech/voices API to find supported voiceIds. You can use either the voiceId (e.g. en-US-natalie) or just the voice actor’s name (e.g. natalie).