Synthesize Speech
Headers
Request
The text to be synthesized, e.g. 'Hello there [pause 1s] friend'.
Use the GET /v1/speech/voices endpoint to find supported voiceIds. You can pass either the full voiceId (e.g. en-US-natalie) or just the voice actor's name (e.g. natalie); see the example request at the end of this section.
Specifies the desired duration (in seconds) of the generated audio. A value of 0 is ignored. Only available for the Gen2 model.
Channel configuration of the generated audio. Valid values: STEREO, MONO
Format of the generated audio file. Valid values: MP3, WAV, FLAC, ALAW, ULAW, PCM, OGG
Valid values: GEN2. Audio will be generated using the GEN2 model, whose output sounds more natural and is of higher quality than that of earlier models.
Specifies the language for the generated audio, enabling a voice to speak in multiple languages natively. Only available in the Gen2 model. Valid values: "en-US", "en-UK", "es-ES", etc. Use the GET /v1/speech/voices endpoint to retrieve the list of available voices and languages.
An object used to define custom pronunciations.
Example 1: {"live": {"type": "IPA", "pronunciation": "laɪv"}}
Example 2: {"2022": {"type": "SAY_AS", "pronunciation": "twenty twenty two"}}
If set to true, word durations in the response are returned against the words as they appear in the original input text. (English only)
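For orientation, here is a minimal sketch of how these parameters might be combined into a request. It is not taken from this reference: the base URL, the api-key header, the POST /v1/speech/synthesize path, and the field names (text, voiceId, format, channelType, modelVersion, multiNativeLocale, pronunciationDictionary) are all illustrative assumptions inferred from the descriptions above; only GET /v1/speech/voices and the example values appear on this page.

```python
# Minimal sketch of a synthesize-speech call. The endpoint path, auth header,
# and field names are assumptions; check the API reference for the exact ones.
import requests

BASE_URL = "https://api.example.com"       # placeholder base URL
HEADERS = {"api-key": "YOUR_API_KEY"}      # hypothetical auth header

# 1. Look up supported voices and languages (endpoint documented above).
voices = requests.get(f"{BASE_URL}/v1/speech/voices", headers=HEADERS).json()

# 2. Build the request body from the parameters described above.
body = {
    "text": "Hello there [pause 1s] friend",
    "voiceId": "en-US-natalie",            # or just the actor's name, e.g. "natalie"
    "format": "MP3",                       # MP3, WAV, FLAC, ALAW, ULAW, PCM, OGG
    "channelType": "MONO",                 # STEREO or MONO
    "modelVersion": "GEN2",
    "multiNativeLocale": "en-US",          # Gen2 only
    "pronunciationDictionary": {
        "live": {"type": "IPA", "pronunciation": "laɪv"},
        "2022": {"type": "SAY_AS", "pronunciation": "twenty twenty two"},
    },
}

# 3. Send the request; "/v1/speech/synthesize" is a placeholder path.
resp = requests.post(f"{BASE_URL}/v1/speech/synthesize", json=body, headers=HEADERS)
resp.raise_for_status()
print(resp.json())
```

Note that the duration and multiNativeLocale parameters only take effect with the Gen2 model, as stated in their descriptions.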