Voice Changer

Transform any voice recording with a new voice using our voice conversion technology.

Transform voice recordings into high-quality, lifelike AI voices using Murf’s Voice Changer. With just a few parameters, you can change the speaker’s voice, adjust pitch and speed, and insert custom pauses, all while automatically preserving the original speaker’s rhythm, tone, and accent.

Try this capability in the API Reference Playground. Simply generate your API key and start exploring the API.

Quickstart

Generate your API key from the Murf API Dashboard and, optionally, set it as the MURF_API_KEY environment variable.

Install the SDK

If you’re using Python, you can install Murf’s Python SDK using the following command:

$ pip install murf

Using the Voice Changer API

from murf import Murf

client = Murf(
    api_key="YOUR_API_KEY",  # Not required if you have set the MURF_API_KEY environment variable
)

file_path = "PATH_TO_YOUR_FILE"  # Path to the audio file you want to convert

# Open the file in binary mode; the `with` block closes it once the request completes
with open(file_path, "rb") as audio_file:
    response = client.voice_changer.convert(
        voice_id="en-US-terrell",
        file=audio_file,
        # file_url="URL_TO_YOUR_FILE",  # Optional: use `file_url` instead of `file` for a publicly accessible file
    )

print(response.audio_file)  # Link to the converted audio

A link to the audio file will be returned in the response. You can use this link to download the audio file and use it wherever you need it. The audio file will be available for download for 24 hours after generation.
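Because the link expires after 24 hours, it is usually worth saving the audio locally soon after the request returns. The following is a minimal sketch using only the Python standard library; `download_audio` is a hypothetical helper written for this illustration, not part of the Murf SDK.

```python
import shutil
import urllib.request
from pathlib import Path


def download_audio(url: str, dest: str, timeout: float = 30.0) -> Path:
    """Stream the audio at `url` to `dest` and return the destination path.

    Hypothetical helper for illustration; not part of the Murf SDK.
    """
    dest_path = Path(dest)
    # Create the target directory if it does not exist yet.
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    # urlopen raises URLError/HTTPError on failure (for example, an expired link).
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest_path, "wb") as out:
        # Copy in chunks so large files are not held in memory all at once.
        shutil.copyfileobj(resp, out)
    return dest_path
```

With the quickstart above, a call might look like download_audio(response.audio_file, "converted.wav").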

Response
{
  "audio_file": "https://murf.ai/link/to/audio/file",
  "audio_length_in_seconds": 8.75,
  "remaining_character_count": 992150,
  "encoded_audio": "UklGRlpHSEFWA...",
  "transcription": "The quick brown fox jumps over the lazy dog."
}
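The sample `encoded_audio` value begins with "UklGR", which is how Base64 encodes the "RIFF" marker at the start of a WAV file, so the field appears to carry the audio as Base64 text. The sketch below works under that assumption; `extract_audio_bytes` is a hypothetical helper, and the payload is built locally for illustration rather than taken from real API output.

```python
import base64
import json


def extract_audio_bytes(response_json: str) -> bytes:
    """Return raw audio bytes from a response body, assuming Base64 encoding."""
    payload = json.loads(response_json)
    encoded = payload.get("encoded_audio")
    if not encoded:
        raise ValueError("response has no encoded_audio field")
    # validate=True rejects characters outside the Base64 alphabet
    # instead of silently skipping them.
    return base64.b64decode(encoded, validate=True)


# Illustrative payload built locally; not real API output.
sample = json.dumps({
    "audio_length_in_seconds": 8.75,
    "encoded_audio": base64.b64encode(b"RIFF\x00\x00\x00\x00WAVE").decode("ascii"),
})
audio = extract_audio_bytes(sample)
print(audio[:4])  # b'RIFF'
```

The decoded bytes can then be written to disk with open("converted.wav", "wb").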

Speech Customization

The Voice Changer endpoint lets you control how the converted speech sounds. The key settings are the target voice (voiceId, passed as voice_id in the Python SDK), pitch, speed, and pauses.

The maximum allowed input length is 3 minutes per request.
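To avoid sending a file that exceeds this limit, the duration can be checked locally before the request. The following is a rough sketch using Python's standard `wave` module, which reads only uncompressed WAV files; MP3, FLAC, and the other accepted formats would need a different library. The constant and function names are hypothetical, chosen for this illustration.

```python
import wave

MAX_INPUT_SECONDS = 180  # 3 minutes, per the limit stated above


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()


def is_within_limit(path: str, limit: float = MAX_INPUT_SECONDS) -> bool:
    """Check whether the file is short enough to submit in one request."""
    return wav_duration_seconds(path) <= limit
```

A caller could run is_within_limit(file_path) before opening the file for upload and split or trim the recording when it returns False.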

FAQ

The API accepts the following input audio formats: WAV, MP3, ALAW, ULAW, FLAC.

Our system supports the following output formats: WAV (Default), MP3, FLAC, ALAW, and ULAW. The Voice Changer endpoint offers the same range of sample rates and channel types as the Speech Synthesis endpoint, allowing users to optimize output quality based on their specific needs.

The maximum input length is 3 minutes. Files longer than this will be rejected.

The system automatically retains the original speaker’s prosody and accent, meaning their rhythm, tone, pacing, and regional speech patterns are preserved in the transformed voice for natural-sounding results. Both are enabled by default and do not need to be configured manually.