Clone a custom voice from a short audio sample and use it like any built-in voice in subsequent text-to-speech calls.
Upload a WAV or MP3 file (max 25 MB). The server extracts a speaker embedding and returns a custom voice ID with the uc_ prefix that can be passed directly to POST /v1/text-to-speech (and any other endpoint that accepts a voice_id). The original audio is uploaded to S3 in the background after the response is returned.
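A client-side pre-flight check can catch an unsuitable sample before it is uploaded. The sketch below is illustrative only: the helper name check_sample is hypothetical, and the byte limit assumes "25 MB" means decimal megabytes (the more conservative reading). The server remains the authority on what it accepts.

```python
import os

# Assumption: "25 MB" read as 25,000,000 bytes (the stricter interpretation).
MAX_SAMPLE_BYTES = 25 * 1000 * 1000
ALLOWED_EXTENSIONS = {".wav", ".mp3"}


def check_sample(path: str) -> None:
    """Raise ValueError if the file is unlikely to be accepted for cloning.

    Hypothetical helper: it only inspects the extension and size on disk,
    not the actual audio encoding.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension {ext!r}; expected WAV or MP3")

    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("sample file is empty")
    if size > MAX_SAMPLE_BYTES:
        raise ValueError(f"sample is {size} bytes; limit is {MAX_SAMPLE_BYTES}")
```

Calling check_sample("sample.wav") returns None when the file passes both checks and raises ValueError otherwise.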
Limits
Engine model: ssfm-v21 or ssfm-v30. The cloned voice is bound to this engine model.
Voice slots: custom voice slots are limited (custom_voice_slot). Use DELETE /v1/voices/{voice_id} to free a slot.
Typical flow
1. POST /v1/voices/clone with the sample audio → receive voice_id (e.g. uc_64a1b2...).
2. POST /v1/text-to-speech with voice_id set to the cloned ID.
3. DELETE /v1/voices/{voice_id} when you no longer need the voice.
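The clone, synthesize, delete sequence can be sketched as a single function. This is a minimal illustration, not the official client: the transport callable is a hypothetical injection point standing in for whatever HTTP layer you use, and the payload keys "file" and "text" are assumptions (only voice_id and the three paths come from this page). It also deletes the voice right after one synthesis call, which suits one-shot use; keep the voice if you plan to reuse it.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical transport signature: (method, path, payload) -> decoded response.
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def clone_speak_delete(transport: Transport, sample_path: str, text: str) -> Dict[str, Any]:
    """Clone a voice, synthesize one utterance with it, then free the slot."""
    # Step 1: upload the sample. The "file" key is an assumed field name.
    created = transport("POST", "/v1/voices/clone", {"file": sample_path})
    voice_id = created["voice_id"]

    try:
        # Step 2: use the cloned ID like any built-in voice.
        # The "text" key is an assumed field name.
        return transport(
            "POST", "/v1/text-to-speech", {"voice_id": voice_id, "text": text}
        )
    finally:
        # Step 3: delete the voice even if synthesis failed, so the slot is freed.
        transport("DELETE", f"/v1/voices/{voice_id}", None)
```

Passing the transport in keeps the flow testable without network access: a fake callable can record the calls and return canned responses.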
API key for authentication. You can obtain an API key from the Typecast API Console.
Successful Response - Custom voice created
Response of POST /v1/voices/clone — custom voice metadata returned by instant cloning.
Custom voice identifier with the uc_ prefix. Use this value as voice_id in POST /v1/text-to-speech and other endpoints that accept voice_id.
"uc_64a1b2c3d4e5f6a7b8c9d0e1"
Human-readable voice name (1-30 characters).
Engine model the voice was cloned for (ssfm-v21 or ssfm-v30).
Allowed values: ssfm-v30, ssfm-v21
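The response fields described above can be checked on the client before the voice ID is stored. The sketch below assumes the JSON keys are voice_id, name, and model; the key names "name" and "model" are guesses, as this page gives only the field descriptions. CustomVoice and parse_custom_voice are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Mapping

SUPPORTED_MODELS = {"ssfm-v21", "ssfm-v30"}


@dataclass(frozen=True)
class CustomVoice:
    voice_id: str
    name: str
    model: str


def parse_custom_voice(payload: Mapping[str, Any]) -> CustomVoice:
    """Build a CustomVoice from a decoded response and validate its fields."""
    # Assumed key names; adjust to match the actual response body.
    voice = CustomVoice(
        voice_id=str(payload["voice_id"]),
        name=str(payload["name"]),
        model=str(payload["model"]),
    )
    if not voice.voice_id.startswith("uc_"):
        raise ValueError(f"expected a uc_ prefixed ID, got {voice.voice_id!r}")
    if not 1 <= len(voice.name) <= 30:
        raise ValueError("voice name must be 1-30 characters")
    if voice.model not in SUPPORTED_MODELS:
        raise ValueError(f"unexpected engine model {voice.model!r}")
    return voice
```

Because the cloned voice is bound to one engine model, keeping model alongside voice_id makes it easier to request synthesis with the matching engine later.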