Voices

Instant cloning

Clone a custom voice from a short audio sample and use it like any built-in voice in subsequent text-to-speech calls.

Upload a WAV or MP3 file (max 25 MB). The server extracts a speaker embedding and returns a custom voice ID with the uc_ prefix that can be passed directly to POST /v1/text-to-speech (and any other endpoint that accepts a voice_id). The original audio is uploaded to S3 in the background after the response is returned.

Limits

Audio file: max 25 MB. Supported formats: WAV, MP3.
Voice name: 1-30 characters.
Model: ssfm-v21 or ssfm-v30. The cloned voice is bound to this engine model.
Each plan has a maximum number of active custom voices (the custom_voice_slot). Use DELETE /v1/voices/{voice_id} to free a slot.
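The limits above can be checked on the client before uploading, which avoids a wasted request. The following is a minimal sketch, not part of the Typecast API: the function name validate_clone_request and the constants are hypothetical, the extension check is only a heuristic for the file format, and "25 MB" is assumed to mean 25,000,000 bytes because the page does not say which definition the server uses. The server may also count name characters differently from Python's len().

```python
from pathlib import Path

# Limits taken from the list above. Whether "25 MB" means 25 * 1024 * 1024
# bytes or 25,000,000 bytes is not stated; the stricter decimal value is
# assumed here.
MAX_AUDIO_BYTES = 25_000_000
ALLOWED_EXTENSIONS = {".wav", ".mp3"}
ALLOWED_MODELS = {"ssfm-v21", "ssfm-v30"}


def validate_clone_request(filename: str, size_bytes: int, name: str, model: str) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    # Heuristic only: the extension says nothing about the actual encoding.
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("file must be WAV or MP3")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append("file exceeds 25 MB")
    if not 1 <= len(name) <= 30:
        problems.append("name must be 1-30 characters")
    if model not in ALLOWED_MODELS:
        problems.append("model must be ssfm-v21 or ssfm-v30")
    return problems
```

The plan-level custom_voice_slot limit cannot be checked this way, since the page does not describe an endpoint for reading it; the server remains the source of truth for every limit.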

Typical flow

1. POST /v1/voices/clone with the sample audio → receive voice_id (e.g. uc_64a1b2...).
2. POST /v1/text-to-speech with voice_id set to the cloned ID.
3. DELETE /v1/voices/{voice_id} when you no longer need the voice.
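The flow above might be sequenced as in the sketch below. To stay independent of any HTTP library, it takes a hypothetical send(method, path, payload) callable that is assumed to attach the X-API-KEY header and return the decoded response. Only voice_id is confirmed by this page as a text-to-speech field; the "text" and "model" fields in that request, and the assumption that the call returns audio bytes, are illustrative guesses.

```python
from typing import Any, Callable

# Hypothetical transport signature: send(method, path, payload) -> response.
# A real implementation would issue the HTTP request with the X-API-KEY header.
Send = Callable[[str, str, Any], Any]


def clone_speak_delete(send: Send, sample: bytes, name: str, model: str, text: str) -> bytes:
    """Clone a voice, synthesize one utterance with it, then free the slot."""
    created = send("POST", "/v1/voices/clone",
                   {"file": sample, "name": name, "model": model})
    voice_id = created["voice_id"]
    try:
        # Only voice_id is confirmed by this page; "text" and "model" are
        # assumed field names for the text-to-speech request body.
        return send("POST", "/v1/text-to-speech",
                    {"voice_id": voice_id, "text": text, "model": model})
    finally:
        # Runs even if synthesis fails, so the custom voice slot is released.
        send("DELETE", f"/v1/voices/{voice_id}", None)
```

Deleting immediately suits one-off use. An application that reuses the voice would store the voice_id and call DELETE only when the voice is retired.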

POST /v1/voices/clone

cURL

curl --request POST \
  --url 'https://api.typecast.ai/v1/voices/clone' \
  --header 'X-API-KEY: <api-key>' \
  -F 'file=@sample.wav' \
  -F 'name=my-voice' \
  -F 'model=ssfm-v30'

{
  "voice_id": "uc_64a1b2c3d4e5f6a7b8c9d0e1",
  "name": "my-voice",
  "model": "ssfm-v30"
}

Authorizations

X-API-KEY (string, header, required)

API key for authentication. You can obtain an API key from the Typecast API Console.

Body

multipart/form-data

Multipart request body for instant cloning.

file (file, required)

Audio sample. WAV or MP3, max 25 MB.

name (string, required)

Voice name. Required length: 1-30 characters.

model (enum<string>, required)

Engine model to clone the voice for.

Available options: ssfm-v21, ssfm-v30
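For clients without a multipart helper, the three form fields can be encoded by hand. This is a minimal sketch using only the Python standard library; build_clone_body is a hypothetical helper, and the application/octet-stream type on the file part is an assumption, since the page does not say whether the server inspects the part's Content-Type.

```python
import uuid


def build_clone_body(audio: bytes, filename: str, name: str, model: str) -> tuple[bytes, str]:
    """Encode the file, name, and model fields as multipart/form-data.

    Returns (body, content_type); send content_type as the Content-Type header.
    The filename is inserted as-is, so it must not contain double quotes or
    line breaks.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for field, value in (("name", name), ("model", model)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{field}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # Assumed part type; the page does not specify one for the audio file.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + audio + b"\r\n"
    )
    # Closing delimiter terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned pair can be passed to urllib.request.Request as data=body with headers containing X-API-KEY and Content-Type set to the returned string, using method="POST" against https://api.typecast.ai/v1/voices/clone.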

Response

Successful Response - Custom voice created

Custom voice metadata returned by POST /v1/voices/clone after a successful instant clone.

voice_id (string, required)

Custom voice identifier with the uc_ prefix. Use this value as voice_id in POST /v1/text-to-speech and other endpoints that accept voice_id.

Example: "uc_64a1b2c3d4e5f6a7b8c9d0e1"

name (string, required)

Human-readable voice name (1-30 characters).

model (enum<string>, required)

Engine model the voice was cloned for.

Available options: ssfm-v21, ssfm-v30
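A client might parse the response into a typed value and sanity-check the documented invariants (the uc_ prefix and the two model names) before storing the ID. The sketch below is illustrative only; CustomVoice and parse_clone_response are hypothetical names, not part of any Typecast SDK.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomVoice:
    voice_id: str
    name: str
    model: str


def parse_clone_response(raw: str) -> CustomVoice:
    """Parse the JSON body of a successful clone response and sanity-check it."""
    data = json.loads(raw)
    # All three fields are documented as required, so a missing key raises KeyError.
    voice = CustomVoice(voice_id=data["voice_id"], name=data["name"], model=data["model"])
    if not voice.voice_id.startswith("uc_"):
        raise ValueError(f"unexpected voice_id: {voice.voice_id!r}")
    if voice.model not in ("ssfm-v21", "ssfm-v30"):
        raise ValueError(f"unexpected model: {voice.model!r}")
    return voice
```

Keeping the model alongside the voice_id is useful because the cloned voice is bound to the engine model it was created for.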
