Typecast Python SDK

Installation

Install the Typecast Python SDK using pip:
pip install --upgrade typecast-python
The package is installed as typecast-python, but imported as typecast.
Version 0.1.5 or higher is required. Check the installed version with pip show typecast-python; if it is older, run the install command above again to update.

Quick Start

Here’s a simple example to convert text to speech:
from typecast import Typecast
from typecast.models import TTSRequest

# Initialize client
client = Typecast(api_key="YOUR_API_KEY")

# Convert text to speech
response = client.text_to_speech(TTSRequest(
    text="Hello there! I'm your friendly text-to-speech agent.",
    model="ssfm-v30",
    voice_id="tc_672c5f5ce59fac2a48faeaee"
))

# Save audio file
with open('output.wav', 'wb') as f:
    f.write(response.audio_data)

print(f"Duration: {response.duration}s, Format: {response.format}")

Features

The Typecast Python SDK provides the following text-to-speech features:
  • Multiple Voice Models: Support for ssfm-v30 (latest) and ssfm-v21 AI voice models
  • Multi-language Support: 37 languages including English, Korean, Spanish, Japanese, Chinese, and more
  • Emotion Control: Preset emotions (normal, happy, sad, angry, whisper, toneup, tonedown) or smart context-aware inference
  • Audio Customization: Control loudness (LUFS -70 to 0), pitch (-12 to +12 semitones), tempo (0.5x to 2.0x), and format (WAV/MP3)
  • Async Support: Built-in async client for high-performance applications
  • Voice Discovery: V2 Voices API with filtering by model, gender, age, and use cases
  • Type Hints: Full type annotations with Pydantic models
  • Streaming: Real-time chunked audio delivery for low-latency playback

Configuration

Set the API key through the TYPECAST_API_KEY environment variable, or pass it directly to the client with the api_key argument (as in Quick Start). To use the environment variable:
export TYPECAST_API_KEY="your-api-key-here"
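
When both sources may be present, it can help to make the lookup order explicit in your own code. The following is a minimal sketch, assuming an explicitly passed key should win over the environment variable. resolve_api_key is a hypothetical helper, not part of the SDK, and the SDK's own precedence may differ.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key to use, preferring an explicit value over the environment.

    Hypothetical helper for illustration; not part of the Typecast SDK.
    """
    env = os.environ if env is None else env
    # Fall back to the environment variable only when no explicit key is given
    key = (explicit or env.get("TYPECAST_API_KEY", "")).strip()
    if not key:
        raise ValueError("No API key: pass one explicitly or set TYPECAST_API_KEY")
    return key
```

The result can then be passed to the client, for example Typecast(api_key=resolve_api_key()). Failing early with a clear message avoids discovering a missing key only when the first request is rejected.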

Advanced Usage

Emotion Control (ssfm-v30)

ssfm-v30 offers two emotion control modes: Preset and Smart.
Smart mode lets the AI infer emotion from the surrounding context:
from typecast import Typecast
from typecast.models import TTSRequest, SmartPrompt

client = Typecast()

response = client.text_to_speech(TTSRequest(
    text="Everything is going to be okay.",
    model="ssfm-v30",
    voice_id="tc_672c5f5ce59fac2a48faeaee",
    prompt=SmartPrompt(
        emotion_type="smart",
        previous_text="I just got the best news!",  # Optional context
        next_text="I can't wait to celebrate!"      # Optional context
    )
))

Audio Customization

Control loudness, pitch, tempo, and output format:
from typecast import Typecast
from typecast.models import TTSRequest, Output

client = Typecast()

response = client.text_to_speech(TTSRequest(
    text="Customized audio output!",
    model="ssfm-v30",
    voice_id="tc_672c5f5ce59fac2a48faeaee",
    output=Output(
        target_lufs=-14.0,   # Range: -70 to 0 (LUFS)
        audio_pitch=2,       # Range: -12 to +12 semitones
        audio_tempo=1.2,     # Range: 0.5x to 2.0x
        audio_format="mp3"   # Options: wav, mp3
    ),
    seed=42                  # Unsigned seed for reproducible results
))
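
The ranges noted in the comments above can also be checked on the client side before a request is sent. The following is a minimal sketch using only the standard library. AudioSettings is a hypothetical container that mirrors the Output fields; it is not an SDK class, and the SDK's Pydantic models may already enforce some or all of these limits.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AudioSettings:
    """Hypothetical stand-in for the Output fields (not an SDK class)."""
    target_lufs: float
    audio_pitch: int
    audio_tempo: float
    audio_format: str

    def validate(self) -> List[str]:
        """Return human-readable problems; an empty list means all values are in range."""
        problems = []
        if not -70 <= self.target_lufs <= 0:
            problems.append("target_lufs must be between -70 and 0 LUFS")
        if not -12 <= self.audio_pitch <= 12:
            problems.append("audio_pitch must be between -12 and +12 semitones")
        if not 0.5 <= self.audio_tempo <= 2.0:
            problems.append("audio_tempo must be between 0.5 and 2.0")
        if self.audio_format not in ("wav", "mp3"):
            problems.append("audio_format must be 'wav' or 'mp3'")
        return problems
```

For example, AudioSettings(-14.0, 2, 1.2, "mp3").validate() returns an empty list, while an out-of-range value adds one message per offending field. Collecting all problems instead of raising on the first one lets a caller report every issue at once.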

Voice Discovery (V2 API)

List and filter available voices with enhanced metadata:
from typecast import Typecast
from typecast.models import VoicesV2Filter, TTSModel, GenderEnum, AgeEnum

client = Typecast()

# Get all voices
voices = client.voices_v2()

# Filter by criteria
filtered = client.voices_v2(VoicesV2Filter(
    model=TTSModel.SSFM_V30,
    gender=GenderEnum.FEMALE,
    age=AgeEnum.YOUNG_ADULT
))

# Display voice info
for voice in voices:
    print(f"ID: {voice.voice_id}, Name: {voice.voice_name}")
    print(f"Gender: {voice.gender}, Age: {voice.age}")
    print(f"Models: {', '.join(m.version.value for m in voice.models)}")
    print(f"Use cases: {voice.use_cases}")

Async Client

For high-performance applications, use the async client:
import asyncio
from typecast import AsyncTypecast
from typecast.models import TTSRequest

async def main():
    async with AsyncTypecast() as client:
        response = await client.text_to_speech(TTSRequest(
            text="Hello from async!",
            model="ssfm-v30",
            voice_id="tc_672c5f5ce59fac2a48faeaee"
        ))

        with open('async_output.wav', 'wb') as f:
            f.write(response.audio_data)

asyncio.run(main())
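
The async client is most useful when several requests are in flight at once. The following is a minimal sketch of bounded concurrency using only asyncio. map_concurrently is a hypothetical helper, not part of the SDK, and the default limit of 4 is an arbitrary assumption rather than a documented rate limit.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def map_concurrently(
    func: Callable[[T], Awaitable[R]],
    items: Sequence[T],
    limit: int = 4,
) -> List[R]:
    """Apply an async function to every item, running at most `limit` calls at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item: T) -> R:
        # The semaphore caps how many calls run at the same time
        async with semaphore:
            return await func(item)

    # gather returns results in the same order as the inputs
    return list(await asyncio.gather(*(guarded(item) for item in items)))
```

With AsyncTypecast, func could be a small coroutine that builds a TTSRequest for one piece of text and awaits client.text_to_speech on it. Capping concurrency keeps a large batch from sending every request at the same moment.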

Streaming

Stream audio chunks in real-time for low-latency playback:
# pip install requests sounddevice
import sounddevice as sd
from typecast import Typecast
from typecast.models import TTSRequestStream, OutputStream

client = Typecast()

request = TTSRequestStream(
    text="Stream this text as audio in real time.",
    model="ssfm-v30",
    voice_id="tc_672c5f5ce59fac2a48faeaee",
    output=OutputStream(audio_format="wav")
)

with sd.RawOutputStream(samplerate=32000, channels=1, dtype="int16") as player:
    buf, first = bytearray(), True
    for chunk in client.text_to_speech_stream(request):
        if first:
            chunk = chunk[44:]  # Skip 44-byte WAV header
            first = False
        buf.extend(chunk)
        n = len(buf) - (len(buf) % 2)  # int16 alignment
        if n:
            player.write(bytes(buf[:n]))
            del buf[:n]
WAV streaming format: 32000 Hz, 16-bit, mono PCM. The first chunk includes a 44-byte WAV header (size = 0xFFFFFFFF); subsequent chunks are raw PCM only.
MP3 streaming format: 320 kbps, 44100 Hz; each chunk is independently decodable.
The streaming endpoint does not support volume or target_lufs.
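
Because the streamed WAV header carries a placeholder size, writing the chunks straight to disk may produce a file that some players reject. The following is a minimal sketch that drops the 44-byte header and lets the standard-library wave module write a header with the real length. save_wav_stream is a hypothetical helper, not part of the SDK, and it assumes the WAV stream parameters stated above (32000 Hz, 16-bit, mono).

```python
import wave
from typing import Iterable

WAV_HEADER_BYTES = 44   # streaming header size stated in the docs above
SAMPLE_RATE = 32000     # Hz
SAMPLE_WIDTH = 2        # bytes per sample (16-bit)
CHANNELS = 1            # mono


def save_wav_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write streamed WAV chunks to `path`; return the number of PCM bytes written."""
    to_skip = WAV_HEADER_BYTES
    pending = bytearray()
    written = 0
    with wave.open(path, "wb") as out:
        out.setnchannels(CHANNELS)
        out.setsampwidth(SAMPLE_WIDTH)
        out.setframerate(SAMPLE_RATE)
        for chunk in chunks:
            if to_skip:
                # Drop the streaming header, even if it spans several chunks
                dropped = min(to_skip, len(chunk))
                chunk = chunk[dropped:]
                to_skip -= dropped
            pending.extend(chunk)
            # Only write whole 16-bit samples; keep any odd byte for the next chunk
            usable = len(pending) - (len(pending) % SAMPLE_WIDTH)
            if usable:
                out.writeframesraw(bytes(pending[:usable]))
                written += usable
                del pending[:usable]
    # Closing the file patches the header with the actual data length
    return written
```

It could be called as save_wav_stream(client.text_to_speech_stream(request), "stream.wav"), using the request from the example above.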

Supported Languages

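The SDK supports 37 languages, identified by ISO 639-3 codes. The LanguageCode enum is the recommended, type-safe way to select one; you can also pass the ISO 639-3 code as a string (e.g., "eng").

| Language | Code | Language | Code | Language | Code |
|----------|------|----------|------|----------|------|
| English | eng | Japanese | jpn | Ukrainian | ukr |
| Korean | kor | Greek | ell | Indonesian | ind |
| Spanish | spa | Tamil | tam | Danish | dan |
| German | deu | Tagalog | tgl | Swedish | swe |
| French | fra | Finnish | fin | Malay | msa |
| Italian | ita | Chinese | zho | Czech | ces |
| Polish | pol | Slovak | slk | Portuguese | por |
| Dutch | nld | Arabic | ara | Bulgarian | bul |
| Russian | rus | Croatian | hrv | Romanian | ron |
| Bengali | ben | Hindi | hin | Hungarian | hun |
| Hokkien | nan | Norwegian | nor | Punjabi | pan |
| Thai | tha | Turkish | tur | Vietnamese | vie |
| Cantonese | yue | | | | |
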
For example, to request English with the LanguageCode enum:
from typecast import Typecast
from typecast.models import TTSRequest, LanguageCode

client = Typecast()

response = client.text_to_speech(TTSRequest(
    text="Hello",
    model="ssfm-v30",
    voice_id="tc_672c5f5ce59fac2a48faeaee",
    language=LanguageCode.ENG
))

Error Handling

The SDK provides specific exceptions for different HTTP status codes:
from typecast import (
    Typecast,
    TypecastError,
    BadRequestError,
    UnauthorizedError,
    PaymentRequiredError,
    NotFoundError,
    UnprocessableEntityError,
    RateLimitError,
    InternalServerError,
)
from typecast.models import TTSRequest

client = Typecast()
request = TTSRequest(
    text="Hello",
    model="ssfm-v30",
    voice_id="tc_672c5f5ce59fac2a48faeaee"
)

try:
    response = client.text_to_speech(request)
except UnauthorizedError:
    print("Invalid API key")
except PaymentRequiredError:
    print("Insufficient credits")
except RateLimitError:
    print("Rate limit exceeded - please retry later")
except TypecastError as e:
    # Catch-all for any other SDK error
    print(f"Error {e.status_code}: {e.message}")

| Exception | Status Code | Description |
|-----------|-------------|-------------|
| BadRequestError | 400 | Invalid request parameters |
| UnauthorizedError | 401 | Invalid or missing API key |
| PaymentRequiredError | 402 | Insufficient credits |
| NotFoundError | 404 | Resource not found |
| UnprocessableEntityError | 422 | Validation error |
| RateLimitError | 429 | Rate limit exceeded |
| InternalServerError | 500 | Server error |
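
A RateLimitError (429) is usually transient, so one common response is to retry after a growing delay. The following is a minimal sketch of exponential backoff using only the standard library. call_with_retry is a hypothetical helper, not part of the SDK, and the attempt count and delays are illustrative assumptions, not documented recommendations.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

With the SDK this might be used as call_with_retry(lambda: client.text_to_speech(request), (RateLimitError,)). Errors such as UnauthorizedError or PaymentRequiredError are left out of retry_on on purpose, since repeating the same request is unlikely to fix them.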