Latest Model: ssfm-v30
Our newest Speech Synthesis Foundation Model (ssfm-v30) delivers the most natural and expressive speech synthesis yet, with significant improvements in prosody, pacing, and emotional expression.
Key Features
Smart Emotion
Automatically detects the appropriate emotion from text context and applies it to the voice. Simply provide surrounding text, and the model infers the optimal emotional tone.
7 Emotion Presets
Choose from normal, happy, sad, angry, whisper, toneup, and tonedown presets to fine-tune your voice output.
37 Languages
Support for 37 languages including Korean, English, Japanese, Chinese, Spanish, Vietnamese, and many more.
Universal Emotion Support
All emotion presets are available across all voices, giving you consistent control over emotional expression.
Emotion Control Options
ssfm-v30 offers two ways to control emotional expression:
| Option | Description | Best For |
|---|---|---|
| Smart Emotion | AI automatically infers emotion from text context | Dialogue, storytelling, natural conversations |
| Preset Emotion | Manually select emotion preset and intensity | Precise control, specific emotional requirements |
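As a rough illustration of the two options in the table, the sketch below builds an emotion-settings dictionary for each mode. The function name `build_emotion_options` and every field name (`emotion_type`, `emotion_preset`, `emotion_intensity`, `previous_text`, `next_text`) are hypothetical placeholders chosen for this example, not confirmed Typecast API parameters; only the seven preset names come from this page. Check the API reference for the actual request schema.

```python
from typing import Optional

# The seven presets listed on this page.
PRESETS = ("normal", "happy", "sad", "angry", "whisper", "toneup", "tonedown")


def build_emotion_options(
    mode: str,
    preset: Optional[str] = None,
    intensity: Optional[float] = None,
    previous_text: Optional[str] = None,
    next_text: Optional[str] = None,
) -> dict:
    """Build an emotion-settings dict. All keys are hypothetical field names."""
    if mode == "smart":
        # Smart Emotion: the model infers the tone from surrounding text.
        options = {"emotion_type": "smart"}
        if previous_text is not None:
            options["previous_text"] = previous_text
        if next_text is not None:
            options["next_text"] = next_text
        return options

    if mode == "preset":
        # Preset Emotion: the caller picks the preset and, optionally, an intensity.
        if preset not in PRESETS:
            raise ValueError(f"unknown preset {preset!r}; expected one of {PRESETS}")
        options = {"emotion_type": "preset", "emotion_preset": preset}
        if intensity is not None:
            # The valid intensity range is not stated on this page, so it is passed through unchecked.
            options["emotion_intensity"] = intensity
        return options

    raise ValueError("mode must be 'smart' or 'preset'")
```

For dialogue, something like `build_emotion_options("smart", previous_text="I got the job!")` would leave the tone to the model, while `build_emotion_options("preset", preset="whisper")` pins it explicitly.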
Supported Languages (37)
English, Korean, Arabic, Bengali, Bulgarian, Cantonese, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, Finnish, French, German, Greek, Hindi, Hungarian, Indonesian, Italian, Japanese, Malay, Min Nan, Norwegian, Polish, Portuguese, Punjabi, Romanian, Russian, Slovak, Spanish, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Vietnamese
Use Cases
Typecast’s API is designed for a wide range of applications:
- Conversational AI - Build natural-sounding chatbots and voice assistants
- Video Production - Generate professional voiceovers for videos and documentaries
- Advertising - Produce compelling ad voiceovers quickly
- E-learning & Education - Create engaging educational content with natural narration
- Podcasts & Broadcasting - Create consistent, high-quality audio content
- Game Development - Add dynamic character voices to games
- Audiobooks & Storytelling - Produce expressive audiobook narration with emotional depth
Audio Output Specifications
| Format | Codec | Bit Depth | Channels | Sample Rate | Bitrate |
|---|---|---|---|---|---|
| WAV | PCM (Uncompressed) | 16-bit | Mono | 44,100 Hz | N/A |
| MP3 | MPEG Layer III | N/A | Mono | 44,100 Hz | 320 kbps |
WAV format provides higher quality audio suitable for professional production, while MP3 offers smaller file sizes ideal for web streaming and distribution.
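To make the size trade-off concrete, the following sketch estimates the audio payload size for each format from the values in the table above. It assumes a constant 320 kbps MP3 bitrate and ignores container overhead such as the WAV header and any MP3 metadata tags, so real files will be slightly larger. The function names are illustrative only.

```python
# Values taken from the Audio Output Specifications table.
SAMPLE_RATE_HZ = 44_100
BIT_DEPTH = 16
CHANNELS = 1  # mono
MP3_BITRATE_BPS = 320_000


def wav_payload_bytes(seconds: float) -> int:
    """Uncompressed PCM size: samples per second x bytes per sample x channels."""
    return int(seconds * SAMPLE_RATE_HZ * (BIT_DEPTH // 8) * CHANNELS)


def mp3_payload_bytes(seconds: float) -> int:
    """Approximate MP3 size, assuming a constant bitrate."""
    return int(seconds * MP3_BITRATE_BPS / 8)


if __name__ == "__main__":
    one_minute = 60
    print(wav_payload_bytes(one_minute))  # 5292000 bytes, about 5.3 MB
    print(mp3_payload_bytes(one_minute))  # 2400000 bytes, about 2.4 MB
```

Under these assumptions, one minute of mono audio is roughly 5.3 MB as WAV and 2.4 MB as MP3, so the MP3 output is a little under half the size.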