The official Dart and Flutter SDK for the Typecast API. Convert text to lifelike speech, stream audio, generate timestamps, discover voices, and create custom voices from Dart or Flutter applications.


Installation

Install from pub.dev:
dart pub add typecast_dart
For Flutter projects:
flutter pub add typecast_dart
Use typecast_dart 0.1.0 or higher. For production Flutter apps, avoid embedding a long-lived API key in a distributed client. Route requests through your backend when the API key must remain private.

Quick Start

import 'dart:io';
import 'package:typecast_dart/typecast_dart.dart';

Future<void> main() async {
  final client = TypecastClient(
    apiKey: Platform.environment['TYPECAST_API_KEY'],
  );

  final response = await client.textToSpeech(
    const TtsRequest(
      voiceId: 'tc_672c5f5ce59fac2a48faeaee',
      text: "Hello there! I'm your friendly text-to-speech agent.",
      model: TtsModel.ssfmV30,
      language: LanguageCode.eng,
      output: Output(audioFormat: AudioFormat.wav),
    ),
  );

  await File('output.wav').writeAsBytes(response.audioData);
  print('Duration: ${response.duration}s, Format: ${response.format.value}');
}

Features

  • Multiple Voice Models: Support for ssfm-v30 and ssfm-v21 AI voice models
  • Multi-language Support: 37 languages including English, Korean, Japanese, Chinese, Spanish, and more
  • Emotion Control: Preset emotions or smart context-aware inference
  • Audio Customization: Control loudness, pitch, tempo, and output format
  • Voice Discovery: V2 Voices API with filtering by model, gender, age, and use cases
  • Streaming: Access the streaming TTS endpoint as a Dart Stream<List<int>>
  • Timestamp TTS: Word- and character-level alignment data with SRT/VTT helpers
  • Instant Voice Cloning: Upload a WAV sample and create a custom voice ID
  • Dart and Flutter: Use the same package in Dart CLI, server, and Flutter projects

Configuration

Pass your API key to the TypecastClient constructor via apiKey, or export it as an environment variable and read it at startup:
export TYPECAST_API_KEY="your-api-key-here"
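Platform.environment returns a nullable value for each key, so a missing variable can go unnoticed until the first request fails. The sketch below fails fast instead. The helper name requireApiKey is hypothetical and is not part of typecast_dart; only the TYPECAST_API_KEY variable name comes from this page.

```dart
/// Returns the Typecast API key from [env], trimmed of surrounding whitespace.
/// Throws a [StateError] if the variable is missing or blank.
/// NOTE: `requireApiKey` is a hypothetical helper, not an SDK function.
String requireApiKey(Map<String, String> env) {
  final key = env['TYPECAST_API_KEY']?.trim();
  if (key == null || key.isEmpty) {
    throw StateError('TYPECAST_API_KEY is not set.');
  }
  return key;
}
```

In a CLI or server program this could be wired in as TypecastClient(apiKey: requireApiKey(Platform.environment)), with dart:io imported for Platform.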

Advanced Usage

Emotion Control (ssfm-v30)

ssfm-v30 offers two emotion control modes: Preset and Smart.
In Smart mode, the AI infers emotion from the surrounding context:
final response = await client.textToSpeech(
  const TtsRequest(
    voiceId: 'tc_672c5f5ce59fac2a48faeaee',
    text: 'Everything is going to be okay.',
    model: TtsModel.ssfmV30,
    prompt: SmartPrompt(
      previousText: 'I just got the best news!',
      nextText: "I can't wait to celebrate!",
    ),
  ),
);

Audio Customization

Control loudness, pitch, tempo, and output format:
final response = await client.textToSpeech(
  const TtsRequest(
    voiceId: 'tc_672c5f5ce59fac2a48faeaee',
    text: 'Customized audio output!',
    model: TtsModel.ssfmV30,
    output: Output(
      targetLufs: -14.0,
      audioPitch: 2,
      audioTempo: 1.2,
      audioFormat: AudioFormat.mp3,
    ),
    seed: 42,
  ),
);

await File('output.mp3').writeAsBytes(response.audioData);

Voice Discovery (V2 API)

List and filter available voices with enhanced metadata:
final voices = await client.getVoicesV2();

final filtered = await client.getVoicesV2(
  const VoicesV2Filter(
    model: TtsModel.ssfmV30,
    gender: 'female',
    age: 'young_adult',
  ),
);

for (final voice in voices) {
  print('ID: ${voice.voiceId}, Name: ${voice.voiceName}');
  print('Gender: ${voice.gender}, Age: ${voice.age}');
  print('Models: ${voice.models.map((model) => model.version).join(', ')}');
}

final voice = await client.getVoiceV2('tc_672c5f5ce59fac2a48faeaee');
print(voice.voiceName);

Streaming

Consume streaming audio as a Dart stream:
final stream = await client.textToSpeechStream(
  const TtsRequestStream(
    voiceId: 'tc_672c5f5ce59fac2a48faeaee',
    text: 'Stream this text as audio in real time.',
    model: TtsModel.ssfmV30,
    output: OutputStream(audioFormat: AudioFormat.wav),
  ),
);

final file = File('stream.wav').openWrite();
await for (final chunk in stream) {
  file.add(chunk);
}
await file.close();
WAV streaming format: 32000 Hz, 16-bit, mono PCM. The first chunk includes a 44-byte WAV header (size = 0xFFFFFFFF); subsequent chunks are raw PCM only. The streaming endpoint does not support volume or target_lufs.
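Because the streamed header carries a placeholder size, some players or tools may reject or misreport the saved file. The following is a minimal sketch of fixing the sizes once the full stream has been collected. It assumes the canonical 44-byte PCM header layout, with the RIFF chunk size at byte offset 4 and the data chunk size at byte offset 40; the function names are illustrative and are not part of typecast_dart.

```dart
import 'dart:typed_data';

/// Bytes in the canonical WAV header that precedes the PCM data.
const int wavHeaderLength = 44;

/// Returns a copy of [wav] whose RIFF and data chunk sizes match its length.
/// Assumes the canonical 44-byte header (size fields at offsets 4 and 40).
Uint8List patchWavSizes(Uint8List wav) {
  if (wav.length < wavHeaderLength) {
    throw ArgumentError('Expected at least $wavHeaderLength bytes.');
  }
  final patched = Uint8List.fromList(wav);
  final view = ByteData.sublistView(patched);
  // RIFF chunk size: everything after the 8-byte "RIFF" + size prefix.
  view.setUint32(4, wav.length - 8, Endian.little);
  // Data chunk size: everything after the header.
  view.setUint32(40, wav.length - wavHeaderLength, Endian.little);
  return patched;
}

/// Duration in seconds of 32000 Hz, 16-bit, mono PCM following the header.
double pcmDurationSeconds(int totalBytes) {
  const bytesPerSecond = 32000 * 2; // sample rate * 2 bytes per sample * 1 channel
  return (totalBytes - wavHeaderLength) / bytesPerSecond;
}
```

One way to use this is to collect the chunks into a BytesBuilder instead of writing them directly, then write patchWavSizes(builder.toBytes()) to disk. For live playback the placeholder header can usually be left untouched.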

Timestamp TTS

textToSpeechWithTimestamps() wraps POST /v1/text-to-speech/with-timestamps and returns audio together with word- or character-level alignment data.
final result = await client.textToSpeechWithTimestamps(
  const TtsRequest(
    voiceId: 'tc_60e5426de8b95f1d3000d7b5',
    text: 'Hello. How are you?',
    model: TtsModel.ssfmV30,
  ),
);

await result.saveAudio('output.wav');
print('Duration: ${result.audioDuration}s');

for (final word in result.words) {
  print('[${word.startTime}s - ${word.endTime}s] ${word.word}');
}

Granularity

Pass granularity: 'word' (default) or granularity: 'char' to control the alignment unit.
final result = await client.textToSpeechWithTimestamps(
  const TtsRequest(
    voiceId: 'tc_60e5426de8b95f1d3000d7b5',
    text: 'Hello. How are you?',
    model: TtsModel.ssfmV30,
  ),
  granularity: 'char',
);

Subtitle Export

Export the alignment data as SRT or WebVTT subtitles:
await File('output.srt').writeAsString(result.toSrt());
await File('output.vtt').writeAsString(result.toVtt());
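If you need custom cue grouping rather than the built-in export, the start and end times in seconds can be formatted by hand. This is an illustrative sketch of the standard SRT cue layout (index, HH:MM:SS,mmm time range, text), not a description of how toSrt() is implemented; formatSrtTime and srtCue are hypothetical names.

```dart
/// Formats [seconds] as an SRT timestamp (HH:MM:SS,mmm).
String formatSrtTime(double seconds) {
  final totalMs = (seconds * 1000).round();
  String pad(int value, int width) => value.toString().padLeft(width, '0');
  final hours = totalMs ~/ 3600000;
  final minutes = (totalMs ~/ 60000) % 60;
  final secs = (totalMs ~/ 1000) % 60;
  final millis = totalMs % 1000;
  return '${pad(hours, 2)}:${pad(minutes, 2)}:${pad(secs, 2)},${pad(millis, 3)}';
}

/// Builds one SRT cue from a 1-based [index], a time range in seconds,
/// and the cue [text].
String srtCue(int index, double start, double end, String text) =>
    '$index\n${formatSrtTime(start)} --> ${formatSrtTime(end)}\n$text\n';
```

WebVTT uses a period instead of a comma before the milliseconds, so the same approach carries over with a one-character change.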

Instant Voice Cloning

Upload a short WAV sample to create a custom voice:
final voice = await client.cloneVoice(
  audio: await File('sample.wav').readAsBytes(),
  filename: 'sample.wav',
  name: 'My Voice',
  model: TtsModel.ssfmV30,
);

print('Custom voice ID: ${voice.voiceId}');
Voice cloning audio must be 25 MB or smaller, and the custom voice name must be 1-30 characters.
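Checking these limits locally avoids uploading a sample that the API would reject. The sketch below is one way to do it, under two assumptions that this page does not confirm: that "25 MB" means 25 * 1024 * 1024 bytes, and that name length is counted in Unicode code points. validateCloneInput is a hypothetical helper, not an SDK function.

```dart
/// Upper bound for the sample, ASSUMING "25 MB" means 25 * 1024 * 1024 bytes.
const int maxCloneAudioBytes = 25 * 1024 * 1024;

/// Returns a list of problems; an empty list means the inputs look valid.
List<String> validateCloneInput({
  required int audioBytes,
  required String name,
}) {
  final problems = <String>[];
  if (audioBytes <= 0) {
    problems.add('Audio sample is empty.');
  } else if (audioBytes > maxCloneAudioBytes) {
    problems.add('Audio sample exceeds 25 MB.');
  }
  // ASSUMPTION: count code points rather than UTF-16 code units.
  final nameLength = name.runes.length;
  if (nameLength < 1 || nameLength > 30) {
    problems.add('Voice name must be 1-30 characters.');
  }
  return problems;
}
```

Call it with the byte count of the file you are about to upload and the intended name, and only proceed to cloneVoice when the returned list is empty.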