Dart/Flutter

The official Dart and Flutter SDK for the Typecast API. Convert text to lifelike speech, stream audio, generate timestamps, discover voices, and create custom voices from Dart or Flutter applications.

pub.dev: Typecast Dart SDK
Source Code: Typecast Dart SDK

Installation

Install from pub.dev:

dart pub add typecast_dart

Latest registered version: 0.1.7 on pub.dev.

For Flutter projects:

flutter pub add typecast_dart
flutter pub add audioplayers

Use typecast_dart 0.1.7 or higher. For production Flutter apps, avoid embedding a long-lived API key in a distributed client. Route requests through your backend when the API key must remain private.

Quick Start

import 'package:audioplayers/audioplayers.dart';
import 'package:typecast_dart/typecast_dart.dart';

final client = TypecastClient(apiKey: 'YOUR_API_KEY');
final player = AudioPlayer();

Future<void> speakAndPlay() async {
  final response = await client.textToSpeech(
    const TtsRequest(
      voiceId: 'tc_672c5f5ce59fac2a48faeaee', // Find voice IDs at https://studio.typecast.ai/developers/api/voices
      text: "Hello there! I'm your friendly text-to-speech agent.",
      model: TtsModel.ssfmV30,
      language: LanguageCode.eng,
      output: Output(audioFormat: AudioFormat.wav),
    ),
  );

  await player.play(BytesSource(response.audioData));
  print('Duration: ${response.duration}s, Format: ${response.format.value}');
}

Playback in Flutter

The Dart SDK returns generated audio as bytes. In Flutter, pass those bytes to an audio playback package such as audioplayers. Use one shared AudioPlayer instance and play each response directly from memory:

import 'package:audioplayers/audioplayers.dart';
import 'package:typecast_dart/typecast_dart.dart';

final client = TypecastClient(apiKey: 'YOUR_API_KEY');
final player = AudioPlayer();

Future<void> playTts(String text) async {
  final response = await client.textToSpeech(
    TtsRequest(
      voiceId: 'tc_672c5f5ce59fac2a48faeaee',
      text: text,
      model: TtsModel.ssfmV30,
      language: LanguageCode.eng,
      output: const Output(audioFormat: AudioFormat.wav),
    ),
  );

  await player.play(BytesSource(response.audioData));
}

For production Flutter apps, keep long-lived API keys on your backend. The Flutter app can request generated audio from your backend and still play the returned bytes with BytesSource.

Features

Multiple Voice Models: Support for ssfm-v30 and ssfm-v21 AI voice models
Multi-language Support: 37 languages including English, Korean, Japanese, Chinese, Spanish, and more
Emotion Control: Preset emotions or smart context-aware inference
Audio Customization: Control loudness, pitch, tempo, and output format
Voice Discovery: V2 Voices API with filtering by model, gender, age, and use cases
Streaming: Access the streaming TTS endpoint as a Dart Stream<List<int>>
Timestamp TTS: Word- and character-level alignment data with SRT/VTT helpers
Instant Voice Cloning: Upload a WAV sample and create a custom voice ID
Dart and Flutter: Use the same package in Dart CLI, server, and Flutter projects

Voice Recommendations

Use recommendVoices when you know the style you want but not the exact voice ID.

final voices = await client.recommendVoices(
  'warm female voice for a product tutorial',
  count: 3,
);

for (final voice in voices) {
  print('${voice.voiceId} ${voice.voiceName} ${voice.score}');
}

Recommendation results contain only voiceId, voiceName, and score. Use getVoiceV2 or getVoicesV2 when you need detailed metadata such as supported models, emotions, gender, age, or use cases.

Configuration

Set your API key through an environment variable or pass it to the constructor.

Environment variable

export TYPECAST_API_KEY="your-api-key-here"

import 'dart:io';
import 'package:typecast_dart/typecast_dart.dart';

// Platform.environment comes from dart:io, which is unavailable on Flutter web.
final client = TypecastClient(
  apiKey: Platform.environment['TYPECAST_API_KEY'],
);

Constructor

import 'package:typecast_dart/typecast_dart.dart';

final client = TypecastClient(
  apiKey: 'your-api-key-here',
);

When requests go through your own proxy, set baseUrl to the proxy endpoint and omit apiKey. The SDK will not send the X-API-KEY header for empty or missing keys. Requests to the default Typecast host still require an API key.

Proxy without API key

final client = TypecastClient(
  baseUrl: 'https://your-proxy.example.com',
);
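
The header rule described above can be pictured with a small stand-alone sketch. It is illustrative only and is not the SDK's implementation: buildHeaders is a hypothetical helper, and the Content-Type value is an assumption.

```dart
/// Hypothetical helper that mirrors the documented rule: the X-API-KEY
/// header is omitted when the key is missing or empty.
Map<String, String> buildHeaders(String? apiKey) {
  return {
    'Content-Type': 'application/json', // assumed, not taken from the SDK
    if (apiKey != null && apiKey.isNotEmpty) 'X-API-KEY': apiKey,
  };
}

void main() {
  print(buildHeaders(null)); // proxy setup: no key header
  print(buildHeaders('your-api-key-here')); // direct setup: key header present
}
```

With this rule, a proxy can attach the real key on the server side while the client sends none.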

Advanced Usage

Emotion Control (ssfm-v30)

ssfm-v30 offers two emotion control modes: Preset and Smart.

Smart Mode

Let the AI infer emotion from context:

final response = await client.textToSpeech(
  const TtsRequest(
    voiceId: 'tc_672c5f5ce59fac2a48faeaee',
    text: 'Everything is going to be okay.',
    model: TtsModel.ssfmV30,
    prompt: SmartPrompt(
      previousText: 'I just got the best news!',
      nextText: "I can't wait to celebrate!",
    ),
  ),
);

await player.play(BytesSource(response.audioData));

Preset Mode

Explicitly set emotion with preset values:

final response = await client.textToSpeech(
  const TtsRequest(
    voiceId: 'tc_672c5f5ce59fac2a48faeaee',
    text: 'I am so excited to show you these features!',
    model: TtsModel.ssfmV30,
    prompt: PresetPrompt(
      emotionPreset: EmotionPreset.happy,
      emotionIntensity: 1.5,
    ),
  ),
);

await player.play(BytesSource(response.audioData));

Audio Customization

Control loudness, pitch, tempo, and output format:

final response = await client.textToSpeech(
  const TtsRequest(
    voiceId: 'tc_672c5f5ce59fac2a48faeaee',
    text: 'Customized audio output!',
    model: TtsModel.ssfmV30,
    output: Output(
      targetLufs: -14.0,
      audioPitch: 2,
      audioTempo: 1.2,
      audioFormat: AudioFormat.mp3,
    ),
    seed: 42,
  ),
);

await player.play(BytesSource(response.audioData));

Generate audio to a file

Use generateToFile when you want the SDK to synthesize speech and write the audio bytes directly to a local file. The model defaults to ssfm-v30. When no output format is set, the format is inferred from a .mp3 or .wav file extension. Browse available voice IDs on the Voices page.

await client.generateToFile(
  'output.mp3',
  GenerateToFileRequest(
    text: 'Hello from Typecast.',
    voiceId: 'tc_672c5f5ce59fac2a48faeaee', // Find voice IDs at https://studio.typecast.ai/developers/api/voices
  ),
);
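
The extension rule can be sketched as a small helper. This is an illustration of the described behavior, not the SDK's code: inferAudioFormat is a hypothetical name, and the case-insensitive match is an assumption.

```dart
/// Hypothetical helper: returns 'mp3' or 'wav' when the path ends with that
/// extension, otherwise null. Case-insensitive matching is an assumption.
String? inferAudioFormat(String path) {
  final lower = path.toLowerCase();
  if (lower.endsWith('.mp3')) return 'mp3';
  if (lower.endsWith('.wav')) return 'wav';
  return null;
}

/// An explicitly chosen format wins; the extension is only a fallback.
String? effectiveFormat(String path, {String? explicitFormat}) {
  return explicitFormat ?? inferAudioFormat(path);
}

void main() {
  print(effectiveFormat('output.mp3'));
  print(effectiveFormat('output.mp3', explicitFormat: 'wav'));
}
```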

Text pauses

Use text pause markup when you only need silent gaps inside one composed text segment. Put <|5s|>, <|1s|>, <|0.3s|>, or <|0.34413s|> directly in the text. The value is interpreted as seconds and must end with s. This keeps the pause expression visible in plain text without adding separate pause calls.

final audio = await client
    .composeSpeech()
    .defaults(const ComposerSettings(
      voiceId: 'tc_672c5f5ce59fac2a48faeaee',
      model: TtsModel.ssfmV30,
    ))
    .say('Hello<|5s|>Nice to meet you<|1s|>Today<|2s|>how does the weather feel?')
    .generate();
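
When pause lengths are computed at runtime, a small helper can build the markup and a regular expression can total the pauses in a string. This is a sketch under assumptions: pause and totalPauseSeconds are hypothetical helpers, and the pattern covers only the forms shown above (whole or decimal seconds followed by s).

```dart
/// Matches pause markup such as <|5s|> or <|0.34413s|>.
final RegExp pauseMarkup = RegExp(r'<\|(\d+(?:\.\d+)?)s\|>');

/// Builds a pause token for the given number of seconds.
String pause(num seconds) {
  if (seconds < 0) {
    throw ArgumentError.value(seconds, 'seconds', 'must not be negative');
  }
  return '<|${seconds}s|>';
}

/// Sums every pause found in [text], in seconds.
double totalPauseSeconds(String text) {
  return pauseMarkup
      .allMatches(text)
      .fold<double>(0.0, (sum, m) => sum + double.parse(m.group(1)!));
}

void main() {
  final text = 'Hello${pause(5)}Nice to meet you${pause(1)}Today';
  print(text);
  print(totalPauseSeconds(text));
}
```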

Multi-speaker composition

Use the composer chaining API when one output file needs different voices or per-segment options such as pitch, tempo, prompt, or seed. The composer generates each segment as WAV, trims leading/trailing silent PCM samples, and concatenates the result. If you need MP3, generate WAV first and convert it in your app or server pipeline.

final audio = await client
    .composeSpeech()
    .defaults(const ComposerSettings(voiceId: 'tc_672c5f5ce59fac2a48faeaee', model: TtsModel.ssfmV30))
    .say('Hello there')
    .pause(5)
    .say(
      'Nice to meet you',
      overrides: const ComposerSettings(
        voiceId: 'tc_60e5426de8b95f1d3000d7b5',
        output: Output(audioPitch: 2),
      ),
    )
    .say('Today')
    .pause(2)
    .say('How does the weather feel?')
    .generate();

// File requires import 'dart:io';
await File('conversation.wav').writeAsBytes(audio.audioData);
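
To make the trim-and-concatenate step concrete, here is a minimal sketch over decoded 16-bit PCM samples. It is not the composer's implementation: trimSilence, joinSegments, the threshold parameter, and the gap handling are all assumptions for illustration.

```dart
import 'dart:typed_data';

/// Drops leading and trailing samples whose absolute value is at or below
/// [threshold]. Returns a view into the original buffer, not a copy.
Int16List trimSilence(Int16List samples, {int threshold = 0}) {
  var start = 0;
  var end = samples.length;
  while (start < end && samples[start].abs() <= threshold) {
    start++;
  }
  while (end > start && samples[end - 1].abs() <= threshold) {
    end--;
  }
  return Int16List.sublistView(samples, start, end);
}

/// Trims each segment, then joins them with [gapSamples] silent samples
/// between neighbours.
Int16List joinSegments(List<Int16List> segments, {int gapSamples = 0}) {
  final trimmed = segments.map((s) => trimSilence(s)).toList();
  final gaps = trimmed.isEmpty ? 0 : gapSamples * (trimmed.length - 1);
  final total = trimmed.fold<int>(0, (n, s) => n + s.length) + gaps;
  final out = Int16List(total); // zero-filled, so gaps are already silent
  var offset = 0;
  for (var i = 0; i < trimmed.length; i++) {
    out.setAll(offset, trimmed[i]);
    offset += trimmed[i].length;
    if (i < trimmed.length - 1) offset += gapSamples;
  }
  return out;
}

void main() {
  final a = Int16List.fromList([0, 0, 120, -80, 0]);
  final b = Int16List.fromList([0, 300, 0, 0]);
  print(joinSegments([a, b], gapSamples: 2));
}
```

Trimming before joining keeps the requested pause lengths accurate, because silence baked into each segment would otherwise add to every gap.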

Voice Discovery (V2 API)

List and filter available voices with enhanced metadata:

final voices = await client.getVoicesV2();

final filtered = await client.getVoicesV2(
  const VoicesV2Filter(
    model: TtsModel.ssfmV30,
    gender: 'female',
    age: 'young_adult',
  ),
);

for (final voice in voices) {
  print('ID: ${voice.voiceId}, Name: ${voice.voiceName}');
  print('Gender: ${voice.gender}, Age: ${voice.age}');
  print('Models: ${voice.models.map((model) => model.version).join(', ')}');
}

final voice = await client.getVoiceV2('tc_672c5f5ce59fac2a48faeaee');
print(voice.voiceName);

Streaming

Consume streaming audio as a Dart stream and play it without writing a file:

import 'dart:typed_data';

final stream = await client.textToSpeechStream(
  const TtsRequestStream(
    voiceId: 'tc_672c5f5ce59fac2a48faeaee',
    text: 'Stream this text as audio in real time.',
    model: TtsModel.ssfmV30,
    output: OutputStream(audioFormat: AudioFormat.wav),
  ),
);

final audioBytes = <int>[];
await for (final chunk in stream) {
  audioBytes.addAll(chunk);
}

await player.play(BytesSource(Uint8List.fromList(audioBytes)));

WAV streaming format: 32000 Hz, 16-bit, mono PCM. The first chunk includes a 44-byte WAV header (size = 0xFFFFFFFF); subsequent chunks are raw PCM only. The example above avoids file storage and plays the complete stream from memory. For true low-latency chunk-by-chunk playback, feed the PCM chunks into a streaming audio engine instead of audioplayers.
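
For engines that expect raw PCM, the header can be dropped from the first chunk, and the playback length follows from the byte count (32000 samples/s × 2 bytes × 1 channel = 64000 bytes/s). The sketch below uses hypothetical helper names and assumes the first chunk holds at least the full 44-byte header.

```dart
import 'dart:typed_data';

// Values taken from the streaming format note.
const int sampleRate = 32000;
const int bytesPerSample = 2; // 16-bit
const int channels = 1; // mono
const int wavHeaderLength = 44;

/// Strips the WAV header from the first chunk and returns the raw PCM bytes.
/// Assumes the first chunk is at least [wavHeaderLength] bytes long.
Uint8List extractPcm(List<List<int>> chunks) {
  final pcm = BytesBuilder(copy: false);
  for (var i = 0; i < chunks.length; i++) {
    final chunk = chunks[i];
    pcm.add(i == 0 ? chunk.sublist(wavHeaderLength) : chunk);
  }
  return pcm.toBytes();
}

/// Playback duration of [pcmByteLength] bytes of raw PCM.
Duration pcmDuration(int pcmByteLength) {
  const bytesPerSecond = sampleRate * bytesPerSample * channels;
  return Duration(
    microseconds:
        pcmByteLength * Duration.microsecondsPerSecond ~/ bytesPerSecond,
  );
}

void main() {
  // Two fake chunks: header plus 0.5 s of audio, then another 0.5 s.
  final chunks = [
    List<int>.filled(wavHeaderLength + 32000, 0),
    List<int>.filled(32000, 0),
  ];
  final pcm = extractPcm(chunks);
  print(pcmDuration(pcm.length));
}
```

Because the header's size field is 0xFFFFFFFF, the duration has to come from the bytes actually received rather than from the header.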

Timestamp TTS

textToSpeechWithTimestamps() wraps POST /v1/text-to-speech/with-timestamps and returns audio together with word- or character-level alignment data.

final result = await client.textToSpeechWithTimestamps(
  const TtsRequest(
    voiceId: 'tc_60e5426de8b95f1d3000d7b5',
    text: 'Hello. How are you?',
    model: TtsModel.ssfmV30,
  ),
);

await player.play(BytesSource(result.audioBytes()));
print('Duration: ${result.audioDuration}s');

for (final word in result.words) {
  print('[${word.startTime}s - ${word.endTime}s] ${word.word}');
}

Granularity

Pass granularity: 'word' (default) or granularity: 'char' to control the alignment unit.

final result = await client.textToSpeechWithTimestamps(
  const TtsRequest(
    voiceId: 'tc_60e5426de8b95f1d3000d7b5',
    text: 'Hello. How are you?',
    model: TtsModel.ssfmV30,
  ),
  granularity: 'char',
);

Subtitle Export

// File requires import 'dart:io';
await File('output.srt').writeAsString(result.toSrt());
await File('output.vtt').writeAsString(result.toVtt());
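
If you need a custom cue layout, the word timings can be turned into SRT by hand. The sketch below emits one cue per word and is not what toSrt() necessarily produces. WordTiming is a hypothetical stand-in for the SDK's word entries, which expose word, startTime, and endTime.

```dart
/// Hypothetical stand-in for one alignment entry (times in seconds).
class WordTiming {
  const WordTiming(this.word, this.startTime, this.endTime);
  final String word;
  final double startTime;
  final double endTime;
}

/// Formats seconds as an SRT timestamp: HH:MM:SS,mmm.
String srtTimestamp(double seconds) {
  final totalMs = (seconds * 1000).round();
  String two(int n) => n.toString().padLeft(2, '0');
  final h = totalMs ~/ 3600000;
  final m = (totalMs ~/ 60000) % 60;
  final s = (totalMs ~/ 1000) % 60;
  final ms = (totalMs % 1000).toString().padLeft(3, '0');
  return '${two(h)}:${two(m)}:${two(s)},$ms';
}

/// Emits one numbered SRT cue per word, each followed by a blank line.
String wordsToSrt(List<WordTiming> words) {
  final buffer = StringBuffer();
  for (var i = 0; i < words.length; i++) {
    final w = words[i];
    buffer
      ..writeln(i + 1)
      ..writeln('${srtTimestamp(w.startTime)} --> ${srtTimestamp(w.endTime)}')
      ..writeln(w.word)
      ..writeln();
  }
  return buffer.toString();
}

void main() {
  print(wordsToSrt(const [
    WordTiming('Hello.', 0.0, 0.42),
    WordTiming('How', 0.6, 0.8),
  ]));
}
```

WebVTT differs mainly in starting with a WEBVTT header line and using a period instead of a comma before the milliseconds.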

Instant Voice Cloning

Upload a short WAV sample to create a custom voice:

final voice = await client.cloneVoice(
  audio: await File('sample.wav').readAsBytes(),
  filename: 'sample.wav',
  name: 'My Voice',
  model: TtsModel.ssfmV30,
);

print('Custom voice ID: ${voice.voiceId}');

Voice cloning audio must be 25 MB or smaller, and the custom voice name must be 1-30 characters.
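
Checking these limits before uploading avoids a wasted request. The sketch below is a hypothetical client-side check: it assumes 1 MB means 1,048,576 bytes and counts name length in Unicode code points, neither of which the API documentation above confirms.

```dart
import 'dart:typed_data';

/// Assumes 1 MB = 1,048,576 bytes; the API may count differently.
const int maxCloneAudioBytes = 25 * 1024 * 1024;

/// Returns a list of problems; the list is empty when both limits are met.
List<String> validateCloneInput({
  required Uint8List audio,
  required String name,
}) {
  final problems = <String>[];
  if (audio.length > maxCloneAudioBytes) {
    problems.add('audio exceeds 25 MB');
  }
  // Counts Unicode code points; the server-side rule is an assumption.
  final nameLength = name.runes.length;
  if (nameLength < 1 || nameLength > 30) {
    problems.add('name must be 1-30 characters');
  }
  return problems;
}

void main() {
  final problems = validateCloneInput(audio: Uint8List(1024), name: 'My Voice');
  print(problems.isEmpty ? 'ok' : problems.join(', '));
}
```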
