The official Swift library for the Typecast API. Convert text to lifelike speech using AI-powered voices. Requires Swift 5.9+ and supports all Apple platforms: iOS, macOS, tvOS, watchOS, and visionOS.

Typecast Swift SDK

Requirements

Platform   Minimum Version
iOS        13.0+
macOS      10.15+
tvOS       13.0+
watchOS    6.0+
visionOS   1.0+
Swift      5.9+

Installation

Add the following to your Package.swift:
dependencies: [
    .package(url: "https://github.com/neosapience/typecast-sdk.git", from: "1.0.0")
]
Then add Typecast to your target dependencies:
targets: [
    .target(
        name: "YourTarget",
        dependencies: ["Typecast"]
    )
]
Make sure you have Swift 5.9 or later installed; the SDK uses Swift Concurrency (async/await), which requires this minimum version.

Quick Start

import Typecast

let client = TypecastClient(apiKey: "YOUR_API_KEY")

// Simple usage with convenience method
let audio = try await client.speak(
    "Hello! I'm your friendly text-to-speech assistant.",
    voiceId: "tc_672c5f5ce59fac2a48faeaee"
)

// Save to file
let url = URL(fileURLWithPath: "output.\(audio.format.rawValue)")
try audio.audioData.write(to: url)
print("Audio saved! Duration: \(audio.duration)s, Format: \(audio.format.rawValue)")

Features

The Typecast Swift SDK provides powerful features for text-to-speech conversion:
  • Multiple Voice Models: Support for ssfm-v30 (latest) and ssfm-v21 AI voice models
  • Multi-language Support: 37 languages including English, Korean, Spanish, Japanese, Chinese, and more
  • Emotion Control: Preset emotions (normal, happy, sad, angry, whisper, toneup, tonedown) or smart context-aware inference
  • Audio Customization: Control loudness (LUFS -70 to 0), pitch (-12 to +12 semitones), tempo (0.5x to 2.0x), and format (WAV/MP3)
  • Voice Discovery: V2 Voices API with filtering by model, gender, age, and use cases
  • Swift Concurrency: Full async/await support for modern Swift development
  • Thread-Safe: All types conform to Sendable for safe concurrent usage
  • Cross-Platform: Works on iOS, macOS, tvOS, watchOS, and visionOS
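Because the client and its response types are Sendable and the API is fully async, several clips can be synthesized in parallel with a task group. A minimal sketch using only the speak(_:voiceId:) convenience from Quick Start:

```swift
import Foundation
import Typecast

// Synthesize several lines concurrently and return their audio data
// in the original order. Tasks complete in arbitrary order, so each
// result is tagged with its index and sorted at the end.
func synthesizeAll(
    _ lines: [String],
    voiceId: String,
    client: TypecastClient
) async throws -> [Data] {
    try await withThrowingTaskGroup(of: (Int, Data).self) { group in
        for (index, line) in lines.enumerated() {
            group.addTask {
                // The Sendable client can be captured safely by concurrent tasks.
                let audio = try await client.speak(line, voiceId: voiceId)
                return (index, audio.audioData)
            }
        }
        var clips: [(Int, Data)] = []
        for try await pair in group { clips.append(pair) }
        return clips.sorted { $0.0 < $1.0 }.map(\.1)
    }
}
```

This keeps total latency close to that of the slowest single request rather than the sum of all of them.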

Configuration

Initialize the client with your API key:
import Typecast

// Direct initialization
let client = TypecastClient(apiKey: "your-api-key")

// With custom base URL
let client = TypecastClient(
    apiKey: "your-api-key",
    baseURL: "https://api.typecast.ai"
)

// Using configuration struct
let config = TypecastConfiguration(apiKey: "your-api-key")
let client = TypecastClient(configuration: config)

Advanced Usage

Emotion Control (ssfm-v30)

ssfm-v30 offers two emotion control modes: Preset, which applies a fixed emotion, and Smart, which infers the emotion from surrounding context.
To let the AI infer emotion from context (Smart mode):
let request = TTSRequest(
    voiceId: "tc_672c5f5ce59fac2a48faeaee",
    text: "Everything is going to be okay.",
    model: .ssfmV30,
    prompt: .smart(SmartPrompt(
        previousText: "I just got the best news!",  // Optional context
        nextText: "I can't wait to celebrate!"      // Optional context
    ))
)

let response = try await client.textToSpeech(request)
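The Preset-mode counterpart might look like the following. Note that PresetPrompt and its emotion/intensity parameter names are assumptions, mirrored from the speak(_:voiceId:model:emotion:intensity:) convenience method; check the SDK's TTSPrompt definition for the exact initializer.

```swift
// Preset mode: pin a fixed emotion instead of inferring it from context.
// NOTE: PresetPrompt and its parameter names are assumptions; consult the
// SDK's TTSPrompt definition for the exact spelling.
let presetRequest = TTSRequest(
    voiceId: "tc_672c5f5ce59fac2a48faeaee",
    text: "Everything is going to be okay.",
    model: .ssfmV30,
    prompt: .preset(PresetPrompt(
        emotion: .happy,   // normal, happy, sad, angry, whisper, toneup, tonedown
        intensity: 1.5     // assumed intensity knob, as in speak(_:voiceId:model:emotion:intensity:)
    ))
)

let presetResponse = try await client.textToSpeech(presetRequest)
```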

Audio Customization

Control loudness, pitch, tempo, and output format:
let request = TTSRequest(
    voiceId: "tc_672c5f5ce59fac2a48faeaee",
    text: "Customized audio output!",
    model: .ssfmV30,
    output: OutputSettings(
        targetLufs: -14.0,     // Range: -70 to 0 (LUFS)
        audioPitch: 2,        // Range: -12 to +12 semitones
        audioTempo: 1.2,      // Range: 0.5x to 2.0x
        audioFormat: .mp3     // Options: .wav, .mp3
    ),
    seed: 42  // For reproducible results
)

let response = try await client.textToSpeech(request)

try response.audioData.write(to: URL(fileURLWithPath: "output.\(response.format.rawValue)"))
print("Duration: \(response.duration)s, Format: \(response.format.rawValue)")

Voice Discovery (V2 API)

List and filter available voices with enhanced metadata:
// Get all voices
let voices = try await client.getVoices()

// Filter by criteria
let filteredVoices = try await client.getVoices(filter: VoicesV2Filter(
    model: .ssfmV30,
    gender: .female,
    age: .youngAdult
))

// Get a specific voice
let voice = try await client.getVoice(voiceId: "tc_672c5f5ce59fac2a48faeaee")

// Display voice info
print("ID: \(voice.voiceId), Name: \(voice.voiceName)")
print("Gender: \(voice.gender?.rawValue ?? "N/A"), Age: \(voice.age?.rawValue ?? "N/A")")

for model in voice.models {
    print("Model: \(model.version.rawValue), Emotions: \(model.emotions.joined(separator: ", "))")
}

if let useCases = voice.useCases {
    print("Use cases: \(useCases.joined(separator: ", "))")
}
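Discovery pairs naturally with synthesis: pick a voice from the filtered list and pass its ID to a TTS call. A sketch using only the calls and fields shown above, assuming getVoices returns an array:

```swift
// Pick the first ssfm-v30 female voice and synthesize a greeting with it.
let candidates = try await client.getVoices(filter: VoicesV2Filter(
    model: .ssfmV30,
    gender: .female
))

if let chosen = candidates.first {
    let audio = try await client.speak(
        "Hello from \(chosen.voiceName)!",
        voiceId: chosen.voiceId
    )
    print("Synthesized \(audio.duration)s of audio with \(chosen.voiceName)")
}
```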

Multilingual Content

The SDK supports 37 languages with automatic language detection:
// Auto-detect language (recommended)
let request = TTSRequest(
    voiceId: "tc_672c5f5ce59fac2a48faeaee",
    text: "こんにちは。お元気ですか。",
    model: .ssfmV30
)

let response = try await client.textToSpeech(request)

// Or specify language explicitly
let koreanRequest = TTSRequest(
    voiceId: "tc_672c5f5ce59fac2a48faeaee",
    text: "안녕하세요. 반갑습니다.",
    model: .ssfmV30,
    language: .korean  // Explicit language code
)

let koreanResponse = try await client.textToSpeech(koreanRequest)

try koreanResponse.audioData.write(to: URL(fileURLWithPath: "output.wav"))

Supported Languages

The SDK supports the following 37 languages:
Code          Language     Code          Language     Code          Language
.english      English      .japanese     Japanese     .ukrainian    Ukrainian
.korean       Korean       .greek        Greek        .indonesian   Indonesian
.spanish      Spanish      .tamil        Tamil        .danish       Danish
.german       German       .tagalog      Tagalog      .swedish      Swedish
.french       French       .finnish      Finnish      .malay        Malay
.italian      Italian      .chinese      Chinese      .czech        Czech
.polish       Polish       .slovak       Slovak       .portuguese   Portuguese
.dutch        Dutch        .arabic       Arabic       .bulgarian    Bulgarian
.russian      Russian      .croatian     Croatian     .romanian     Romanian
.bengali      Bengali      .hindi        Hindi        .hungarian    Hungarian
.minNan       Hokkien      .norwegian    Norwegian    .punjabi      Punjabi
.thai         Thai         .turkish      Turkish      .vietnamese   Vietnamese
.cantonese    Cantonese
If not specified, the language will be automatically detected from the input text.

Error Handling

The SDK provides a comprehensive TypecastError enum for handling API errors:
import Typecast

do {
    let response = try await client.textToSpeech(request)
} catch let error as TypecastError {
    switch error {
    case .badRequest(let message):
        // 400: Invalid request parameters
        print("Bad request: \(message)")
    case .unauthorized(let message):
        // 401: Invalid API key
        print("Invalid API key: \(message)")
    case .paymentRequired(let message):
        // 402: Insufficient credits
        print("Insufficient credits: \(message)")
    case .notFound(let message):
        // 404: Resource not found
        print("Voice not found: \(message)")
    case .validationError(let message):
        // 422: Validation error
        print("Validation error: \(message)")
    case .rateLimitExceeded(let message):
        // 429: Rate limit exceeded
        print("Rate limit exceeded: \(message)")
    case .serverError(let message):
        // 500: Server error
        print("Server error: \(message)")
    case .networkError(let underlyingError):
        // Network connectivity issues
        print("Network error: \(underlyingError.localizedDescription)")
    case .invalidResponse(let message):
        // Invalid response from server
        print("Invalid response: \(message)")
    default:
        print("Error: \(error.localizedDescription)")
    }
    
    // Access status code if available
    if let statusCode = error.statusCode {
        print("HTTP Status: \(statusCode)")
    }
}

Error Types

Error                Status Code   Description
.badRequest          400           Invalid request parameters
.unauthorized        401           Invalid or missing API key
.paymentRequired     402           Insufficient credits
.notFound            404           Resource not found
.validationError     422           Validation error
.rateLimitExceeded   429           Rate limit exceeded
.serverError         500           Server error
.networkError        -             Network connectivity issues
.invalidResponse     -             Invalid response from server
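A common pattern is to retry only on rate limiting (429) with exponential backoff, letting every other error propagate. A minimal sketch built on the TypecastError cases above, assuming .rateLimitExceeded carries a message as shown in the switch example:

```swift
import Foundation
import Typecast

// Retry a TTS request on HTTP 429, backing off exponentially (1s, 2s, 4s, ...).
// All other errors are thrown immediately.
func textToSpeechWithRetry(
    _ request: TTSRequest,
    client: TypecastClient,
    maxAttempts: Int = 3
) async throws -> TTSResponse {
    for attempt in 1...maxAttempts {
        do {
            return try await client.textToSpeech(request)
        } catch let error as TypecastError {
            // Only retry rate-limit errors, and only while attempts remain.
            guard case .rateLimitExceeded = error, attempt < maxAttempts else {
                throw error
            }
            let seconds = pow(2.0, Double(attempt - 1))
            try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
        }
    }
    fatalError("unreachable: the loop always returns or throws")
}
```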

Platform-Specific Usage

iOS

import Typecast
import AVFoundation

class TTSManager {
    private let client = TypecastClient(apiKey: "YOUR_API_KEY")
    private var audioPlayer: AVAudioPlayer?
    
    func speak(_ text: String) async throws {
        let audio = try await client.speak(text, voiceId: "tc_672c5f5ce59fac2a48faeaee")
        
        // Play audio directly from data
        audioPlayer = try AVAudioPlayer(data: audio.audioData)
        audioPlayer?.play()
    }
}

macOS

import Typecast
import AppKit
import AVFoundation

class MacTTSManager {
    private let client = TypecastClient(apiKey: "YOUR_API_KEY")
    private var audioPlayer: AVAudioPlayer?
    
    func speak(_ text: String) async throws {
        let audio = try await client.speak(text, voiceId: "tc_672c5f5ce59fac2a48faeaee")
        
        audioPlayer = try AVAudioPlayer(data: audio.audioData)
        audioPlayer?.play()
    }
    
    func saveWithPanel(_ text: String) async throws {
        let audio = try await client.speak(text, voiceId: "tc_672c5f5ce59fac2a48faeaee")
        
        let savePanel = NSSavePanel()
        savePanel.allowedContentTypes = [.audio]
        savePanel.nameFieldStringValue = "speech.\(audio.format.rawValue)"
        
        if savePanel.runModal() == .OK, let url = savePanel.url {
            try audio.audioData.write(to: url)
        }
    }
}

watchOS

import Typecast
import WatchKit

class WatchTTSManager {
    private let client = TypecastClient(apiKey: "YOUR_API_KEY")
    
    func speak(_ text: String) async throws {
        let audio = try await client.speak(text, voiceId: "tc_672c5f5ce59fac2a48faeaee")
        
        // Save to temporary file and play
        let tempURL = FileManager.default.temporaryDirectory
            .appendingPathComponent("speech.\(audio.format.rawValue)")
        try audio.audioData.write(to: tempURL)
        
        // WKAudioFilePlayer is deprecated since watchOS 5;
        // AVAudioPlayer (AVFoundation) is the modern alternative.
        let asset = WKAudioFileAsset(url: tempURL)
        let playerItem = WKAudioFilePlayerItem(asset: asset)
        let player = WKAudioFilePlayer(playerItem: playerItem)
        player.play()
    }
}

API Reference

TypecastClient Methods

Method                                      Description
textToSpeech(_:)                            Convert text to speech audio
speak(_:voiceId:model:)                     Simple TTS with minimal parameters
speak(_:voiceId:model:emotion:intensity:)   TTS with emotion preset
getVoices(filter:)                          Get available voices with optional filter
getVoice(voiceId:)                          Get a specific voice by ID

TTSRequest Fields

Field      Type             Required   Description
voiceId    String           Yes        Voice ID (format: tc_*)
text       String           Yes        Text to synthesize (max 2000 chars)
model      TTSModel         Yes        TTS model (.ssfmV21 or .ssfmV30)
language   LanguageCode     No         Language code (auto-detected if omitted)
prompt     TTSPrompt        No         Emotion settings (.basic, .preset, or .smart)
output     OutputSettings   No         Audio output settings
seed       Int              No         Random seed for reproducibility

TTSResponse Fields

Field       Type           Description
audioData   Data           Generated audio data
duration    TimeInterval   Audio duration in seconds
format      AudioFormat    Audio format (.wav or .mp3)