Pipecat is an open-source framework for building real-time, multimodal AI voice agents. With the Typecast TTS integration, you can add high-quality neural voices with emotion control to your voice AI pipelines.

What is Pipecat?

Pipecat is a Python framework that simplifies building voice AI applications. It connects various services (speech-to-text, LLMs, text-to-speech) into a unified pipeline, handling the complexity of real-time audio streaming, turn-taking, and transport protocols. A typical Pipecat pipeline looks like this:
User Audio → STT → LLM → TTS → Bot Audio
The Typecast TTS service (pipecat-ai-typecast) integrates seamlessly into this pipeline, converting LLM responses into expressive speech.

What You Can Do

With the Typecast Pipecat integration, you can:
  • Build voice AI agents with natural, expressive voices
  • Choose from 500+ voices with different genders, ages, and styles
  • Apply emotions (happy, sad, angry, whisper, and more)
  • Use Smart Emotion for context-aware voice synthesis
  • Deploy anywhere — Daily, Twilio, or native WebRTC

Prerequisites

Before you start, make sure you have:
  • Python 3.10+
  • Pipecat v0.0.94+
  • A Typecast API key (get yours from the Typecast Dashboard)

Installation

Install the Typecast TTS service for Pipecat:
pip install pipecat-ai-typecast
Using uv? Run uv add pipecat-ai-typecast instead.

Quick Start

Here’s a minimal example of integrating Typecast TTS into a Pipecat pipeline. Run it inside an async function; the stt, llm, transport, and context_aggregator objects are set up as shown in the Complete Example below:
import os
import aiohttp
from pipecat.pipeline.pipeline import Pipeline
from pipecat_typecast import TypecastTTSService

async with aiohttp.ClientSession() as session:
    # Initialize Typecast TTS
    tts = TypecastTTSService(
        aiohttp_session=session,
        api_key=os.getenv("TYPECAST_API_KEY"),
        voice_id=os.getenv("TYPECAST_VOICE_ID", "tc_672c5f5ce59fac2a48faeaee"),
    )

    # Build your pipeline
    pipeline = Pipeline([
        transport.input(),               # User audio input
        stt,                             # Speech-to-text
        context_aggregator.user(),       # Add user text to context
        llm,                             # LLM generates response
        tts,                             # Typecast TTS synthesis
        transport.output(),              # Stream audio to user
        context_aggregator.assistant(),  # Store assistant response
    ])
Set your environment variables:
  • TYPECAST_API_KEY — Your Typecast API key (required)
  • TYPECAST_VOICE_ID — Voice to use (optional, defaults to a preset voice)
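To fail fast when credentials are missing, you can read and validate these variables before constructing the service. A minimal sketch; the helper name is ours, not part of the package:

```python
import os

# Default voice ID from the Quick Start above; replace with your own.
DEFAULT_VOICE_ID = "tc_672c5f5ce59fac2a48faeaee"

def load_typecast_config() -> dict:
    """Read Typecast settings from the environment, failing early if the key is absent."""
    api_key = os.getenv("TYPECAST_API_KEY")
    if not api_key:
        raise RuntimeError("TYPECAST_API_KEY is not set")
    return {
        "api_key": api_key.strip(),  # guard against stray whitespace in the key
        "voice_id": os.getenv("TYPECAST_VOICE_ID", DEFAULT_VOICE_ID),
    }
```

The returned dict can be unpacked straight into the service constructor's api_key and voice_id arguments.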

Configuration

The TypecastTTSService supports both preset-based and context-aware emotion control.

Basic Configuration

from pipecat_typecast import TypecastTTSService

tts = TypecastTTSService(
    aiohttp_session=session,
    api_key=os.getenv("TYPECAST_API_KEY"),
    voice_id="tc_672c5f5ce59fac2a48faeaee",
    model="ssfm-v30",  # Latest model (default)
)

Preset Emotion Control

Choose from predefined emotions for consistent voice styling:
from pipecat_typecast import (
    TypecastTTSService,
    TypecastInputParams,
    PresetPromptOptions,
    OutputOptions,
)

params = TypecastInputParams(
    prompt_options=PresetPromptOptions(
        emotion_preset="happy",      # normal | happy | sad | angry | whisper | toneup | tonedown
        emotion_intensity=1.3,       # 0.0 - 2.0
    ),
    output_options=OutputOptions(
        volume=110,                  # 0 - 200 (percent)
        audio_pitch=2,               # -12 to 12 (semitones)
        audio_tempo=1.05,            # 0.5 - 2.0 (playback speed)
    ),
)

tts = TypecastTTSService(
    aiohttp_session=session,
    api_key=os.getenv("TYPECAST_API_KEY"),
    params=params,
)

Smart Emotion (Context-Aware)

Let the AI automatically infer emotion from surrounding text:
from pipecat_typecast import (
    TypecastTTSService,
    TypecastInputParams,
    SmartPromptOptions,
)

params = TypecastInputParams(
    prompt_options=SmartPromptOptions(
        previous_text="I just got the best news ever!",   # max 2000 chars
        next_text="I can't wait to share this with everyone!",
    ),
)

tts = TypecastTTSService(
    aiohttp_session=session,
    api_key=os.getenv("TYPECAST_API_KEY"),
    params=params,
)

Preset Emotion

Manually choose from 7 emotions: Normal, Happy, Sad, Angry, Whisper, Tone Up, Tone Down. Best for consistent voice styling.

Smart Emotion

AI automatically detects the best emotion from text context. Best for natural conversations.

Parameter Reference

Parameter            Range            Description
emotion_preset       varies by voice  ssfm-v30: normal, happy, sad, angry, whisper, toneup, tonedown
emotion_intensity    0.0 - 2.0        Values > 1.0 increase expressiveness
audio_pitch          -12 to 12        Semitone adjustment
audio_tempo          0.5 - 2.0        Recommended: 0.85 - 1.15
volume               0 - 200          Audio volume as percentage
seed                 integer          Deterministic synthesis for identical text
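If you build parameter values dynamically (for example, from user settings), it can help to clamp them to the documented ranges before constructing TypecastInputParams. The helper below is our own sketch, not part of pipecat-ai-typecast; the ranges restate the table above:

```python
def clamp(value: float, low: float, high: float) -> float:
    """Clamp a numeric parameter into its documented range."""
    return max(low, min(high, value))

# Documented ranges from the parameter reference above.
RANGES = {
    "emotion_intensity": (0.0, 2.0),
    "audio_pitch": (-12, 12),
    "audio_tempo": (0.5, 2.0),
    "volume": (0, 200),
}

def sanitize(params: dict) -> dict:
    """Return a copy of params with each known field clamped to its range."""
    out = dict(params)
    for name, (low, high) in RANGES.items():
        if name in out:
            out[name] = clamp(out[name], low, high)
    return out
```

For example, sanitize({"audio_tempo": 3.0}) pulls the tempo back down to the 2.0 maximum instead of sending an out-of-range value to the API.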

Supported Transports

Pipecat supports multiple transport protocols. Typecast works with all of them:
For example, Daily provides WebRTC-based video and audio infrastructure:
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.transports.daily.transport import DailyParams

transport_params = DailyParams(
    audio_in_enabled=True,
    audio_out_enabled=True,
    vad_analyzer=SileroVADAnalyzer(),
)

Complete Example

Here’s a full working example that creates a voice AI agent:
import os
import aiohttp
from dotenv import load_dotenv

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.daily.transport import DailyParams, DailyTransport

from pipecat_typecast import TypecastTTSService

load_dotenv()

async def main():
    async with aiohttp.ClientSession() as session:
        # Initialize services
        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
        tts = TypecastTTSService(
            aiohttp_session=session,
            api_key=os.getenv("TYPECAST_API_KEY"),
        )

        # Set up conversation context
        messages = [
            {
                "role": "system",
                "content": "You are a helpful AI assistant. Keep responses concise.",
            },
        ]
        context = LLMContext(messages)
        context_aggregator = LLMContextAggregatorPair(context)

        # Configure transport
        transport = DailyTransport(
            room_url=os.getenv("DAILY_ROOM_URL"),
            token=os.getenv("DAILY_TOKEN"),
            params=DailyParams(
                audio_in_enabled=True,
                audio_out_enabled=True,
                vad_analyzer=SileroVADAnalyzer(),
            ),
        )

        # Build and run pipeline
        pipeline = Pipeline([
            transport.input(),
            stt,
            context_aggregator.user(),
            llm,
            tts,
            transport.output(),
            context_aggregator.assistant(),
        ])

        task = PipelineTask(pipeline, params=PipelineParams())
        runner = PipelineRunner()
        await runner.run(task)

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
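The example above loads its credentials with load_dotenv(), so it expects a .env file alongside the script. A matching file would define the five variables the code reads (all values below are placeholders):

```shell
DEEPGRAM_API_KEY=your-deepgram-key
OPENAI_API_KEY=your-openai-key
TYPECAST_API_KEY=your-typecast-key
DAILY_ROOM_URL=https://your-domain.daily.co/your-room
DAILY_TOKEN=your-daily-token
```

Keep this file out of version control, since it contains secrets.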

Legacy Model (ssfm-v21)

If you need to use the legacy ssfm-v21 model:
from pipecat_typecast import (
    TypecastTTSService,
    TypecastInputParams,
    PromptOptions,
)

params = TypecastInputParams(
    prompt_options=PromptOptions(
        emotion_preset="happy",      # normal | happy | sad | angry
        emotion_intensity=1.3,
    ),
)

tts = TypecastTTSService(
    aiohttp_session=session,
    api_key=os.getenv("TYPECAST_API_KEY"),
    model="ssfm-v21",
    params=params,
)
Note: ssfm-v21 supports fewer emotion presets (no whisper, toneup, tonedown).
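Since the set of valid presets differs by model, a small guard can catch unsupported combinations before a request is made. The mapping below restates the presets listed in this guide; the helper itself is our own sketch, not part of the package:

```python
# Emotion presets documented for each model in this guide.
SUPPORTED_PRESETS = {
    "ssfm-v30": {"normal", "happy", "sad", "angry", "whisper", "toneup", "tonedown"},
    "ssfm-v21": {"normal", "happy", "sad", "angry"},
}

def check_preset(model: str, preset: str) -> None:
    """Raise ValueError if the emotion preset is not supported by the model."""
    supported = SUPPORTED_PRESETS.get(model)
    if supported is None:
        raise ValueError(f"unknown model: {model!r}")
    if preset not in supported:
        raise ValueError(
            f"{preset!r} is not supported by {model}; choose one of {sorted(supported)}"
        )
```

Calling check_preset("ssfm-v21", "whisper") raises immediately, which is easier to debug than an API-side error.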

Troubleshooting

Authentication errors
  • Ensure the TYPECAST_API_KEY environment variable is set
  • Verify your key at the Typecast Dashboard
  • Check for extra spaces in the key
No audio output
  • Confirm your transport is configured with audio_out_enabled=True
  • Check that the TTS service is included in your pipeline
  • Verify your API key has sufficient credits
Audio quality issues
  • Adjust audio_tempo within the recommended range (0.85 - 1.15)
  • Try different emotion_intensity values
  • Ensure the sample rate matches your transport configuration
Installation issues
  • Make sure you installed pipecat-ai-typecast, not pipecat-typecast
  • Verify your Python version is 3.10 or higher
  • Check that your Pipecat version is v0.0.94 or later
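The environment and version checks above can be scripted as a quick preflight. The variable name and minimum Python version come from this guide; the function itself is our own sketch:

```python
import os
import sys

# Environment variables this integration requires (from the guide above).
REQUIRED_ENV = ["TYPECAST_API_KEY"]

def preflight(env: dict, python_version: tuple) -> list:
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    for name in REQUIRED_ENV:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value != value.strip():
            problems.append(f"{name} has leading/trailing whitespace")
    if python_version < (3, 10):
        problems.append("Python 3.10+ is required")
    return problems

if __name__ == "__main__":
    for problem in preflight(dict(os.environ), sys.version_info[:2]):
        print("WARN:", problem)
```

Run it before starting your agent to surface configuration mistakes early instead of mid-call.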

Resources

  • GitHub Repository: source code and examples
  • PyPI Package: install via pip
  • Pipecat Documentation: learn more about Pipecat
  • Voice Library: browse available voices