Timestamps & captions - Typecast Documentation

The CLI can call Typecast Timestamp TTS and save alignment data alongside the generated audio. Use this when an agent needs subtitles for Shorts, caption timing for social video, karaoke-style highlights, or lip-sync metadata.

Generate subtitles

# Save audio and SRT subtitles
cast "Hello, world. This is a test." \
  --out hello.wav \
  --timestamps-out hello.srt

# Save audio and WebVTT subtitles
cast "Hello, world. This is a test." \
  --out hello.wav \
  --timestamps-out hello.vtt \
  --timestamps-format vtt

When --timestamps-format is omitted, the CLI infers srt or vtt from the --timestamps-out file extension. For any other extension it falls back to json.
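The inference rule can be mirrored in a wrapper script, for example to predict which format a given output path will produce before calling the CLI. The following is a minimal sketch, not the CLI's actual implementation: the function name `infer_timestamps_format` is hypothetical, and it assumes extension matching is case-sensitive, which this page does not state.

```shell
# Hypothetical helper mirroring the documented inference rule.
# $1: path passed to --timestamps-out
# $2: optional explicit value for --timestamps-format
infer_timestamps_format() {
  out_path=$1
  explicit=${2:-}

  # An explicit format always wins over inference.
  if [ -n "$explicit" ]; then
    printf '%s\n' "$explicit"
    return 0
  fi

  # Otherwise infer from the extension; anything else falls back to json.
  # Assumption: matching is case-sensitive (not confirmed by the docs).
  case "$out_path" in
    *.srt) printf 'srt\n' ;;
    *.vtt) printf 'vtt\n' ;;
    *)     printf 'json\n' ;;
  esac
}
```

With this sketch, `infer_timestamps_format hello.vtt` prints `vtt`, and `infer_timestamps_format hello.txt` prints `json`.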

Save raw timestamp JSON

cast "Hello, world. This is a test." \
  --out hello.wav \
  --timestamps-out hello.timestamps.json

JSON is useful when another tool will create captions, animate text, or align visuals manually.
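A downstream tool that builds its own cues from the JSON needs to format time offsets itself. The sketch below shows one way to do that in POSIX shell. It assumes offsets are already available as integer milliseconds; the JSON schema is not shown on this page, so extracting and converting the values is left out. The helper name `ms_to_cue` is hypothetical. SRT separates milliseconds with a comma, while WebVTT uses a period.

```shell
# Hypothetical helper: format a millisecond offset as a cue timestamp.
# $1: offset in integer milliseconds (assumed input unit)
# $2: "srt" (default, HH:MM:SS,mmm) or "vtt" (HH:MM:SS.mmm)
ms_to_cue() {
  ms=$1
  style=${2:-srt}

  # SRT uses a comma before the milliseconds; WebVTT uses a period.
  sep=','
  if [ "$style" = "vtt" ]; then
    sep='.'
  fi

  printf '%02d:%02d:%02d%s%03d\n' \
    $((ms / 3600000)) \
    $((ms / 60000 % 60)) \
    $((ms / 1000 % 60)) \
    "$sep" \
    $((ms % 1000))
}
```

For example, `ms_to_cue 3723456` prints `01:02:03,456`, and `ms_to_cue 3723456 vtt` prints `01:02:03.456`.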

Choose granularity

cast "Hello, world." \
  --out hello.wav \
  --timestamps-out hello.srt \
  --granularity both

For languages without whitespace between words, such as Japanese (jpn) or Chinese (zho), use character-level timestamps for usable subtitle timing:

cast "こんにちは。世界。" \
  --language jpn \
  --out hello.wav \
  --timestamps-out hello.srt

Caption workflow for agents

Create narration audio and captions from script.txt.
Use the CLI.
Write audio to ./video/voiceover.wav.
Write subtitles to ./video/voiceover.srt.
Keep the subtitle file next to the audio file.
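An agent following the prompt above might end up running something like the sketch below. It uses only the flags shown on this page. The function name `narrate_with_captions` is hypothetical, and passing the whole script file as the positional text argument is an assumption; the examples here only show short inline strings.

```shell
# Hypothetical wrapper: narrate a script file and write SRT captions
# next to the audio file.
# $1: path to the script text file
# $2: output directory
# $3: optional base name for both files (default: voiceover)
narrate_with_captions() {
  script_file=$1
  out_dir=$2
  base=${3:-voiceover}

  # Fail early if the script file is missing or unreadable.
  if [ ! -r "$script_file" ]; then
    echo "cannot read $script_file" >&2
    return 1
  fi

  mkdir -p "$out_dir" || return 1

  # Audio and captions come from the same synthesis call, so the timing
  # matches. Assumption: cast accepts the full script as positional text.
  cast "$(cat "$script_file")" \
    --out "$out_dir/$base.wav" \
    --timestamps-out "$out_dir/$base.srt"
}
```

Calling `narrate_with_captions script.txt ./video` would write `./video/voiceover.wav` and `./video/voiceover.srt`.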

Output choices

Output	Use when
`.srt`	Video editors, Shorts/Reels/TikTok caption import
`.vtt`	Web video players and browser-based previews
`.json`	Custom rendering, karaoke highlights, lip-sync, downstream automation

For social video, generate captions in the same step as the audio. This keeps the narration and the subtitle timing tied to the same synthesis result.

