List Formats

Get all available audio output formats

The formats endpoint returns every supported audio output format, along with the content type each one is served as.

Endpoint

GET /v1/talk/formats

Authentication

Requires a do_live_* API key with the talk:formats scope.

Request Example

curl -H "Authorization: Bearer do_live_your_key_here" \
  "https://api.do.dev/v1/talk/formats"

Response

Success (200 OK)

{
  "formats": [
    {
      "id": "mp3",
      "contentType": "audio/mpeg"
    },
    {
      "id": "wav",
      "contentType": "audio/wav"
    },
    {
      "id": "flac",
      "contentType": "audio/flac"
    },
    {
      "id": "aac",
      "contentType": "audio/aac"
    },
    {
      "id": "opus",
      "contentType": "audio/opus"
    }
  ]
}
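As a rough sketch, the response above could be fetched and indexed by format ID in JavaScript. The helper names indexFormats and listFormats are illustrative and not part of the API, and the code assumes a runtime with a global fetch (for example, Node 18 or later) and an API key in the DO_API_KEY environment variable.

```javascript
// Build a Map from format ID to content type from a /v1/talk/formats response body.
function indexFormats(payload) {
  const index = new Map();
  for (const { id, contentType } of payload.formats ?? []) {
    index.set(id, contentType);
  }
  return index;
}

// Fetch the format list and index it (hypothetical helper; assumes a global fetch).
async function listFormats(apiKey = process.env.DO_API_KEY) {
  const response = await fetch("https://api.do.dev/v1/talk/formats", {
    headers: { "Authorization": `Bearer ${apiKey}` }
  });
  if (!response.ok) {
    throw new Error(`Formats request failed: ${response.status}`);
  }
  return indexFormats(await response.json());
}
```

With the index in hand, a lookup such as formats.get("mp3") yields the content type to expect when requesting that format.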

Format Reference

Format   Content-Type   Compression   Quality     Best For
mp3      audio/mpeg     Lossy         Good        Web delivery, broad compatibility
wav      audio/wav      None          Highest     Audio editing, archival
flac     audio/flac     Lossless      Highest     Archival, audiophile applications
aac      audio/aac      Lossy         Very Good   Apple devices, iOS apps
opus     audio/opus     Lossy         Excellent   WebRTC, real-time streaming

Choosing the Right Format

MP3 (Default)

  • Pros: Universal compatibility, small file size
  • Cons: Lossy compression
  • Use when: Serving audio on the web or in mobile apps, or when file size matters

WAV

  • Pros: Highest quality, no compression artifacts
  • Cons: Large file size
  • Use when: Audio editing, professional production, or when quality is critical

FLAC

  • Pros: Lossless compression, smaller than WAV
  • Cons: Less compatible than MP3
  • Use when: Archival, audiophile applications, or when you need lossless quality

AAC

  • Pros: Better quality than MP3 at the same bitrate
  • Cons: Less universal than MP3
  • Use when: iOS/macOS apps, Apple ecosystem

Opus

  • Pros: Best quality-to-size ratio, excellent for voice
  • Cons: Requires modern browser/player
  • Use when: WebRTC, real-time applications, modern web apps
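One way to apply the trade-offs above in code is to rank formats by preference and take the first one that is actually available. The sketch below is illustrative only: chooseFormat is a hypothetical helper, and the fallback to "mp3" simply mirrors MP3 being marked as the default.

```javascript
// Return the first preferred format ID that appears in the available list.
// `availableIds` is any iterable of format IDs, e.g. the ids from /v1/talk/formats.
function chooseFormat(availableIds, preferences, fallback = "mp3") {
  const available = new Set(availableIds);
  for (const id of preferences) {
    if (available.has(id)) return id;
  }
  return fallback; // nothing preferred is available, so use the default format
}

// Example: a real-time web app that prefers Opus, then AAC.
const chosen = chooseFormat(["mp3", "wav", "flac", "aac", "opus"], ["opus", "aac"]);
console.log(chosen); // "opus"
```

Keeping the preference order in the caller lets each client express its own constraints (an iOS app might list "aac" first) without hard-coding the server's format list.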

Sample Rates

You can specify a sample rate with the sampleRate parameter in the speech endpoint:

Sample Rate   Description
8000 Hz       Telephone quality
16000 Hz      Wideband voice
24000 Hz      Default, good balance
48000 Hz      Professional audio standard (above the 44100 Hz used for CDs)

Higher sample rates preserve more high-frequency detail but produce larger files. For uncompressed WAV, file size grows in direct proportion to the sample rate.
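To get a feel for the size cost, the arithmetic for uncompressed audio can be sketched as follows. This is an estimate under stated assumptions: 16-bit mono PCM and a 44-byte canonical WAV header. The bit depth and channel count the API actually produces are not documented here, and estimateWavBytes is a hypothetical helper.

```javascript
// Estimate the size in bytes of an uncompressed PCM WAV file.
// Assumes 16-bit mono by default; adjust bitDepth and channels if the output differs.
function estimateWavBytes(seconds, sampleRate, bitDepth = 16, channels = 1) {
  const WAV_HEADER_BYTES = 44; // canonical PCM WAV header size
  const bytesPerSecond = sampleRate * channels * (bitDepth / 8);
  return WAV_HEADER_BYTES + Math.round(seconds * bytesPerSecond);
}

console.log(estimateWavBytes(60, 24000)); // 2880044 (about 2.9 MB per minute)
console.log(estimateWavBytes(60, 48000)); // 5760044 (about 5.8 MB per minute)
```

Under these assumptions, doubling the sample rate from 24000 Hz to 48000 Hz doubles the audio payload. Lossy formats such as MP3 or Opus do not scale this simply, because their size is driven mainly by bitrate.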

Code Example

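// Request speech as 48 kHz WAV and return the audio as a Blob.
async function generateHighQualityAudio(text) {
  const response = await fetch("https://api.do.dev/v1/talk/speech", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.DO_API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      text,
      voice: "aria",
      format: "wav",      // uncompressed, highest quality
      sampleRate: 48000   // highest sample rate listed above
    })
  });

  // fetch() also resolves on HTTP error statuses, so check before reading the body.
  if (!response.ok) {
    throw new Error(`Speech request failed: ${response.status} ${response.statusText}`);
  }

  return response.blob();
}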