S2S vs Pipeline — voice agent cost calculator

Model your monthly voice agent bill by session length, volume, and architecture. Prices verified April 2026.

Inputs
Session length: 5 min
Monthly volume: 100k calls
S2S model: gpt-realtime
Pipeline tier: Standard

Results
S2S monthly: $190k ($0.38/min at 5 min)
Pipeline monthly: $50k ($0.10/min flat)
Difference: +$140k (S2S costs more, 3.8x ratio)

[Chart: S2S cost/min (solid) and pipeline cost/min (dashed), with your session marked]

At 5-minute sessions, gpt-realtime costs 3.8x as much per minute as the Standard pipeline ($0.38 vs. $0.10). At 100k calls/month that's $140k extra per month, or $1.68M per year.
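
The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration rather than the calculator's own code: the function name `monthly_bill` is hypothetical, and the rates are the per-minute figures quoted above.

```python
def monthly_bill(rate_per_min: float, session_min: float, sessions_per_month: int) -> float:
    """Monthly cost = per-minute rate x minutes per session x sessions per month."""
    return rate_per_min * session_min * sessions_per_month


# Figures quoted in the text for 5-minute sessions at 100k calls/month.
s2s = monthly_bill(0.38, 5, 100_000)       # gpt-realtime at 5 min
pipeline = monthly_bill(0.10, 5, 100_000)  # Standard pipeline, flat rate

difference = s2s - pipeline
ratio = s2s / pipeline
annual_difference = difference * 12

print(f"S2S ${s2s:,.0f}, pipeline ${pipeline:,.0f}")
print(f"difference ${difference:,.0f}/month, ${annual_difference:,.0f}/year, ratio {ratio:.1f}x")
```

Changing the session length only changes the S2S rate you pass in; the pipeline rate stays the same at any length.
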
How the numbers are calculated

Speech-to-speech (S2S) per-minute cost rises with session length because audio tokens from all prior turns are re-billed on every turn. This behavior is documented in Google's Vertex Live API spec and corroborated by OpenAI developer community reports. We model it with a quadratic-with-cap curve calibrated against published token rates and independent benchmarks: gpt-realtime runs roughly $0.24/min at 2 minutes and $1.00/min at 15 minutes, then levels off around $1.50/min for 30-minute sessions, where the context window limit is reached. Sources: OpenAI pricing page, Google AI pricing page.
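
To make the shape of this curve concrete, here is a minimal sketch that fits a quadratic through three of the per-minute figures quoted on this page (2, 5, and 15 minutes) and caps it at $1.50/min. The calculator's actual coefficients are not published here, so the fitted curve, the function names, and the use of Lagrange interpolation are all assumptions made for illustration.

```python
# Anchor points quoted in the text: (session minutes, $/min) for gpt-realtime.
ANCHORS = [(2.0, 0.24), (5.0, 0.38), (15.0, 1.00)]
CAP_PER_MIN = 1.50  # plateau quoted for long sessions


def quadratic_through(points, t):
    """Evaluate the unique quadratic through three points at t (Lagrange form)."""
    total = 0.0
    for i, (ti, yi) in enumerate(points):
        term = yi
        for j, (tj, _) in enumerate(points):
            if j != i:
                term *= (t - tj) / (ti - tj)
        total += term
    return total


def s2s_cost_per_min(session_min: float) -> float:
    """ASSUMED S2S curve: quadratic growth in session length, capped at CAP_PER_MIN."""
    return min(quadratic_through(ANCHORS, session_min), CAP_PER_MIN)


for minutes in (2, 5, 10, 15, 30):
    print(f"{minutes:>2} min: ${s2s_cost_per_min(minutes):.2f}/min")
```

By construction the curve passes through the three quoted points, including the $0.38/min figure at 5 minutes, and the cap holds the 30-minute value at $1.50/min.
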

Pipeline (ASR, then LLM, then TTS) per-minute cost is flat because streaming ASR and TTS are billed per audio minute, and only compact text context accumulates for the LLM. Tiers: Economy ~$0.05/min, Standard ~$0.10/min, Premium ~$0.18/min. These are based on Deepgram Nova-3 ASR pricing ($0.0077/min), GPT-4o-mini / Gemini Flash-Lite / GPT-4o text rates, and Aura-2 / ElevenLabs Flash TTS pricing.
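
Because the pipeline rate does not depend on session length, a tier lookup is enough to model it. The sketch below uses the approximate tier rates listed above; the dictionary and function names are hypothetical, and the scenario (5-minute sessions, 100k calls/month) is the one from the example at the top of the page.

```python
# Approximate flat per-minute rates for each pipeline tier, as listed in the text.
PIPELINE_RATE_PER_MIN = {"Economy": 0.05, "Standard": 0.10, "Premium": 0.18}


def pipeline_monthly(tier: str, session_min: float, sessions_per_month: int) -> float:
    """Flat-rate model: the per-minute price is the same at any session length."""
    return PIPELINE_RATE_PER_MIN[tier] * session_min * sessions_per_month


for tier in PIPELINE_RATE_PER_MIN:
    cost = pipeline_monthly(tier, session_min=5, sessions_per_month=100_000)
    print(f"{tier:<8} ${cost:,.0f}/month")
```
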

Excludes: telephony/SIP, orchestration platform markup, and network/transport costs. These apply to both architectures equally, so they leave the dollar difference between the two unchanged, although including them would narrow the ratio. The approximation is deliberate: this tool is for sizing, not invoicing.
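
As a quick check of how an excluded cost would affect the comparison, the sketch below adds the same per-minute overhead to both architectures. It assumes the overhead is billed per minute and is identical on both sides; the $0.01/min value is a made-up placeholder, not a quoted price.

```python
S2S_RATE = 0.38       # $/min, gpt-realtime at 5 min (from the text)
PIPELINE_RATE = 0.10  # $/min, Standard tier (from the text)
OVERHEAD = 0.01       # $/min, HYPOTHETICAL shared cost such as telephony


def compare(s2s: float, pipeline: float, overhead: float = 0.0):
    """Return (difference, ratio) per minute after adding a shared overhead."""
    a, b = s2s + overhead, pipeline + overhead
    return a - b, a / b


diff_without, ratio_without = compare(S2S_RATE, PIPELINE_RATE)
diff_with, ratio_with = compare(S2S_RATE, PIPELINE_RATE, OVERHEAD)

print(f"without overhead: +${diff_without:.2f}/min, {ratio_without:.2f}x")
print(f"with overhead:    +${diff_with:.2f}/min, {ratio_with:.2f}x")
```

The per-minute gap stays at $0.28 either way, while the ratio drops below 3.8x once the shared cost is included.
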

ConvoAI · Agora · April 2026