
Audio Models

Estimate API costs for speech-to-text, text-to-speech, realtime voice, and audio understanding separately from text-token costs, with estimates in USD and CNY.

OpenAI · GPT-Realtime-2

Input: $32 per 1M tokens
Output: $64 per 1M tokens
Cache read: $0.40 per 1M tokens

Google · Gemini 3 Audio

Input: $3 per 1M tokens
Output: $12 per 1M tokens

Google · Gemini 2.5 Flash Audio

Input: $0.50 per 1M tokens
Output: $1.50 per 1M tokens
Cache read: $0.05 per 1M tokens
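
The per-request estimate and the split between input, output, and cache-read cost can be sketched from the listed rates as below. This is a minimal illustration, not the page's own calculator: the names `Rates`, `request_cost`, and `summarize` are hypothetical, the exchange rate of 7.0 CNY per USD is a placeholder, and the sketch assumes cached tokens are counted separately from fresh input tokens and billed at the cache-read rate. No cache-read rate is listed for Gemini 3 Audio, so the sketch refuses to price cached tokens for it rather than guess one.

```python
from dataclasses import dataclass
from typing import Dict, Optional

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, copied from the listing above."""
    input_usd: float
    output_usd: float
    cache_read_usd: Optional[float] = None  # None where no rate is listed


RATES: Dict[str, Rates] = {
    "GPT-Realtime-2": Rates(32.0, 64.0, 0.4),
    "Gemini 3 Audio": Rates(3.0, 12.0),  # no cache-read rate listed
    "Gemini 2.5 Flash Audio": Rates(0.5, 1.5, 0.05),
}


def request_cost(rates: Rates, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> Dict[str, float]:
    """Return the USD cost of one request, broken down by component.

    Assumption: cached_tokens are NOT included in input_tokens.
    """
    if cached_tokens and rates.cache_read_usd is None:
        raise ValueError("no cache-read rate listed for this model")
    cache_rate = rates.cache_read_usd or 0.0
    return {
        "input": input_tokens * rates.input_usd / TOKENS_PER_MILLION,
        "output": output_tokens * rates.output_usd / TOKENS_PER_MILLION,
        "cache_read": cached_tokens * cache_rate / TOKENS_PER_MILLION,
    }


def summarize(parts: Dict[str, float], usd_to_cny: float) -> dict:
    """Total the components, convert to CNY, and compute percentage shares."""
    total = sum(parts.values())
    shares = {
        name: (cost / total * 100 if total else 0.0)
        for name, cost in parts.items()
    }
    return {"usd": total, "cny": total * usd_to_cny, "shares": shares}


if __name__ == "__main__":
    # Hypothetical usage: 10k input, 5k output, 20k cached tokens.
    parts = request_cost(RATES["GPT-Realtime-2"], 10_000, 5_000, 20_000)
    summary = summarize(parts, usd_to_cny=7.0)  # placeholder exchange rate
    print(f"${summary['usd']:.4f} (about ¥{summary['cny']:.4f}) per request")
    for name, pct in summary["shares"].items():
        print(f"{name}: {pct:.1f}%")
```

With no usage entered, every component is zero, so the estimate is $0.00 per request and each share is 0%. Swap in a current exchange rate and your own token counts before relying on the numbers.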