Audio Models
Plan speech-to-text, text-to-speech, realtime voice, and audio understanding API costs separately from text tokens, with USD/CNY estimates.
OpenAI · GPT-Realtime-2
Input $32/1M tokens
Output $64/1M tokens
Cache read $0.4/1M tokens
Google · Gemini 3 Audio
Input $3/1M tokens
Output $12/1M tokens
Google · Gemini 2.5 Flash Audio
Input $0.5/1M tokens
Output $1.5/1M tokens
Cache read $0.05/1M tokens
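As a rough illustration of how the per-1M-token rates above turn into a per-request estimate, here is a minimal Python sketch. The rates are copied from the listings above; everything else is an assumption made for illustration: the names `AudioPricing`, `PRICING`, and `estimate_request_cost` are hypothetical, cache-read tokens are assumed to be counted separately from regular input tokens, and the USD-to-CNY rate must be supplied by the caller because this page does not state one.

```python
from dataclasses import dataclass
from typing import Optional

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class AudioPricing:
    """USD rates per 1M tokens, as listed on this page."""
    input_per_m: float
    output_per_m: float
    cache_read_per_m: Optional[float] = None  # None: no cache-read rate listed


# Rates copied from the listings above; the dictionary itself is illustrative.
PRICING = {
    "GPT-Realtime-2": AudioPricing(32.0, 64.0, 0.4),
    "Gemini 3 Audio": AudioPricing(3.0, 12.0),
    "Gemini 2.5 Flash Audio": AudioPricing(0.5, 1.5, 0.05),
}


def estimate_request_cost(model, input_tokens, output_tokens,
                          cache_read_tokens=0, usd_to_cny=None):
    """Estimate the cost of one request in USD (and CNY if a rate is given).

    Assumption: cache_read_tokens are billed at the cache-read rate and are
    NOT also included in input_tokens.
    """
    p = PRICING[model]
    if cache_read_tokens and p.cache_read_per_m is None:
        raise ValueError(f"No cache-read rate listed for {model}")

    input_usd = input_tokens * p.input_per_m / TOKENS_PER_MILLION
    output_usd = output_tokens * p.output_per_m / TOKENS_PER_MILLION
    cache_usd = cache_read_tokens * (p.cache_read_per_m or 0.0) / TOKENS_PER_MILLION
    total_usd = input_usd + output_usd + cache_usd

    result = {
        "input_usd": input_usd,
        "output_usd": output_usd,
        "cache_read_usd": cache_usd,
        "total_usd": total_usd,
    }
    if usd_to_cny is not None:
        # The exchange rate is caller-supplied; none is given on this page.
        result["total_cny"] = total_usd * usd_to_cny
    return result


if __name__ == "__main__":
    est = estimate_request_cost("GPT-Realtime-2", 10_000, 2_000,
                                cache_read_tokens=50_000)
    print(f"Estimated cost: ${est['total_usd']:.4f}")
```

With these rates, a GPT-Realtime-2 request of 10,000 input tokens, 2,000 output tokens, and 50,000 cache-read tokens comes to $0.32 + $0.128 + $0.02 = $0.468. Passing a placeholder rate such as `usd_to_cny=7.0` would add a `total_cny` entry; substitute the current exchange rate before relying on the CNY figure.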