AI API Cost Calculator
Estimate OpenAI, Claude, Gemini and DeepSeek API costs before you launch. Model prompt caching, compare mainstream frontier and low-cost models, test different request volumes, and turn token usage into a practical monthly budget your team can review before shipping.
Choose a calculator by model type
Reasoning, text, audio, image, and video models use different billing patterns. Separate pages keep official units, cache pricing, and multimodal cost assumptions visible.
Reasoning Models
Estimate agents, coding, reasoning, and long-context costs.
Open pageText Models
Estimate chat, summaries, translation, RAG, and batch text costs.
Open pageAudio Models
Estimate speech, realtime voice, and audio understanding costs.
Open pageImage Models
Estimate image generation, vision, and multimodal image costs.
Open pageVideo Models
Estimate video understanding and multimodal media costs.
Open pageAI Model Pricing
Full pricing table with per-token rates and cache pricing. Updated from official sources.
Latest guides
Guides for token budgeting, cache strategy, model selection, and cost optimization.
7 Practical Ways to Reduce AI API Costs
Reduce AI API costs with seven practical methods: shorten context, control output length, use caching, route models, batch offline work, set quotas, and monitor abnormal requests.
Read guideAI App Token Budget Template: What to Fill Before Launch
Use a practical token budget template to estimate request volume, input tokens, output tokens, cache ratio, model pricing, and safety margin before launching an AI application.
Read guideHow to Choose a Low-Cost AI Model Without Losing Quality
Choose a low-cost AI model by comparing task type, input and output length, context requirements, cache support, and failure cost across Claude, GPT, Gemini, DeepSeek, and similar providers.
Read guideFAQ
Are these prices accurate?
Prices are verified against official provider pricing pages. DeepSeek prices include the current 75% discount promotion. Always confirm with official dashboards for production budgets.
Does this send my prompts anywhere?
No. All calculations run in your browser. Your token counts and prompts never leave your device.
What does "cache hit" mean?
Cache hit tokens are input tokens that match a previously cached prefix. Providers charge much less for cache hits — for example, DeepSeek charges ¥0.025/1M for cache hits vs ¥3/1M for cache misses.
Why do different models show different numbers of bars?
Each provider has a different pricing model. DeepSeek has 3 components (miss/hit/output). Anthropic adds a separate cache write cost. The bars reflect each model's actual billing structure.
How do I report incorrect pricing?
Use the Report Pricing Error page to flag incorrect prices. We update data from official sources and welcome corrections.