Scope
This comparison covers 17 leading AI model APIs; the table below highlights seven of them. Pricing data is sourced from official provider pricing pages.
| Provider | Model | Input ($/1M tokens) | Output ($/1M tokens) | Cached Input ($/1M tokens) |
|---|---|---|---|---|
| DeepSeek | V4 Pro | $0.14 | $0.28 | $0.014 |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 |
| Anthropic | Claude Opus 4.7 | $10.00 | $50.00 | $1.00 |
| OpenAI | GPT-5.4 Mini | $0.075 | $0.225 | — |
| OpenAI | GPT-5.4 | $4.00 | $12.00 | — |
| Google | Gemini 2.5 Flash | $0.20 | $0.60 | — |
| Google | Gemini 2.5 Pro | $2.00 | $6.00 | — |
Prices in USD. For CNY pricing, switch to the Chinese version. See all 17 models in the full pricing table.
Best Value: DeepSeek V4 Pro
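DeepSeek V4 Pro delivers top-tier performance at a fraction of the cost: its input price is just 4.7% of Claude Sonnet 4.6's ($0.14 vs. $3.00 per 1M tokens), and its output price is 1.9% ($0.28 vs. $15.00). With caching enabled, cached input drops further to $0.014/1M, a tenth of the standard input price, making it extremely cost-effective for workloads that reuse long prompts.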
Best for: High-frequency calls, cost-sensitive production environments, Chinese content generation.
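The percentages above follow directly from the list prices. A minimal sketch of that arithmetic, using the prices from the pricing table (the helper name `price_ratio_percent` is hypothetical, chosen for illustration):

```python
# Per-1M-token list prices in USD, copied from the pricing table above.
PRICES = {
    "DeepSeek V4 Pro": {"input": 0.14, "output": 0.28},
    "Claude Sonnet 4.6": {"input": 3.00, "output": 15.00},
}


def price_ratio_percent(cheaper: str, reference: str, kind: str) -> float:
    """Return the cheaper model's price as a percentage of the reference model's."""
    return round(100 * PRICES[cheaper][kind] / PRICES[reference][kind], 1)


input_pct = price_ratio_percent("DeepSeek V4 Pro", "Claude Sonnet 4.6", "input")
output_pct = price_ratio_percent("DeepSeek V4 Pro", "Claude Sonnet 4.6", "output")
print(f"input: {input_pct}% of Sonnet, output: {output_pct}% of Sonnet")
```

The same helper works for any other pair of models once their prices are added to `PRICES`.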
Balanced Choice: Claude Sonnet 4.6
Claude Sonnet 4.6 is one of the strongest all-around models today, excelling at code generation, reasoning, and long-context processing. It costs about 21x more than DeepSeek V4 Pro on input and about 54x more on output, but it’s 70% cheaper than Opus 4.7, making it the sweet spot for most projects.
Best for: Coding assistants, complex reasoning tasks, applications where quality matters.
Budget Pick: GPT-5.4 Mini
GPT-5.4 Mini comes in at just $0.075/1M input, the lowest uncached input price among all models compared. Output is $0.225/1M. While not as capable as flagship models, it’s more than sufficient for simple tasks like classification, summarization, and translation.
Best for: Large-scale text processing, simple classification, cost-first batch jobs.
How Much Can Caching Save?
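Assuming 1,000 requests/day at 20K input + 5K output tokens each (20M input and 5M output tokens per day), with costs computed in USD from the list prices above. Caching only discounts input tokens: at a 90% hit rate the input bill falls by about 81%, while the output cost is unchanged.
| Model | No Cache ($/day) | 90% Hit Rate ($/day) | Savings |
|---|---|---|---|
| DeepSeek V4 Pro | $4.20 | $1.93 | 54% |
| Claude Sonnet 4.6 | $135.00 | $86.40 | 36% |
| GPT-5.4 Mini | $2.63 | $2.63 | No caching |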
The OpenAI models in this comparison list no cached-input price, which narrows their cost advantage in high-frequency, cache-friendly scenarios.
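The arithmetic behind a caching estimate like this can be sketched as follows. This is a minimal illustration in USD using the list prices from the pricing table above. It assumes cache hits are billed at the cached-input rate, misses at the standard input rate, and output tokens are never discounted; any cache-write surcharge a provider may apply is ignored. The function name `daily_cost_usd` and its parameters are hypothetical.

```python
def daily_cost_usd(requests, input_tokens, output_tokens,
                   input_price, output_price,
                   cached_price=None, hit_rate=0.0):
    """Estimate daily spend in USD. All prices are per 1M tokens.

    Assumption: a fraction `hit_rate` of input tokens is billed at
    `cached_price`; the rest is billed at `input_price`.
    """
    total_in = requests * input_tokens / 1_000_000    # millions of input tokens/day
    total_out = requests * output_tokens / 1_000_000  # millions of output tokens/day
    if cached_price is None:
        # No cached-input price listed: every input token pays full price.
        cached_price, hit_rate = input_price, 0.0
    blended_input_price = hit_rate * cached_price + (1 - hit_rate) * input_price
    return total_in * blended_input_price + total_out * output_price


# Workload from the text: 1,000 requests/day, 20K input + 5K output tokens each.
workload = dict(requests=1_000, input_tokens=20_000, output_tokens=5_000)

deepseek_plain = daily_cost_usd(**workload, input_price=0.14, output_price=0.28)
deepseek_cached = daily_cost_usd(**workload, input_price=0.14, output_price=0.28,
                                 cached_price=0.014, hit_rate=0.9)
sonnet_plain = daily_cost_usd(**workload, input_price=3.00, output_price=15.00)
sonnet_cached = daily_cost_usd(**workload, input_price=3.00, output_price=15.00,
                               cached_price=0.30, hit_rate=0.9)
mini_plain = daily_cost_usd(**workload, input_price=0.075, output_price=0.225)

for name, plain, cached in [
    ("DeepSeek V4 Pro", deepseek_plain, deepseek_cached),
    ("Claude Sonnet 4.6", sonnet_plain, sonnet_cached),
    ("GPT-5.4 Mini", mini_plain, mini_plain),
]:
    savings = 100 * (1 - cached / plain)
    print(f"{name:<18} ${plain:>7.2f} -> ${cached:>6.2f}  ({savings:.0f}% saved)")
```

Because output tokens are not discounted, the overall saving depends on the input-to-output mix: the more output-heavy the workload, the smaller the benefit from caching.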
Which Model Should You Choose?
| Your Need | Recommended Model |
|---|---|
| Cost first, lots of simple tasks | GPT-5.4 Mini |
| Chinese scenarios, best value | DeepSeek V4 Pro |
| Code/reasoning, need quality | Claude Sonnet 4.6 |
| Maximum capability, budget no issue | Claude Opus 4.7 |
| Multimodal (image + text) | GPT-5.4 / Gemini 2.5 Pro |
Verify with Your Own Numbers
Everyone’s usage pattern is different. The numbers above are just a reference. Open the text model calculator and plug in your actual token volumes to see which model fits your use case.
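For a quick offline estimate, a rough sketch like the one below ranks the models from the pricing table by daily cost for a given token volume. It is not the calculator's implementation: the function name `rank_by_cost` is hypothetical, the example workload is an assumption, and caching is ignored.

```python
# Per-1M-token list prices in USD (input, output), copied from the pricing table above.
PRICES = {
    "DeepSeek V4 Pro": (0.14, 0.28),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.7": (10.00, 50.00),
    "GPT-5.4 Mini": (0.075, 0.225),
    "GPT-5.4": (4.00, 12.00),
    "Gemini 2.5 Flash": (0.20, 0.60),
    "Gemini 2.5 Pro": (2.00, 6.00),
}


def rank_by_cost(input_millions: float, output_millions: float):
    """Return (model, cost_usd) pairs sorted from cheapest to most expensive."""
    costs = {
        model: input_millions * in_price + output_millions * out_price
        for model, (in_price, out_price) in PRICES.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Example workload (an assumption): 20M input and 5M output tokens per day.
ranking = rank_by_cost(input_millions=20, output_millions=5)
for model, cost in ranking:
    print(f"{model:<20} ${cost:,.2f}/day")
```

Replace the two arguments with your own daily or monthly volumes, in millions of tokens, to see how the ordering shifts as the input-to-output ratio changes.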