AI Model Pricing

Compare official AI model pricing for reasoning, text, audio, image, and video APIs. Toggle CNY/USD and view source links, update dates, and billing notes.

Prices are planning estimates only. Always verify with official provider dashboards before making budget decisions.

OpenAI
GPT-5.5
Input ¥36.00/1M tokens
Output ¥216.00/1M tokens
Cache read ¥3.60/1M tokens

Standard processing rates for contexts under 270K tokens. Batch processing is 50% cheaper; data residency adds 10%.

2026-05-28
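As a rough illustration of how the note's modifiers could feed a budget estimate, the sketch below applies the batch discount and the data-residency surcharge to the GPT-5.5 card rates. The card does not say how the two modifiers combine; multiplying them is an assumption, and the function name and parameters are hypothetical.

```python
# GPT-5.5 card rates, in CNY per 1M tokens (taken from the card above).
INPUT_RATE = 36.00
OUTPUT_RATE = 216.00


def estimate_cost(input_tokens, output_tokens, batch=False, data_residency=False):
    """Estimate the cost of a workload in CNY at standard-context rates.

    Assumption: the batch discount (-50%) and the data-residency
    surcharge (+10%) multiply; the card does not state how they stack.
    """
    cost = (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000
    if batch:
        cost *= 0.5
    if data_residency:
        cost *= 1.1
    return cost


standard = estimate_cost(2_000_000, 500_000)
batched = estimate_cost(2_000_000, 500_000, batch=True)
print(f"standard: CNY {standard:.2f}, batch: CNY {batched:.2f}")
```

The cache-read rate is left out to keep the sketch short; a fuller estimate would bill cached input tokens separately.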
OpenAI
GPT-5.4
Input ¥18.00/1M tokens
Output ¥108.00/1M tokens
Cache read ¥1.80/1M tokens

Standard processing rates for contexts under 270K tokens. Batch processing is 50% cheaper; data residency adds 10%.

2026-05-28

OpenAI
GPT-5.4 mini
Input ¥5.40/1M tokens
Output ¥32.40/1M tokens
Cache read ¥0.5400/1M tokens

Standard processing rates for contexts under 270K tokens. Batch processing is 50% cheaper; data residency adds 10%.

2026-05-28

Anthropic
Claude Opus 4.7
Input ¥36.00/1M tokens
Output ¥180.00/1M tokens
Cache write ¥45.00/1M tokens
Cache read ¥3.60/1M tokens

In steady state, cached input is billed at the cache-read rate; the first cold request pays the cache-write rate. Batch processing is 50% cheaper.

2026-05-28
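The cold-versus-steady-state distinction in the note can be made concrete with a small sketch using the Claude Opus 4.7 card rates. It assumes a fixed prompt prefix is resent on every request and that the cache stays warm after the first write (no expiry); the function name and parameters are hypothetical.

```python
# Claude Opus 4.7 card rates, in CNY per 1M tokens (taken from the card above).
INPUT = 36.00
CACHE_WRITE = 45.00
CACHE_READ = 3.60


def prefix_cost(prefix_tokens, requests, cached=True):
    """Cost in CNY of sending the same prompt prefix `requests` times.

    Assumption: with caching, the first request pays the cache-write
    rate and every later request pays the cache-read rate.
    """
    if requests <= 0:
        return 0.0
    if not cached:
        return prefix_tokens * requests * INPUT / 1_000_000
    first = prefix_tokens * CACHE_WRITE
    rest = prefix_tokens * (requests - 1) * CACHE_READ
    return (first + rest) / 1_000_000


for n in (1, 2, 10):
    print(n, round(prefix_cost(100_000, n, cached=False), 2),
          round(prefix_cost(100_000, n, cached=True), 2))
```

Under these assumptions a single request is more expensive with caching (the write rate exceeds the plain input rate), and the saving appears from the second request onward.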
Anthropic
Claude Sonnet 4.6
Input ¥21.60/1M tokens
Output ¥108.00/1M tokens
Cache write ¥27.00/1M tokens
Cache read ¥2.16/1M tokens

In steady state, cached input is billed at the cache-read rate; the first cold request pays the cache-write rate. Batch processing is 50% cheaper.

2026-05-28

Anthropic
Claude Haiku 4.5
Input ¥7.20/1M tokens
Output ¥36.00/1M tokens
Cache write ¥9.00/1M tokens
Cache read ¥0.7200/1M tokens

In steady state, cached input is billed at the cache-read rate; the first cold request pays the cache-write rate. Batch processing is 50% cheaper.

2026-05-28

DeepSeek
DeepSeek V4 Pro
Input ¥1.01/1M tokens
Output ¥2.02/1M tokens
Cache read ¥0.0202/1M tokens

Standard (non-promotional) pricing. The off-peak and 2.5折 (75% off) promotions tied to the DeepSeek V3→V4 upgrade are not included.

2026-05-28

Mistral
Mistral Medium 3.5
Input ¥10.80/1M tokens
Output ¥54.00/1M tokens

Official Mistral pricing lists Mistral Medium 3.5 at $1.50 input / $7.50 output per 1M tokens with model id mistral-medium-latest.

2026-05-28
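The card shows CNY while the note quotes USD. The two sets of figures are consistent with a fixed rate of 7.2 CNY per USD (10.80 / 1.50). That rate is inferred from the listed numbers, not stated by the source, so the sketch below treats it as an assumption.

```python
from decimal import Decimal

# Assumption: 7.2 CNY per USD, inferred from the card (10.80 / 1.50),
# not a rate published by the page or by Mistral.
ASSUMED_CNY_PER_USD = Decimal("7.2")


def usd_to_cny(usd):
    """Convert a USD price string to CNY, rounded to two decimals."""
    return (Decimal(usd) * ASSUMED_CNY_PER_USD).quantize(Decimal("0.01"))


print(usd_to_cny("1.50"), usd_to_cny("7.50"))
```

Decimal arithmetic is used so that prices quoted to the cent convert without binary floating-point noise.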
Mistral
Mistral Large 3
Input ¥3.60/1M tokens
Output ¥10.80/1M tokens

Official Mistral pricing lists Mistral Large 3 at $0.50 input / $1.50 output per 1M tokens with model id mistral-large-latest.

2026-05-28

Mistral
Mistral Small 4
Input ¥0.7200/1M tokens
Output ¥2.16/1M tokens

Official Mistral pricing lists Mistral Small 4 at $0.10 input / $0.30 output per 1M tokens with model id mistral-small-latest.

2026-05-28

Mistral
Magistral Medium
Input ¥14.40/1M tokens
Output ¥36.00/1M tokens

Official Mistral pricing lists Magistral Medium at $2.00 input / $5.00 output per 1M tokens with model id magistral-medium-latest.

2026-05-28

Mistral
Magistral Small
Input ¥3.60/1M tokens
Output ¥10.80/1M tokens

Official Mistral pricing lists Magistral Small at $0.50 input / $1.50 output per 1M tokens with model id magistral-small-latest.

2026-05-28

Mistral
Devstral 2
Input ¥2.88/1M tokens
Output ¥14.40/1M tokens

Official Mistral pricing lists Devstral 2 at $0.40 input / $2.00 output per 1M tokens with model id devstral-medium-latest.

2026-05-28

Mistral
Devstral Small 2
Input ¥0.7200/1M tokens
Output ¥2.16/1M tokens

Official Mistral pricing lists Devstral Small 2 at $0.10 input / $0.30 output per 1M tokens with model id devstral-small-latest.

2026-05-28

Mistral
Codestral
Input ¥2.16/1M tokens
Output ¥6.48/1M tokens

Official Mistral pricing lists Codestral at $0.30 input / $0.90 output per 1M tokens with model id codestral-latest.

2026-05-28

Mistral
Ministral 3 3B
Input ¥0.7200/1M tokens
Output ¥0.7200/1M tokens

Official Mistral pricing lists Ministral 3 - 3B at $0.10 input / $0.10 output per 1M tokens with model id ministral-3b-latest.

2026-05-28

Mistral
Ministral 3 8B
Input ¥1.08/1M tokens
Output ¥1.08/1M tokens

Official Mistral pricing lists Ministral 3 - 8B at $0.15 input / $0.15 output per 1M tokens with model id ministral-8b-latest.

2026-05-28

MiniMax
MiniMax-M2.7
Input ¥2.10/1M tokens
Output ¥8.40/1M tokens
Cache write ¥2.63/1M tokens
Cache read ¥0.4200/1M tokens

Official MiniMax pay-as-you-go pricing lists MiniMax-M2.7 at ¥2.1 input / ¥8.4 output / ¥0.42 cache read / ¥2.625 cache write per 1M tokens.

2026-05-28
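The card displays the official ¥2.625 cache-write price as ¥2.63 and the ¥0.42 cache-read price as ¥0.4200. A display rule consistent with these cards is sketched below: four decimals for prices under 1, two decimals otherwise, with halves rounded up. The rule is inferred from the cards and is not documented by the page; the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_price(value):
    """Format a price the way the cards appear to display it.

    Assumption (inferred, not documented): prices below 1 show four
    decimals, all others show two, and ties round half up.
    """
    amount = Decimal(str(value))
    step = Decimal("0.0001") if amount < 1 else Decimal("0.01")
    return str(amount.quantize(step, rounding=ROUND_HALF_UP))


print(format_price("2.625"), format_price("0.42"), format_price("8.4"))
```

Half-up rounding is chosen deliberately: Python's built-in float formatting rounds exact ties to even, so `f"{2.625:.2f}"` yields 2.62 rather than the 2.63 shown on the card.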
MiniMax
MiniMax-M2.7-highspeed
Input ¥4.20/1M tokens
Output ¥16.80/1M tokens
Cache write ¥2.63/1M tokens
Cache read ¥0.4200/1M tokens

Official MiniMax pay-as-you-go pricing lists MiniMax-M2.7-highspeed at ¥4.2 input / ¥16.8 output / ¥0.42 cache read / ¥2.625 cache write per 1M tokens.

2026-05-28

MiniMax
MiniMax-M2.5
Input ¥2.10/1M tokens
Output ¥8.40/1M tokens
Cache write ¥2.63/1M tokens
Cache read ¥0.2100/1M tokens

Official MiniMax pay-as-you-go pricing lists MiniMax-M2.5 at ¥2.1 input / ¥8.4 output / ¥0.21 cache read / ¥2.625 cache write per 1M tokens.

2026-05-28

MiniMax
MiniMax-M2.5-highspeed
Input ¥4.20/1M tokens
Output ¥16.80/1M tokens
Cache write ¥2.63/1M tokens
Cache read ¥0.2100/1M tokens

Official MiniMax pay-as-you-go pricing lists MiniMax-M2.5-highspeed at ¥4.2 input / ¥16.8 output / ¥0.21 cache read / ¥2.625 cache write per 1M tokens.

2026-05-28

MiniMax
M2-her
Input ¥2.10/1M tokens
Output ¥8.40/1M tokens

Official MiniMax pay-as-you-go pricing lists M2-her at ¥2.1 input / ¥8.4 output per 1M tokens; cache read/write fields are not listed for this row.

2026-05-28

Zhipu
GLM-5.1
Input ¥6.00/1M tokens
Output ¥24.00/1M tokens
Cache read ¥1.30/1M tokens

Official Zhipu API pricing lists GLM-5.1 tiered by input length; the first tier is used as the card default, and the tiers field stores the full official schedule.

2026-05-28
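One way to picture the relationship between the card default and the tiers field is the sketch below. The record shape, field names, and functions are hypothetical. Only the first tier's prices (6.00 / 24.00) come from the GLM-5.1 card; the tier boundary and the second tier are placeholders and do not reflect Zhipu's published schedule.

```python
# Hypothetical record shape for a tiered model. Only the first tier's
# prices come from the card; the 32_000 boundary and the second tier
# are PLACEHOLDERS, not Zhipu's real schedule.
DEMO_TIERS = [
    {"max_input_tokens": 32_000, "input": 6.00, "output": 24.00},
    {"max_input_tokens": None, "input": 8.00, "output": 28.00},  # placeholder
]


def card_default(tiers):
    """The card displays the first tier as its default price."""
    return tiers[0]


def pick_tier(tiers, input_tokens):
    """Return the first tier whose input-length limit covers the request.

    Tiers are assumed sorted by ascending limit; None means no upper limit.
    """
    for tier in tiers:
        limit = tier["max_input_tokens"]
        if limit is None or input_tokens <= limit:
            return tier
    raise ValueError("input length exceeds every tier")


print(card_default(DEMO_TIERS)["input"], pick_tier(DEMO_TIERS, 50_000)["input"])
```

For budgeting, the practical point is that the card price only applies to requests that fall inside the first tier; longer inputs need the full schedule from the provider.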
Zhipu
GLM-5-Turbo
Input ¥5.00/1M tokens
Output ¥22.00/1M tokens
Cache read ¥1.20/1M tokens

Official Zhipu API pricing lists GLM-5-Turbo tiered by input length; the first tier is used as the card default, and the tiers field stores the full official schedule.

2026-05-28

Zhipu
GLM-5
Input ¥4.00/1M tokens
Output ¥18.00/1M tokens
Cache read ¥1.00/1M tokens

Official Zhipu API pricing lists GLM-5 tiered by input length; the first tier is used as the card default, and the tiers field stores the full official schedule.

2026-05-28

Qwen
qwen3.7-max
Input ¥12.00/1M tokens
Output ¥36.00/1M tokens

Official Alibaba Cloud Model Studio pricing lists qwen3.7-max at ¥12 input / ¥36 output per 1M tokens for the 0<Token≤1M tier.

2026-05-28

Qwen
qwen3-max
Input ¥2.50/1M tokens
Output ¥10.00/1M tokens

Official Alibaba Cloud Model Studio pricing lists qwen3-max with tiered token pricing; the first tier is used as the card default, and the tiers field stores the full official schedule.

2026-05-28

Qwen
qwen-turbo
Input ¥0.3000/1M tokens
Output ¥0.6000/1M tokens

Official Alibaba Cloud Model Studio pricing lists qwen-turbo at ¥0.30 input / ¥0.60 output per 1M tokens for China mainland deployment.

2026-05-28

Qwen
qwen-plus
Input ¥0.8000/1M tokens
Output ¥2.00/1M tokens

Official Alibaba Cloud Model Studio pricing lists qwen-plus at ¥0.80 input / ¥2.00 output per 1M tokens for the 0-128K tier in China mainland deployment.

2026-05-28

Qwen
qwen-max
Input ¥2.40/1M tokens
Output ¥9.60/1M tokens

Official Alibaba Cloud Model Studio pricing lists qwen-max at ¥2.40 input / ¥9.60 output per 1M tokens in China mainland deployment.

2026-05-28

Moonshot
kimi-k2.6
Input ¥6.50/1M tokens
Output ¥27.00/1M tokens
Cache read ¥1.10/1M tokens

Official Kimi pricing lists kimi-k2.6 at ¥1.10 cache-hit input / ¥6.50 cache-miss input / ¥27.00 output per 1M tokens.

2026-05-28
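Because Kimi bills input at two rates, the effective input price depends on how much of the prompt hits the cache. The sketch below blends the two kimi-k2.6 rates by a cache-hit ratio; the ratio is a hypothetical parameter (the fraction of input tokens served from cache), not something the pricing page reports.

```python
# kimi-k2.6 card rates, in CNY per 1M tokens (taken from the card above).
CACHE_HIT_INPUT = 1.10
CACHE_MISS_INPUT = 6.50


def effective_input_rate(hit_ratio):
    """Blend the cache-hit and cache-miss input rates.

    hit_ratio is the assumed fraction of input tokens served from
    cache, between 0.0 (all misses) and 1.0 (all hits).
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * CACHE_HIT_INPUT + (1.0 - hit_ratio) * CACHE_MISS_INPUT


for ratio in (0.0, 0.5, 0.8, 1.0):
    print(ratio, round(effective_input_rate(ratio), 2))
```

The card's Input figure corresponds to the all-miss case, so it is the upper bound on input cost for this model.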
Moonshot
kimi-k2.5
Input ¥4.00/1M tokens
Output ¥21.00/1M tokens
Cache read ¥0.7000/1M tokens

Official Kimi pricing lists kimi-k2.5 at ¥0.70 cache-hit input / ¥4.00 cache-miss input / ¥21.00 output per 1M tokens.

2026-05-28

Moonshot
moonshot-v1-32k
Input ¥5.00/1M tokens
Output ¥20.00/1M tokens

Official Moonshot pricing lists moonshot-v1-32k at ¥5.00 input / ¥20.00 output per 1M tokens.

2026-05-28

OpenAI
GPT-Realtime-2
Input ¥230.40/1M tokens
Output ¥460.80/1M tokens
Cache read ¥2.88/1M tokens

Audio token pricing for voice interactions. Text modality billed at separate text rates.

2026-05-28

Google
Gemini 3 Audio
Input ¥21.60/1M tokens
Output ¥86.40/1M tokens

Paid-tier Gemini 3 audio token pricing. Official page also lists per-minute equivalents for some audio workloads.

2026-05-28

Google
Gemini 2.5 Flash Audio
Input ¥3.60/1M tokens
Output ¥10.80/1M tokens
Cache read ¥0.3600/1M tokens

Gemini 2.5 Flash audio input pricing, with output billed at the model output-token rate.

2026-05-28

Zhipu
GLM-4-Voice
Input ¥80.00/1M tokens
Output ¥80.00/1M tokens

Official Zhipu pricing lists GLM-4-Voice audio at ¥80 per 1M tokens; the card keeps the official token unit without converting to minutes.

2026-05-28

OpenAI
GPT-Image-2
Input ¥57.60/1M tokens
Output ¥216.00/1M tokens
Cache read ¥14.40/1M tokens

Image token pricing for GPT-Image generation. Text prompts are billed at separate text-token rates.

2026-05-28

Google
Gemini 3 Image/Video
Input ¥7.20/1M tokens
Output ¥32.40/1M tokens

Paid-tier Gemini 3 image/video input pricing with output billed at the text output-token rate.

2026-05-28

Google
Gemini 2.5 Flash Image/Video
Input ¥1.80/1M tokens
Output ¥10.80/1M tokens
Cache read ¥0.1800/1M tokens

Gemini 2.5 Flash image/video input pricing, with output billed at the model output-token rate.

2026-05-28

Qwen
qwen-vl-ocr
Input ¥0.5140/1M tokens
Output ¥1.17/1M tokens

Official Alibaba Cloud Model Studio pricing lists qwen-vl-ocr vision OCR token pricing; per-image generation rows are not converted into token prices.

2026-05-28

Moonshot
moonshot-v1-8k-vision-preview
Input ¥2.00/1M tokens
Output ¥10.00/1M tokens

Official Moonshot V1 vision preview pricing. Image input is billed at token rates; no per-image conversion is applied.

2026-05-28

Moonshot
moonshot-v1-32k-vision-preview
Input ¥5.00/1M tokens
Output ¥20.00/1M tokens

Official Moonshot V1 vision preview pricing. Image input is billed at token rates; no per-image conversion is applied.

2026-05-28

Moonshot
moonshot-v1-128k-vision-preview
Input ¥10.00/1M tokens
Output ¥30.00/1M tokens

Official Moonshot V1 vision preview pricing. Image input is billed at token rates; no per-image conversion is applied.

2026-05-28

Google
Gemini 3 Image/Video
Input ¥7.20/1M tokens
Output ¥32.40/1M tokens

Paid-tier Gemini 3 video input pricing. Output is billed at the model text output-token rate; no seconds or minutes conversion is applied.

2026-05-28

Google
Gemini 2.5 Flash Image/Video
Input ¥1.80/1M tokens
Output ¥10.80/1M tokens
Cache read ¥0.1800/1M tokens

Gemini 2.5 Flash video input pricing with context cache reads where supported. Output is billed at the model output-token rate.

2026-05-28

Moonshot
kimi-k2.6-video-input
Input ¥6.50/1M tokens
Output ¥27.00/1M tokens
Cache read ¥1.10/1M tokens

Per official Kimi pricing, K2.6 video input is billed at token rates. No seconds or minutes conversion is applied.

2026-05-28

Moonshot
kimi-k2.5-video-input
Input ¥4.00/1M tokens
Output ¥21.00/1M tokens
Cache read ¥0.7000/1M tokens

Per official Kimi pricing, K2.5 video input is billed at token rates. No seconds or minutes conversion is applied.

2026-05-28