AI Model Pricing
Compare official AI model pricing for reasoning, text, audio, image, and video APIs. Toggle CNY/USD and view source links, update dates, and billing notes.
Prices are planning estimates only. Always verify with official provider dashboards before making budget decisions.
Standard processing rates for contexts under 270K tokens. Batch: -50%. Data residency: +10%.
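As a rough illustration of how the batch and data-residency modifiers in this note could be applied to a base rate, here is a minimal Python sketch. The function name `adjusted_rate`, the example base rate, and the choice to multiply the two modifiers together when both apply are assumptions for illustration; the note does not say how the provider combines them.

```python
from decimal import Decimal

BATCH_MULTIPLIER = Decimal("0.5")      # "Batch -50%" from the note
RESIDENCY_MULTIPLIER = Decimal("1.1")  # "data residency +10%" from the note


def adjusted_rate(base_rate, batch=False, data_residency=False):
    """Return a per-1M-token rate after applying the optional modifiers.

    Assumption: when both modifiers apply, they are multiplied together.
    """
    rate = Decimal(str(base_rate))
    if batch:
        rate *= BATCH_MULTIPLIER
    if data_residency:
        rate *= RESIDENCY_MULTIPLIER
    return rate


# Hypothetical base rate of 2.00 per 1M tokens (not taken from any provider).
batch_only = adjusted_rate("2.00", batch=True)
residency_only = adjusted_rate("2.00", data_residency=True)
both = adjusted_rate("2.00", batch=True, data_residency=True)
```

With the hypothetical 2.00 base rate, batch alone halves the rate to 1.00, data residency alone raises it to 2.20, and both together give 1.10 under the multiplicative assumption.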
Steady-state price assumes cache reads. The first (cold) request pays the cache-write rate. Batch: -50%.
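To make the cold versus steady-state distinction concrete, the sketch below compares the input-side cost of a first request (cache write) with a later one (cache read). The rates are borrowed from the MiniMax-M2.7 row on this page purely as example numbers. The function name `input_cost`, the workload sizes, and the billing split (cached prefix at the cache rate, remaining prompt at the standard input rate) are assumptions, not a statement of any provider's billing rules.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)

# CNY per 1M tokens, taken from the MiniMax-M2.7 row as example numbers.
INPUT_RATE = Decimal("2.1")
CACHE_READ_RATE = Decimal("0.42")
CACHE_WRITE_RATE = Decimal("2.625")


def input_cost(cached_tokens, fresh_tokens, cold):
    """Input-side cost of one request.

    Assumption: the cached prefix is billed at the cache-write rate on a cold
    request and at the cache-read rate afterwards; the rest of the prompt is
    billed at the standard input rate.
    """
    prefix_rate = CACHE_WRITE_RATE if cold else CACHE_READ_RATE
    return (cached_tokens * prefix_rate + fresh_tokens * INPUT_RATE) / PER_MILLION


# Hypothetical workload: 100,000-token shared prefix, 2,000 fresh tokens.
cold_cost = input_cost(100_000, 2_000, cold=True)   # first request
warm_cost = input_cost(100_000, 2_000, cold=False)  # steady state
```

Under these assumptions the cold request costs ¥0.2667 on the input side and each warm request costs ¥0.0462, which is why the cards quote the cache-read figure as the steady-state price.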
Standard pricing (not promotional). DeepSeek V3→V4 upgrade: the off-peak / 75%-off promo is not included.
Official Mistral pricing lists Mistral Medium 3.5 at $1.50 input / $7.50 output per 1M tokens with model id mistral-medium-latest.
Official Mistral pricing lists Mistral Large 3 at $0.50 input / $1.50 output per 1M tokens with model id mistral-large-latest.
Official Mistral pricing lists Mistral Small 4 at $0.10 input / $0.30 output per 1M tokens with model id mistral-small-latest.
Official Mistral pricing lists Magistral Medium at $2.00 input / $5.00 output per 1M tokens with model id magistral-medium-latest.
Official Mistral pricing lists Magistral Small at $0.50 input / $1.50 output per 1M tokens with model id magistral-small-latest.
Official Mistral pricing lists Devstral 2 at $0.40 input / $2.00 output per 1M tokens with model id devstral-medium-latest.
Official Mistral pricing lists Devstral Small 2 at $0.10 input / $0.30 output per 1M tokens with model id devstral-small-latest.
Official Mistral pricing lists Codestral at $0.30 input / $0.90 output per 1M tokens with model id codestral-latest.
Official Mistral pricing lists Ministral 3 - 3B at $0.10 input / $0.10 output per 1M tokens with model id ministral-3b-latest.
Official Mistral pricing lists Ministral 3 - 8B at $0.15 input / $0.15 output per 1M tokens with model id ministral-8b-latest.
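Since every Mistral row is quoted per 1M tokens, a per-request estimate is a simple proportion. The sketch below copies three of the rows above into a lookup table. The names `MISTRAL_RATES` and `request_cost` and the token counts in the usage lines are hypothetical, and the result is a planning estimate only.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the Mistral rows above.
MISTRAL_RATES = {
    "mistral-medium-latest": (Decimal("1.50"), Decimal("7.50")),
    "mistral-small-latest": (Decimal("0.10"), Decimal("0.30")),
    "codestral-latest": (Decimal("0.30"), Decimal("0.90")),
}


def request_cost(model_id, input_tokens, output_tokens):
    """Estimate the USD cost of one request from per-1M-token rates."""
    input_rate, output_rate = MISTRAL_RATES[model_id]
    total = input_tokens * input_rate + output_tokens * output_rate
    return total / Decimal(1_000_000)


# Hypothetical workloads.
medium = request_cost("mistral-medium-latest", 1_000_000, 200_000)
codestral = request_cost("codestral-latest", 500_000, 100_000)
```

For these example workloads the estimate comes to $3.00 for mistral-medium-latest and $0.24 for codestral-latest.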
Official MiniMax pay-as-you-go pricing lists MiniMax-M2.7 at ¥2.1 input / ¥8.4 output / ¥0.42 cache read / ¥2.625 cache write per 1M tokens.
Official MiniMax pay-as-you-go pricing lists MiniMax-M2.7-highspeed at ¥4.2 input / ¥16.8 output / ¥0.42 cache read / ¥2.625 cache write per 1M tokens.
Official MiniMax pay-as-you-go pricing lists MiniMax-M2.5 at ¥2.1 input / ¥8.4 output / ¥0.21 cache read / ¥2.625 cache write per 1M tokens.
Official MiniMax pay-as-you-go pricing lists MiniMax-M2.5-highspeed at ¥4.2 input / ¥16.8 output / ¥0.21 cache read / ¥2.625 cache write per 1M tokens.
Official MiniMax pay-as-you-go pricing lists M2-her at ¥2.1 input / ¥8.4 output per 1M tokens; cache read/write fields are not listed for this row.
Official Zhipu API pricing lists GLM-5.1 tiered by input length; the first tier is used as the card default, and the tiers field stores the full official schedule.
Official Zhipu API pricing lists GLM-5-Turbo tiered by input length; the first tier is used as the card default, and the tiers field stores the full official schedule.
Official Zhipu API pricing lists GLM-5 tiered by input length; the first tier is used as the card default, and the tiers field stores the full official schedule.
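One way to picture the "first tier as card default, full schedule in tiers" arrangement is the sketch below. The `Tier` class, the helper names, the tier boundaries, and all rates are hypothetical, since the actual schedule is not reproduced in these notes. Treating each upper bound as inclusive is also an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_input_tokens: Optional[int]  # None means no upper bound
    input_rate: Decimal              # per 1M tokens
    output_rate: Decimal             # per 1M tokens


# Hypothetical schedule, ordered by ascending input length.
tiers = [
    Tier(32_000, Decimal("1.0"), Decimal("4.0")),
    Tier(128_000, Decimal("1.5"), Decimal("6.0")),
    Tier(None, Decimal("2.0"), Decimal("8.0")),
]


def card_default(schedule):
    """The price shown on the card: the first (shortest-input) tier."""
    return schedule[0]


def tier_for(schedule, input_tokens):
    """Pick the first tier whose upper bound covers the input length."""
    for tier in schedule:
        if tier.max_input_tokens is None or input_tokens <= tier.max_input_tokens:
            return tier
    raise ValueError("input length exceeds the last tier")
```

The card would display `card_default(tiers)`, while a budget estimate for a 50,000-token prompt would use `tier_for(tiers, 50_000)`, which lands in the second tier of this made-up schedule.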
Official Alibaba Cloud Model Studio pricing lists qwen3.7-max at ¥12 input / ¥36 output per 1M tokens for the 0<Token≤1M tier.
Official Alibaba Cloud Model Studio pricing lists qwen3-max with tiered token pricing; the first tier is used as the card default, and the tiers field stores the full official schedule.
Official Alibaba Cloud Model Studio pricing lists qwen-turbo at ¥0.30 input / ¥0.60 output per 1M tokens for China mainland deployment.
Official Alibaba Cloud Model Studio pricing lists qwen-plus at ¥0.80 input / ¥2.00 output per 1M tokens for the 0-128K tier in China mainland deployment.
Official Alibaba Cloud Model Studio pricing lists qwen-max at ¥2.40 input / ¥9.60 output per 1M tokens in China mainland deployment.
Official Kimi pricing lists kimi-k2.6 at ¥1.10 cache-hit input / ¥6.50 cache-miss input / ¥27.00 output per 1M tokens.
Official Kimi pricing lists kimi-k2.5 at ¥0.70 cache-hit input / ¥4.00 cache-miss input / ¥21.00 output per 1M tokens.
Official Moonshot pricing lists moonshot-v1-32k at ¥5.00 input / ¥20.00 output per 1M tokens.
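Because the Kimi rows split input pricing into cache-hit and cache-miss rates, the effective input price depends on how much of the prompt is served from cache. The sketch below blends the two kimi-k2.6 rates listed above by a hit ratio. The function name `effective_input_rate` and the example ratios are hypothetical, and a simple token-weighted average is assumed.

```python
from decimal import Decimal

# CNY per 1M input tokens, from the kimi-k2.6 row above.
CACHE_HIT_RATE = Decimal("1.10")
CACHE_MISS_RATE = Decimal("6.50")


def effective_input_rate(hit_ratio):
    """Token-weighted input rate for a given cache-hit ratio in [0, 1]."""
    hit = Decimal(str(hit_ratio))
    if not (Decimal(0) <= hit <= Decimal(1)):
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit * CACHE_HIT_RATE + (Decimal(1) - hit) * CACHE_MISS_RATE


# Hypothetical hit ratios.
mostly_cached = effective_input_rate("0.8")
never_cached = effective_input_rate("0")
```

At an 80% hit ratio the blended input rate is ¥2.18 per 1M tokens, compared with the full ¥6.50 when nothing is cached.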
Audio token pricing for voice interactions. Text modality billed at separate text rates.
Paid-tier Gemini 3 audio token pricing. Official page also lists per-minute equivalents for some audio workloads.
Gemini 2.5 Flash audio input pricing, with output billed at the model output-token rate.
Official Zhipu pricing lists GLM-4-Voice audio at ¥80 per 1M tokens; the card keeps the official token unit without minute conversion.
Image token pricing for GPT-Image generation. Text prompts are billed at separate text-token rates.
Paid-tier Gemini 3 image/video input pricing with output billed at the text output-token rate.
Gemini 2.5 Flash image/video input pricing, with output billed at the model output-token rate.
Official Alibaba Cloud Model Studio pricing lists qwen-vl-ocr vision OCR token pricing; per-image generation rows are not converted into token prices.
Official Moonshot V1 vision preview pricing. Image input is billed as token pricing; no per-image conversion is applied.
Paid-tier Gemini 3 video input pricing. Output is billed at the model text output-token rate; no seconds or minutes conversion is applied.
Gemini 2.5 Flash video input pricing with context cache reads where supported. Output is billed at the model output-token rate.
Official Kimi K2.6 supports video input under token pricing. No seconds or minutes conversion is applied.