AI Token Pricing Guide
LLM API bills usually separate input tokens, output tokens, cached input, reasoning tokens, and add-ons such as search, grounding, tools, images, audio, and storage.
PromptPrice gives a quick estimate from visible text, and a closer, billing-grade estimate when you enter provider-reported token counts or API usage JSON.
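As a rough illustration of how the separate token categories combine into one charge, here is a minimal sketch. The `Rates` and `Usage` types, the per-million-token prices, and the choice to treat reasoning tokens as part of output are all assumptions made for this example; they are not PromptPrice's implementation or any provider's real rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical USD prices per one million tokens (not real provider rates)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


@dataclass(frozen=True)
class Usage:
    input_tokens: int         # total input tokens, including cached ones
    cached_input_tokens: int  # portion of the input served from cache
    output_tokens: int        # assumed here to already include reasoning tokens


def token_cost(usage: Usage, rates: Rates) -> float:
    """Return the token-only cost in USD, before any add-on fees."""
    uncached = usage.input_tokens - usage.cached_input_tokens
    if uncached < 0:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    total = (
        uncached * rates.input_per_m
        + usage.cached_input_tokens * rates.cached_input_per_m
        + usage.output_tokens * rates.output_per_m
    )
    return total / 1_000_000


# Example with made-up numbers: 10,000 input tokens (4,000 cached), 1,500 output.
example_rates = Rates(input_per_m=2.00, cached_input_per_m=0.50, output_per_m=8.00)
example_usage = Usage(input_tokens=10_000, cached_input_tokens=4_000, output_tokens=1_500)
example_cost = token_cost(example_usage, example_rates)
```

Some providers report reasoning tokens separately or price them differently, so check the provider's pricing page before relying on the single output rate used here.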
Quick estimate
No API key needed. Pick a model, paste your question, choose the answer length, and get a close cost estimate.
Provider pricing pages
OpenAI
Includes text models with cached input support on selected tiers.
Anthropic
Prompt caching and batch behavior vary by model.
Google Gemini
Long context and modality pricing can differ by product surface.
DeepSeek
Cache-hit and cache-miss input prices are listed separately.
Mistral
Some tools and document features are non-token priced.
Groq
High-throughput hosted models with provider-specific add-ons.
Together AI
Serverless token pricing and dedicated deployment pricing differ.
Fireworks AI
Serverless, batch, and priority deployments can have different rates.
xAI
Reasoning tokens and server-side tools can add billed usage.
Cohere
Some current models use free-until-limit or product-specific pricing.
Perplexity
Sonar models add search-context request fees and, for Deep Research, citation/reasoning/search-query fees.
OpenRouter
Router-native entries are listed first; most OpenRouter model costs are passed through per model/provider route.
Cerebras
High-speed inference models with public model metadata and pay-per-token pricing.
Is a pasted prompt estimate exact?
No. The token count of pasted text is estimated unless you enter provider-reported token counts or API usage JSON.
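The difference between the two paths can be sketched as follows. The four-characters-per-token ratio is a common rule of thumb for English text, not PromptPrice's actual method, and the function names are hypothetical. The usage field names (`prompt_tokens`/`completion_tokens` and `input_tokens`/`output_tokens`) are two shapes that provider responses commonly use; other providers may name them differently.

```python
import json
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from visible text (assumed rule of thumb, not a tokenizer)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def tokens_from_usage(usage_json: str) -> tuple[int, int]:
    """Read provider-reported (input, output) token counts from API usage JSON."""
    data = json.loads(usage_json)
    usage = data.get("usage", data)  # accept a full response or a bare usage object
    in_tok = usage.get("input_tokens", usage.get("prompt_tokens"))
    out_tok = usage.get("output_tokens", usage.get("completion_tokens"))
    if in_tok is None or out_tok is None:
        raise ValueError("unrecognized usage shape")
    return int(in_tok), int(out_tok)


# Estimated path: only the visible text is known.
approx_input = estimate_tokens("Summarize the attached report in three bullet points.")

# Reported path: counts come from the provider, so no estimation is involved.
reported = tokens_from_usage('{"usage": {"prompt_tokens": 12, "completion_tokens": 30}}')
```

The estimated path cannot see system prompts or tool definitions, which is one reason it can undercount compared with the provider-reported figures.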
Why do invoices differ from visible text cost?
Tools, cache writes, search, grounding, long-context tiers, hidden system prompts, and region multipliers can change the final bill.
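To make the gap concrete, the sketch below adds non-token fees and a region multiplier on top of a token-only cost. The function name, the fee labels, and every number are hypothetical, and applying the multiplier to token cost only (not to add-on fees) is an assumption; providers differ on this.

```python
def invoice_estimate(
    token_cost_usd: float,
    addon_fees_usd: dict[str, float],
    region_multiplier: float = 1.0,
) -> float:
    """Combine token cost with add-on fees (assumed model, not a provider formula)."""
    if region_multiplier <= 0:
        raise ValueError("region_multiplier must be positive")
    # Assumption: the multiplier scales token charges only; add-ons are flat fees.
    return token_cost_usd * region_multiplier + sum(addon_fees_usd.values())


# Made-up example: the visible-text estimate was $0.026, but the request also
# triggered one search call and a cache write, in a region with a 10% uplift.
visible_text_cost = 0.026
billed = invoice_estimate(
    visible_text_cost,
    {"search": 0.010, "cache_write": 0.002},
    region_multiplier=1.10,
)
difference = billed - visible_text_cost
```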