LLM Pricing Calculator
Compare API pricing across OpenAI, Anthropic, Google, Mistral, and more. Estimate self-hosting costs and find your break-even point.
Last updated: February 17, 2026
1.0M tokens ≈ 750K words ≈ 1,500 pages
| Provider | Models | Monthly Cost | $/1M Tokens |
|---|---|---|---|
| Cohere | 3 models | from $0.07 | from $0.07 |
| Google | 7 models | from $0.14 | from $0.14 |
| OpenAI | 15 models | from $0.16 | from $0.16 |
| Mistral | 7 models | from $0.16 | from $0.16 |
| xAI | 4 models | from $0.29 | from $0.29 |
| DeepSeek | 2 models | from $0.32 | from $0.32 |
| Anthropic | 9 models | from $0.55 | from $0.55 |
API vs Self-Hosting: Cost Comparison
Compare the total cost of using an API provider versus hosting an open-source model yourself. Select models from the API pricing table to compare, or choose below.
Self-Hosting Configuration

- Usage: 1.0M tokens/mo (from your usage settings above)

GPU Calculation

- Precision: FP16
- Throughput per GPU: 330 tok/s
- Capacity per GPU: ~855M tokens/mo
- GPUs needed: 1 (1.0M ÷ 855M ≈ 0.00, rounded up to one GPU)
- VRAM per GPU: 14 GB used of 80 GB
- Cheapest GPU: Vast.ai A100 SXM @ $0.22/hr, or $158.40/mo (GPU only)

Throughput assumes ~330 tok/s aggregate (vLLM, batch size 8); actual numbers vary with configuration.
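The arithmetic behind this panel can be sketched in Python. The numbers come from the page above; `gpu_plan` is an illustrative name, not part of any library:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

def gpu_plan(monthly_tokens, tok_per_s=330, hourly_rate=0.22):
    """Estimate GPUs needed and monthly GPU cost for a target token volume."""
    capacity = tok_per_s * SECONDS_PER_MONTH       # tokens one GPU can serve per month
    gpus = max(1, -(-monthly_tokens // capacity))  # ceiling division, at least one GPU
    cost = gpus * hourly_rate * 24 * 30            # GPU-hours x hourly rate
    return capacity, gpus, cost

cap, gpus, cost = gpu_plan(1_000_000)
# cap = 855,360,000 tokens/mo, gpus = 1, cost = $158.40/mo
```

Note the cost floor: even at 1M tokens/month you pay for a full GPU, which is why APIs win at low volume.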
Self-Hosting Overhead

- Infrastructure setup, engineering time
- Prompt management, monitoring, maintenance, on-call

Self-Hosted GPU: $158.40/mo
+ Ops Overhead: $0.00
Total Self-Hosted: $158.40/mo
Cheapest API: $0.07/mo (Command R7B)
Cumulative Cost Over 12 Months
Showing the top 5 cheapest API models versus a self-hosted 7B model.
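The cumulative curves in this chart are straight lines: fixed monthly spend accumulating over the horizon. A minimal sketch, using the figures from the page above (`cumulative_costs` is an illustrative helper, not a library function):

```python
def cumulative_costs(monthly_api, monthly_gpu, setup_cost=0.0, months=12):
    """Cumulative API vs self-hosted spend over a horizon, month by month."""
    api = [monthly_api * m for m in range(1, months + 1)]
    hosted = [setup_cost + monthly_gpu * m for m in range(1, months + 1)]
    return api, hosted

api, hosted = cumulative_costs(monthly_api=0.07, monthly_gpu=158.40)
# At 1M tokens/mo the cheapest API stays far below self-hosting for all 12 months.
```

Any one-time setup cost shifts the self-hosted line upward, pushing the crossover point further out.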
Key Considerations

- Latency
- Data Privacy
- Customization
- Reliability
Disclaimer
Prices are approximate and based on publicly available pricing pages as of 2026-02-17. Actual costs may vary based on volume discounts, reserved pricing, and provider-specific terms. Always verify current pricing on provider websites before making purchasing decisions.
Frequently Asked Questions
How much does it cost to use the OpenAI API?
OpenAI API pricing varies by model. GPT-4o costs $2.50 per million input tokens and $10 per million output tokens. GPT-4o-mini is much cheaper at $0.15/$0.60. For a typical chatbot processing 1 million tokens/month, expect $5-15/month with GPT-4o.
Is it cheaper to self-host an LLM or use an API?
For low-to-medium usage (under 50M tokens/month), APIs are typically cheaper. Self-hosting becomes cost-effective at high volumes where GPU costs are spread over millions of tokens. Use our calculator to find your specific break-even point.
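With a fixed GPU cost and a flat per-token API price, the break-even volume is a one-line division. A sketch under those simplifying assumptions (real break-evens shift with split input/output pricing and ops overhead):

```python
def break_even_tokens(api_price_per_m, gpu_monthly_cost):
    """Monthly token volume at which flat API spend equals fixed GPU cost."""
    return gpu_monthly_cost / api_price_per_m * 1_000_000

# e.g. a $158.40/mo GPU vs an API charging $0.55 per 1M tokens:
break_even_tokens(0.55, 158.40)  # 288,000,000 tokens/month
```

Below that volume the API is cheaper; above it, the fixed GPU cost amortizes in your favor.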
What is the cheapest LLM API in 2026?
The cheapest LLM APIs in 2026 are Google Gemini 2.0 Flash ($0.10/$0.40 per million tokens), Groq's Llama 3.1 8B ($0.06/$0.06), and Mistral Small ($0.10/$0.30). However, the best value depends on your quality requirements and use case.
How do I calculate LLM API costs for my project?
Estimate your monthly token usage (1 token ≈ 0.75 words), split between input and output. Multiply by the per-million-token price. For example, 5M tokens/month at 60/40 input/output on GPT-4o = (3M × $2.50 + 2M × $10) / 1M = $27.50/month.
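The worked example above can be written out directly. A minimal sketch (`monthly_api_cost` is an illustrative name; prices are GPT-4o's from this page):

```python
def monthly_api_cost(total_tokens, input_share, in_price, out_price):
    """Cost = input tokens x input price + output tokens x output price,
    with prices quoted per 1M tokens."""
    in_tok = total_tokens * input_share
    out_tok = total_tokens - in_tok
    return (in_tok * in_price + out_tok * out_price) / 1_000_000

# 5M tokens/month, 60/40 input/output split, at GPT-4o's $2.50/$10 per 1M:
monthly_api_cost(5_000_000, 0.60, 2.50, 10.00)  # 27.50
```

Because output tokens are typically several times more expensive than input tokens, the input/output split matters as much as total volume.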