
LLM Pricing Calculator

Compare API pricing across OpenAI, Anthropic, Google, Mistral, and more. Estimate self-hosting costs and find your break-even point.

Last updated: February 17, 2026

Pricing data last verified: 2026-02-17 · 47 API Models · 14 GPU Options
1.0M tokens/mo
Presets: 100K · 1M · 10M · 100M · Custom

1.0M tokens ≈ 750K words ≈ 1,500 pages
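The word and page conversions above follow from a rough rule of thumb (~0.75 words per token, ~500 words per page for English prose). A minimal sketch of that conversion, with both ratios treated as approximations:

```typescript
// Rough conversion helpers. The 0.75 words/token and 500 words/page
// figures are approximations for English text, not exact values.
const WORDS_PER_TOKEN = 0.75;
const WORDS_PER_PAGE = 500;

function tokensFromWords(words: number): number {
  return Math.round(words / WORDS_PER_TOKEN);
}

function tokensFromPages(pages: number): number {
  return tokensFromWords(pages * WORDS_PER_PAGE);
}

console.log(tokensFromWords(750_000)); // 1,000,000 tokens ≈ 750K words
console.log(tokensFromPages(1_500));   // 1,000,000 tokens ≈ 1,500 pages
```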

Not sure? A common default is a 70:30 input/output split.

70% input / 30% output

More Input (lower cost) ↔ More Output (higher cost)

Compare your own model pricing

API vs Self-Hosting: Cost Comparison

Compare the total cost of using an API provider versus hosting an open-source model yourself. Select models from the API pricing table to compare, or choose below.

Self-Hosting Configuration

1.0M tokens/mo

From your usage settings above

GPU Calculation

Precision

FP16

Throughput per GPU

330 tok/s

Capacity per GPU

855M/mo

GPUs Needed

1 (1.0M ÷ 855M ≈ 0.001, rounded up to 1)

VRAM per GPU

14 GB / 80 GB

Cheapest: Vast.ai A100 SXM @ $0.22/hr

$158.40/mo (GPU only)

Throughput: ~330 tok/s aggregate (vLLM, batch size 8). Varies with configuration.
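The GPU sizing above follows a simple pattern: monthly capacity per GPU is throughput times seconds per month, GPUs needed is usage divided by capacity rounded up, and GPU cost is hours per month times the hourly rate. A minimal sketch, assuming a 30-day month (720 GPU-hours), which reproduces the 855M tokens/mo and $158.40/mo figures shown:

```typescript
// Sketch of the GPU sizing math above (30-day month assumed).
const SECONDS_PER_MONTH = 30 * 24 * 3600; // 2,592,000 s
const HOURS_PER_MONTH = 30 * 24;          // 720 h

function gpuPlan(monthlyTokens: number, tokensPerSecond: number, pricePerHour: number) {
  const capacityPerGpu = tokensPerSecond * SECONDS_PER_MONTH;            // tokens/mo per GPU
  const gpusNeeded = Math.max(1, Math.ceil(monthlyTokens / capacityPerGpu));
  const monthlyGpuCost = gpusNeeded * pricePerHour * HOURS_PER_MONTH;    // GPU rental only
  return { capacityPerGpu, gpusNeeded, monthlyGpuCost };
}

// 1.0M tokens/mo on a $0.22/hr A100 at 330 tok/s:
// capacityPerGpu ≈ 855.4M, gpusNeeded = 1, monthlyGpuCost = 158.40
console.log(gpuPlan(1_000_000, 330, 0.22));
```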

Self-Hosting Overhead

One-time setup cost ($): infrastructure setup, engineering time

Ongoing ops cost ($/mo): prompt management, monitoring, maintenance, on-call

Self-Hosted GPU

$158.40

+ Ops Overhead

$0.00

Total Self-Hosted

$158.40/mo

Cheapest API

$0.07/mo

Command R7B

Cumulative Cost Over 12 Months

Showing top 5 cheapest API models vs self-hosted 7B.

Solid lines: API providers. Dashed green: self-hosted (7B, 1 GPU). Self-hosted includes the $5,000 setup cost.
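The cumulative curves are computed from the monthly figures above: self-hosting pays the setup cost up front plus a flat GPU-and-ops bill each month, while the API side accrues only its monthly bill. A small sketch of that chart logic (the $5,000 setup and $158.40/mo values are those shown above; the $50/mo API figure is purely illustrative):

```typescript
// Cumulative cost over N months: self-hosted = setup + m × monthly,
// API = m × monthly. Returns one data point per month for charting.
function cumulativeCosts(
  months: number,
  setupCost: number,        // one-time self-hosting setup, e.g. 5000
  selfMonthlyCost: number,  // GPU + ops per month, e.g. 158.40
  apiMonthlyCost: number    // API bill per month at your usage (illustrative here)
) {
  const selfHosted: number[] = [];
  const api: number[] = [];
  for (let m = 1; m <= months; m++) {
    selfHosted.push(setupCost + m * selfMonthlyCost);
    api.push(m * apiMonthlyCost);
  }
  return { selfHosted, api };
}

const { selfHosted, api } = cumulativeCosts(12, 5000, 158.4, 50);
console.log(selfHosted[11], api[11]); // 6900.8 vs 600 after 12 months
```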

Key Considerations

Latency

API: Low, managed by provider
Self: Lowest possible, full control

Data Privacy

API: Data sent to third party
Self: Full data sovereignty

Customization

API: Limited to provider options
Self: Fine-tuning, custom models

Reliability

API: 99.9%+ SLA from provider
Self: You manage uptime

Disclaimer

Prices are approximate and based on publicly available pricing pages as of 2026-02-17. Actual costs may vary based on volume discounts, reserved pricing, and provider-specific terms. Always verify current pricing on provider websites before making purchasing decisions.

Frequently Asked Questions

How much does it cost to use the OpenAI API?

OpenAI API pricing varies by model. GPT-4o costs $2.50 per million input tokens and $10 per million output tokens; GPT-4o-mini is much cheaper at $0.15/$0.60. For a typical chatbot processing 1 million tokens/month, expect roughly $3-10/month with GPT-4o depending on the input/output split (about $4.75 at a 70/30 split).
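For reference, the 70/30 figure works out as follows (a quick sketch using the GPT-4o prices quoted above):

```typescript
// GPT-4o at the prices quoted above: $2.50/M input, $10/M output.
// 1M tokens/month split 70% input / 30% output.
const inputCost = 0.7 * 2.5;   // $1.75
const outputCost = 0.3 * 10.0; // $3.00
console.log(inputCost + outputCost); // 4.75 → roughly $5/month
```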

Is it cheaper to self-host an LLM or use an API?

For low-to-medium usage (under 50M tokens/month), APIs are typically cheaper. Self-hosting becomes cost-effective at high volumes where GPU costs are spread over millions of tokens. Use our calculator to find your specific break-even point.
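A rough way to estimate that break-even point: divide your flat monthly self-hosting cost by the blended API price per million tokens. This is a sketch under simplifying assumptions (a single self-hosted setup can serve the break-even volume, setup amortized over 12 months); the example figures reuse the $158.40/mo GPU cost and $5,000 setup shown above, while the blended API price is illustrative.

```typescript
// Break-even monthly token volume: the usage at which the API bill
// equals the flat self-hosted bill (GPU + ops + amortized setup).
function breakEvenTokensPerMonth(
  apiCostPerMillionTokens: number, // blended input/output $ per 1M tokens
  selfHostedMonthlyCost: number    // GPU + ops + amortized setup, $/mo
): number {
  return (selfHostedMonthlyCost / apiCostPerMillionTokens) * 1_000_000;
}

// Example: $4.75 per 1M blended tokens vs $158.40 GPU + $5,000 setup
// amortized over 12 months (≈ $575/mo) → break-even near 121M tokens/month.
console.log(breakEvenTokensPerMonth(4.75, 575));
```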

What is the cheapest LLM API in 2026?

The cheapest LLM APIs in 2026 are Google Gemini 2.0 Flash ($0.10/$0.40 per million tokens), Groq's Llama 3.1 8B ($0.06/$0.06), and Mistral Small ($0.10/$0.30). However, the best value depends on your quality requirements and use case.

How do I calculate LLM API costs for my project?

Estimate your monthly token usage (1 token ≈ 0.75 words), split between input and output, then multiply each share by its per-million-token price. For example, 5M tokens/month at a 60/40 input/output split on GPT-4o: 3M × $2.50/M + 2M × $10/M = $7.50 + $20.00 = $27.50/month.
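The same calculation as a small helper; the model prices are parameters, and the GPT-4o numbers used in the example are simply those quoted above:

```typescript
// Monthly API cost from token volume, input/output split, and per-million prices.
function monthlyApiCost(
  tokensPerMonth: number,
  inputShare: number,            // e.g. 0.6 for a 60/40 split
  inputPricePerMillion: number,
  outputPricePerMillion: number
): number {
  const inputTokens = tokensPerMonth * inputShare;
  const outputTokens = tokensPerMonth * (1 - inputShare);
  return (
    (inputTokens / 1_000_000) * inputPricePerMillion +
    (outputTokens / 1_000_000) * outputPricePerMillion
  );
}

// 5M tokens/month, 60/40 split, GPT-4o ($2.50 in / $10 out) → 27.5
console.log(monthlyApiCost(5_000_000, 0.6, 2.5, 10));
```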
