
LLM Pricing Calculator

Compare API pricing across OpenAI, Anthropic, Google, Mistral, and more. Estimate self-hosting costs and find your break-even point.

Last updated: February 17, 2026

Pricing data last verified: 2026-02-17 · 47 API Models · 14 GPU Options
1.0M tokens/mo
Presets: 100K · 1M · 10M · 100M · Custom

1.0M tokens ≈ 750K words ≈ 1,500 pages
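The word and page conversions above follow from a rough rule of thumb (~0.75 words per token, ~500 words per page for English prose). A minimal sketch of that conversion, with both ratios treated as approximations:

```typescript
// Rough conversion helpers. The 0.75 words/token and 500 words/page
// figures are approximations for English text, not exact values.
const WORDS_PER_TOKEN = 0.75;
const WORDS_PER_PAGE = 500;

function tokensFromWords(words: number): number {
  return Math.round(words / WORDS_PER_TOKEN);
}

function tokensFromPages(pages: number): number {
  return tokensFromWords(pages * WORDS_PER_PAGE);
}

console.log(tokensFromWords(750_000)); // 1,000,000 tokens ≈ 750K words
console.log(tokensFromPages(1_500));   // 1,000,000 tokens ≈ 1,500 pages
```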

Not sure? A common default is a 70:30 input/output split.

70% input / 30% output

More Input (lower cost) ↔ More Output (higher cost)

Compare your own model pricing

API vs Self-Hosting: Cost Comparison

Compare the total cost of using an API provider versus hosting an open-source model yourself. Select models from the API pricing table to compare, or choose below.

Self-Hosting Configuration

1.0M tokens/mo

From your usage settings above

GPU Calculation

Precision

FP16

Throughput per GPU

330 tok/s

Capacity per GPU

855M/mo

GPUs Needed

1 (1.0M ÷ 855M ≈ 0.001, rounded up to 1)

VRAM per GPU

14 GB / 80 GB

Cheapest: Vast.ai A100 SXM @ $0.22/hr

$158.40/mo (GPU only)

Throughput: ~330 tok/s aggregate (vLLM, batch size 8). Varies with configuration.
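The GPU sizing above follows a simple pattern: monthly capacity per GPU is throughput times seconds per month, GPUs needed is usage divided by capacity rounded up, and GPU cost is hours per month times the hourly rate. A minimal sketch, assuming a 30-day month (720 GPU-hours), which reproduces the 855M tokens/mo and $158.40/mo figures shown:

```typescript
// Sketch of the GPU sizing math above (30-day month assumed).
const SECONDS_PER_MONTH = 30 * 24 * 3600; // 2,592,000 s
const HOURS_PER_MONTH = 30 * 24;          // 720 h

function gpuPlan(monthlyTokens: number, tokensPerSecond: number, pricePerHour: number) {
  const capacityPerGpu = tokensPerSecond * SECONDS_PER_MONTH;            // tokens/mo per GPU
  const gpusNeeded = Math.max(1, Math.ceil(monthlyTokens / capacityPerGpu));
  const monthlyGpuCost = gpusNeeded * pricePerHour * HOURS_PER_MONTH;    // GPU rental only
  return { capacityPerGpu, gpusNeeded, monthlyGpuCost };
}

// 1.0M tokens/mo on a $0.22/hr A100 at 330 tok/s:
// capacityPerGpu ≈ 855.4M, gpusNeeded = 1, monthlyGpuCost = 158.40
console.log(gpuPlan(1_000_000, 330, 0.22));
```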

Self-Hosting Overhead

One-time setup cost ($): infrastructure setup, engineering time

Ongoing ops cost ($/mo): prompt management, monitoring, maintenance, on-call

Self-Hosted GPU

$158.40

+ Ops Overhead

$0.00

Total Self-Hosted

$158.40/mo

Cheapest API

$0.07/mo

Command R7B

Cumulative Cost Over 12 Months

Showing top 5 cheapest API models vs self-hosted 7B.

Solid lines: API providers. Dashed green: self-hosted (7B, 1 GPU). Self-hosted includes the $5,000 setup cost.
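The cumulative curves are computed from the monthly figures above: self-hosting pays the setup cost up front plus a flat GPU-and-ops bill each month, while the API side accrues only its monthly bill. A small sketch of that chart logic (the $5,000 setup and $158.40/mo values are those shown above; the $50/mo API figure is purely illustrative):

```typescript
// Cumulative cost over N months: self-hosted = setup + m × monthly,
// API = m × monthly. Returns one data point per month for charting.
function cumulativeCosts(
  months: number,
  setupCost: number,        // one-time self-hosting setup, e.g. 5000
  selfMonthlyCost: number,  // GPU + ops per month, e.g. 158.40
  apiMonthlyCost: number    // API bill per month at your usage (illustrative here)
) {
  const selfHosted: number[] = [];
  const api: number[] = [];
  for (let m = 1; m <= months; m++) {
    selfHosted.push(setupCost + m * selfMonthlyCost);
    api.push(m * apiMonthlyCost);
  }
  return { selfHosted, api };
}

const { selfHosted, api } = cumulativeCosts(12, 5000, 158.4, 50);
console.log(selfHosted[11], api[11]); // 6900.8 vs 600 after 12 months
```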

Key Considerations

Latency

API: Low, managed by provider
Self: Lowest possible, full control

Data Privacy

API: Data sent to third party
Self: Full data sovereignty

Customization

API: Limited to provider options
Self: Fine-tuning, custom models

Reliability

API: 99.9%+ SLA from provider
Self: You manage uptime

Disclaimer

Prices are approximate and based on publicly available pricing pages as of 2026-02-17. Actual costs may vary based on volume discounts, reserved pricing, and provider-specific terms. Always verify current pricing on provider websites before making purchasing decisions.

Frequently Asked Questions

How much does it cost to use the OpenAI API?

OpenAI API pricing varies by model. GPT-4o costs $2.50 per million input tokens and $10 per million output tokens; GPT-4o-mini is much cheaper at $0.15/$0.60. For a typical chatbot processing 1 million tokens/month, expect roughly $3-10/month with GPT-4o depending on the input/output split (about $4.75 at a 70/30 split).
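For reference, the 70/30 figure works out as follows (a quick sketch using the GPT-4o prices quoted above):

```typescript
// GPT-4o at the prices quoted above: $2.50/M input, $10/M output.
// 1M tokens/month split 70% input / 30% output.
const inputCost = 0.7 * 2.5;   // $1.75
const outputCost = 0.3 * 10.0; // $3.00
console.log(inputCost + outputCost); // 4.75 → roughly $5/month
```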

Is it cheaper to self-host an LLM or use an API?

For low-to-medium usage (under 50M tokens/month), APIs are typically cheaper. Self-hosting becomes cost-effective at high volumes where GPU costs are spread over millions of tokens. Use our calculator to find your specific break-even point.
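A rough way to estimate that break-even point: divide your flat monthly self-hosting cost by the blended API price per million tokens. This is a sketch under simplifying assumptions (a single self-hosted setup can serve the break-even volume, setup amortized over 12 months); the example figures reuse the $158.40/mo GPU cost and $5,000 setup shown above, while the blended API price is illustrative.

```typescript
// Break-even monthly token volume: the usage at which the API bill
// equals the flat self-hosted bill (GPU + ops + amortized setup).
function breakEvenTokensPerMonth(
  apiCostPerMillionTokens: number, // blended input/output $ per 1M tokens
  selfHostedMonthlyCost: number    // GPU + ops + amortized setup, $/mo
): number {
  return (selfHostedMonthlyCost / apiCostPerMillionTokens) * 1_000_000;
}

// Example: $4.75 per 1M blended tokens vs $158.40 GPU + $5,000 setup
// amortized over 12 months (≈ $575/mo) → break-even near 121M tokens/month.
console.log(breakEvenTokensPerMonth(4.75, 575));
```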

What is the cheapest LLM API in 2026?

The cheapest LLM APIs in 2026 are Google Gemini 2.0 Flash ($0.10/$0.40 per million tokens), Groq's Llama 3.1 8B ($0.06/$0.06), and Mistral Small ($0.10/$0.30). However, the best value depends on your quality requirements and use case.

How do I calculate LLM API costs for my project?

Estimate your monthly token usage (1 token ≈ 0.75 words), split between input and output, then multiply each share by its per-million-token price. For example, 5M tokens/month at a 60/40 input/output split on GPT-4o: 3M × $2.50/M + 2M × $10/M = $7.50 + $20.00 = $27.50/month.
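The same calculation as a small helper; the model prices are parameters, and the GPT-4o numbers used in the example are simply those quoted above:

```typescript
// Monthly API cost from token volume, input/output split, and per-million prices.
function monthlyApiCost(
  tokensPerMonth: number,
  inputShare: number,            // e.g. 0.6 for a 60/40 split
  inputPricePerMillion: number,
  outputPricePerMillion: number
): number {
  const inputTokens = tokensPerMonth * inputShare;
  const outputTokens = tokensPerMonth * (1 - inputShare);
  return (
    (inputTokens / 1_000_000) * inputPricePerMillion +
    (outputTokens / 1_000_000) * outputPricePerMillion
  );
}

// 5M tokens/month, 60/40 split, GPT-4o ($2.50 in / $10 out) → 27.5
console.log(monthlyApiCost(5_000_000, 0.6, 2.5, 10));
```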
