OpenAI API Pricing Guide 2026: Every Model Compared

OpenAI now has 15 models available via API. That is not a typo. Fifteen. And the pricing spreads across a 300x range, from $0.05 per million input tokens (GPT-5 Nano) to $15 per million input tokens (o1).
If you are building on the OpenAI API in 2026, picking the wrong model can mean the difference between a $50/month bill and a $5,000/month bill for the same workload. This guide breaks down every model, what it actually costs in practice, and when to use each one.
All prices are per million tokens unless stated otherwise. You can plug your own numbers into our LLM Pricing Calculator to get exact estimates.
The Full OpenAI Pricing Table (February 2026)
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Speed | Best For |
|---|---|---|---|---|---|
| GPT-5.2 | $1.75 | $14.00 | 400K | Medium | Flagship reasoning + vision |
| GPT-5.1 | $1.25 | $10.00 | 400K | Medium | Previous-gen flagship |
| GPT-5 | $1.25 | $10.00 | 400K | Medium | Previous-gen flagship |
| GPT-5 Mini | $0.25 | $2.00 | 400K | Fast | Cost-effective general use |
| GPT-5 Nano | $0.05 | $0.40 | 400K | Very Fast | High-volume simple tasks |
| GPT-4.1 | $2.00 | $8.00 | 1M+ | Fast | Long-context workloads |
| GPT-4.1 Mini | $0.40 | $1.60 | 1M+ | Very Fast | Budget long-context |
| GPT-4.1 Nano | $0.10 | $0.40 | 1M+ | Very Fast | Cheapest long-context |
| GPT-4o | $2.50 | $10.00 | 128K | Fast | Legacy multimodal |
| GPT-4o Mini | $0.15 | $0.60 | 128K | Very Fast | Legacy budget option |
| o1 | $15.00 | $60.00 | 200K | Slow | Deep reasoning |
| o1 Mini | $1.10 | $4.40 | 128K | Medium | Budget reasoning |
| o3 | $2.00 | $8.00 | 200K | Slow | Current-gen reasoning |
| o3 Mini | $0.50 | $2.00 | 200K | Medium | Lightweight reasoning |
| o4 Mini | $1.10 | $4.40 | 200K | Medium | Latest reasoning (small) |
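Every estimate in this guide uses the same arithmetic: input tokens times the input rate plus output tokens times the output rate. Here is a minimal sketch in Python using rates from the table above (the dictionary keys are informal labels for this example, not official API model identifiers):

```python
# Per-million-token rates (input, output) from the pricing table above.
# Keys are informal labels, not official API model identifiers.
PRICES = {
    "gpt-5.2":      (1.75, 14.00),
    "gpt-5-mini":   (0.25, 2.00),
    "gpt-5-nano":   (0.05, 0.40),
    "gpt-4.1":      (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "o1":           (15.00, 60.00),
    "o3":           (2.00, 8.00),
}

def monthly_cost(model: str, input_tokens: float, output_tokens: float) -> float:
    """Cost in USD for a month's usage, with token counts given in raw tokens."""
    inp_rate, out_rate = PRICES[model]
    return (input_tokens / 1e6) * inp_rate + (output_tokens / 1e6) * out_rate

# Example: 8M input + 4M output tokens per month on GPT-5 Mini
print(monthly_cost("gpt-5-mini", 8e6, 4e6))  # 10.0
```

The same function reproduces every monthly figure in the use cases below; only the token counts change.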
Understanding the Model Families
OpenAI's lineup breaks into three distinct families, each with a different pricing philosophy.
GPT-5.x: The Workhorse Family
The GPT-5 family is where most production workloads should land. GPT-5.2 is the current flagship at $1.75/$14, but the real story is the tiering below it.
GPT-5 Mini at $0.25/$2 offers roughly GPT-4o-level quality at a fraction of the cost. GPT-5 Nano at $0.05/$0.40 is absurdly cheap and handles classification, extraction, and simple generation tasks well enough for most pipelines.
All GPT-5 models share a 400K context window, which is generous for the price tier.
GPT-4.1: The Long-Context Specialists
The GPT-4.1 family has a unique selling point: a 1M+ token context window. That is 8x larger than GPT-4o's 128K window.
If you are doing document analysis, code review across large repositories, or processing lengthy transcripts, GPT-4.1 models are the practical choice. Compared with GPT-5.2, GPT-4.1 charges slightly more for input ($2 vs $1.75) but significantly less for output ($8 vs $14), and the Nano variant at $0.10/$0.40 makes long-context work accessible.
o-series: Reasoning Models
The o-series models use chain-of-thought reasoning, which means they think before they answer. This makes them slower and more expensive, but substantially better at math, logic, and multi-step problems.
o1 at $15/$60 is the most expensive model in the lineup and should only be used when you genuinely need deep reasoning. o3 at $2/$8 offers a much better price-to-reasoning ratio for most tasks. o3 Mini and o4 Mini at $0.50-1.10 input are reasonable for adding reasoning to cost-sensitive pipelines.
Real Cost Calculations
Let's put these numbers in context with three common use cases.
Use Case 1: Customer Support Chatbot
Assumptions: 10,000 conversations/month, average 800 tokens input (customer message + context), 400 tokens output (bot response). That is roughly 8M input tokens and 4M output tokens per month.
| Model | Monthly Cost | Quality Trade-off |
|---|---|---|
| GPT-5.2 | $70.00 | Best quality, overkill for support |
| GPT-5 Mini | $10.00 | Good quality, best value |
| GPT-5 Nano | $2.00 | Adequate for scripted flows |
| GPT-4o | $60.00 | No reason to use over GPT-5 Mini |
| GPT-4o Mini | $3.60 | Legacy option, consider GPT-5 Nano |
The winner here is GPT-5 Mini at $10/month. It handles conversational tasks well, and the 400K context window means you can stuff plenty of knowledge base content into the system prompt.
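The table above is just the same rate arithmetic looped over each candidate. A self-contained sketch (rates copied from the pricing table; model names are informal labels):

```python
# Chatbot workload: 8M input tokens, 4M output tokens per month.
RATES = {  # (input $/1M, output $/1M); informal labels, not API identifiers
    "GPT-5.2":     (1.75, 14.00),
    "GPT-5 Mini":  (0.25, 2.00),
    "GPT-5 Nano":  (0.05, 0.40),
    "GPT-4o":      (2.50, 10.00),
    "GPT-4o Mini": (0.15, 0.60),
}
INPUT_M, OUTPUT_M = 8, 4  # millions of tokens per month

# Print each model's monthly cost, cheapest first
for model, (inp, out) in sorted(
    RATES.items(), key=lambda kv: kv[1][0] * INPUT_M + kv[1][1] * OUTPUT_M
):
    print(f"{model:12s} ${INPUT_M * inp + OUTPUT_M * out:.2f}/month")
```

Swapping in your own token counts turns this into a quick model-comparison harness for any workload.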
Try this calculation in our pricing calculator
Use Case 2: Document Analysis Pipeline
Assumptions: Processing 500 documents/month, each averaging 15,000 tokens. Output is structured extraction averaging 2,000 tokens per document. That is 7.5M input tokens and 1M output tokens.
| Model | Monthly Cost | Notes |
|---|---|---|
| GPT-4.1 | $23.00 | Best for very long documents (1M context) |
| GPT-4.1 Mini | $4.60 | Sweet spot for document work |
| GPT-5 Mini | $3.88 | Good if docs fit in 400K context |
| GPT-4.1 Nano | $1.15 | If extraction is simple/structured |
For document analysis, GPT-4.1 Mini is the practical choice. The 1M+ context window means you never need to chunk documents, and at $4.60/month it is hard to justify more expensive options unless quality demands it.
Use Case 3: Code Review Agent
Assumptions: 200 pull requests/month, each containing ~5,000 tokens of diff + 3,000 tokens of context. Agent generates ~2,000 tokens of review comments. That is 1.6M input and 0.4M output tokens.
| Model | Monthly Cost | Notes |
|---|---|---|
| o3 | $6.40 | Best reasoning for complex logic |
| o3 Mini | $1.60 | Good reasoning at lower cost |
| GPT-5.2 | $8.40 | Strong but reasoning models better for code review |
| GPT-4.1 | $6.40 | Good if reviewing large files |
For code review, o3 at $6.40/month is worth it. Reasoning models catch logical errors that standard models miss. If budget is tight, o3 Mini at $1.60 is a reasonable compromise.
Batch API: 50% Off
OpenAI offers a Batch API that processes requests asynchronously (results within 24 hours) at 50% of the standard price. If your workload is not latency-sensitive (nightly data processing, bulk classification, content generation pipelines), this is free money.
For example, that document analysis pipeline drops from $4.60/month to $2.30/month with GPT-4.1 Mini on the Batch API.
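As a quick sanity check on that number, the Batch API discount is a flat 50% multiplier on both input and output rates:

```python
# Document pipeline: 7.5M input + 1M output tokens on GPT-4.1 Mini ($0.40/$1.60).
standard = 7.5 * 0.40 + 1 * 1.60  # $/month at standard pricing
batch = standard * 0.5            # Batch API: 50% off both input and output
print(f"{standard:.2f} {batch:.2f}")  # 4.60 2.30
```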
Cached Input Tokens
OpenAI automatically caches prompt prefixes that are reused across requests. Cached tokens cost 50% less. If your system prompt is 2,000 tokens and you send 10,000 requests, that is 20M tokens that get cached pricing instead of full price.
This matters most for high-volume applications with large system prompts. A support bot with a 3,000-token system prompt processing 50,000 requests/month saves roughly 30-40% on input costs.
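A back-of-the-envelope sketch of that savings estimate. The 1,000-token average user message is an assumption for illustration; the 50% cache discount is the figure stated above:

```python
SYSTEM_TOKENS = 3_000   # cached prompt prefix, repeated on every request
USER_TOKENS = 1_000     # assumed average per-request user content (never cached)
REQUESTS = 50_000       # requests per month
RATE = 0.25             # GPT-5 Mini input rate, $/1M tokens
CACHE_DISCOUNT = 0.5    # cached input tokens cost 50% less

full = (SYSTEM_TOKENS + USER_TOKENS) * REQUESTS / 1e6 * RATE
cached = (SYSTEM_TOKENS * CACHE_DISCOUNT + USER_TOKENS) * REQUESTS / 1e6 * RATE
print(f"savings: {1 - cached / full:.0%}")  # savings: 38%
```

The savings fraction depends only on the ratio of system-prompt tokens to total input tokens, which is why large system prompts benefit most.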
Which Model Should You Actually Use?
Here is the decision framework we use with clients:
- Start with GPT-5 Mini ($0.25/$2). It handles 80% of production use cases at a price point that makes cost optimization unnecessary.
- Move to GPT-5.2 ($1.75/$14) only if you measure a quality gap that affects business metrics. Not vibes, metrics.
- Use GPT-4.1 variants when you need the 1M+ context window. Do not pay for long context if you do not need it.
- Use o3 or o3 Mini for tasks that require multi-step reasoning: math, logic, code analysis, planning. Standard models are cheaper but worse at these tasks.
- Avoid o1 ($15/$60) unless you have a specific, validated need for its reasoning depth. o3 covers most reasoning use cases at 87% less cost.
- Avoid GPT-4o ($2.50/$10) for new projects. GPT-5 Mini is cheaper and generally better. GPT-4o is a legacy model at this point.
The Hidden Costs
Token pricing is not the whole story. Watch for:
- Rate limits: Lower-tier models have higher rate limits. If you hit rate limits on GPT-5.2 and need to add retry logic or queuing, factor in the engineering time.
- Latency: o-series models are slow. If your UX requires fast responses, the reasoning models may not work even if the quality is better.
- Output tokens are expensive: Output tokens cost 4-8x more than input tokens across all models. Design your prompts to get concise outputs. A prompt that says "respond in under 100 words" can cut your output costs in half.
Comparing OpenAI to Alternatives
OpenAI is not the only game in town. Here is how the key models stack up:
- GPT-5 Mini ($0.25/$2) vs Claude Sonnet 4 ($3/$15): GPT-5 Mini is 12x cheaper on input. Sonnet 4 is better at nuanced writing and instruction-following, but for most API use cases, the price difference is hard to justify.
- GPT-5.2 ($1.75/$14) vs Gemini 2.5 Pro ($1.25/$10): Similar pricing, but Gemini offers a 1M token context window vs GPT-5.2's 400K. If you need long context without paying GPT-4.1 prices, Gemini is worth evaluating.
- o3 ($2/$8) vs DeepSeek V3.2 Reasoner ($0.28/$0.42): DeepSeek is roughly 7x cheaper on input and nearly 20x cheaper on output. Quality varies by task, but for cost-sensitive reasoning pipelines, it is worth benchmarking.
Use our LLM Pricing Calculator to compare models side-by-side with your actual usage numbers.
How to Estimate Your Token Usage
Before you can calculate costs, you need to know how many tokens your application will use. Here are practical rules of thumb:
- 1 token is approximately 0.75 English words (or 4 characters)
- A typical chatbot message from a user is 50-200 tokens
- A system prompt ranges from 200 tokens (simple) to 4,000+ tokens (complex agent)
- A page of text is roughly 500 tokens
- A 10-page PDF is roughly 5,000-8,000 tokens
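Those rules of thumb translate directly into a rough estimator. This is a sketch: the 0.75 words-per-token ratio is an approximation for English prose and varies by tokenizer and language.

```python
def estimate_tokens(text: str) -> int:
    """Rough English token estimate: ~0.75 words per token (~4 chars per token)."""
    words = len(text.split())
    return round(words / 0.75)

# ~75 words of English text should land near 100 tokens
sample = " ".join(["word"] * 75)
print(estimate_tokens(sample))  # 100
```

For real projections, count tokens with the model's actual tokenizer; word-count estimates are only for first-pass budgeting.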
The most common mistake teams make is underestimating output tokens. If your model generates verbose responses, output costs dominate the bill. A model that outputs 2,000 tokens per response costs 4x more in output than one that outputs 500 tokens, and on GPT-5.2, at $14 per million output tokens, that difference adds up fast at scale.
Measure your actual token usage for a week before committing to cost projections. The difference between estimated and actual usage is often 2-3x.
Pricing Trends
OpenAI has consistently dropped prices over time. GPT-4o launched at higher prices than what GPT-5 Mini costs today for similar capability. The pattern is clear: flagship models get expensive, then get undercut by the next generation's mid-tier model within 6-12 months.
The practical takeaway: do not over-optimize your model choice. Pick the cheapest model that meets your quality bar, and expect that model to get cheaper or be replaced by something better within a year.
---
Prices verified against [OpenAI's official pricing page](https://developers.openai.com/api/docs/pricing) as of February 2026. Use our [LLM Pricing Calculator](/tools/llm-pricing-calculator) to estimate costs for your specific workload.
Need help building AI into your product?
We design, build, and integrate production AI systems. Talk directly with the engineers who'll build your solution.
Get in touch

Written by
Aniket
Aniket Kulkarni is the founder of Curlscape, an AI consulting firm that helps companies build and ship production AI systems. With experience spanning voice agents, LLM evaluation harnesses, and bespoke AI solutions, he works at the intersection of engineering and applied machine learning. He writes about practical AI implementation, model selection, and the tools shaping the AI ecosystem.
Frequently Asked Questions
What is the cheapest OpenAI API model in 2026?
GPT-5 Nano is the cheapest OpenAI model at $0.05 per million input tokens and $0.40 per million output tokens. For legacy models, GPT-4o Mini at $0.15/$0.60 is also very affordable. GPT-5 Nano is best suited for high-volume simple tasks like classification and entity extraction.
Should I use GPT-4o or GPT-5 Mini for new projects?
GPT-5 Mini ($0.25/$2) is the better choice for new projects. It is cheaper than GPT-4o ($2.50/$10) on input tokens and offers a larger 400K context window compared to GPT-4o's 128K. GPT-4o is effectively a legacy model at this point.
How much does it cost to run a chatbot on OpenAI's API?
A typical customer support chatbot processing 10,000 conversations per month costs roughly $10/month on GPT-5 Mini, $2/month on GPT-5 Nano, or $70/month on GPT-5.2. The exact cost depends on conversation length and complexity. Use our LLM Pricing Calculator to estimate costs for your specific volume.
What is the difference between GPT-5.2 and o3?
GPT-5.2 ($1.75/$14) is a general-purpose model optimized for broad tasks including text, vision, and reasoning. o3 ($2/$8) is a dedicated reasoning model that uses chain-of-thought processing, making it slower but better at math, logic, and multi-step problems. Use GPT-5.2 for general tasks and o3 when reasoning quality matters.