
OpenAI API Pricing Guide 2026: Every Model Compared

OpenAI now has 15 models available via API. That is not a typo. Fifteen. And the pricing spreads across a 300x range, from $0.05 per million input tokens (GPT-5 Nano) to $15 per million input tokens (o1).

If you are building on the OpenAI API in 2026, picking the wrong model can mean the difference between a $50/month bill and a $5,000/month bill for the same workload. This guide breaks down every model, what it actually costs in practice, and when to use each one.

All prices are per million tokens unless stated otherwise. You can plug your own numbers into our LLM Pricing Calculator to get exact estimates.

The Full OpenAI Pricing Table (February 2026)

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Speed | Best For |
| --- | --- | --- | --- | --- | --- |
| GPT-5.2 | $1.75 | $14.00 | 400K | Medium | Flagship reasoning + vision |
| GPT-5.1 | $1.25 | $10.00 | 400K | Medium | Previous-gen flagship |
| GPT-5 | $1.25 | $10.00 | 400K | Medium | Previous-gen flagship |
| GPT-5 Mini | $0.25 | $2.00 | 400K | Fast | Cost-effective general use |
| GPT-5 Nano | $0.05 | $0.40 | 400K | Very Fast | High-volume simple tasks |
| GPT-4.1 | $2.00 | $8.00 | 1M+ | Fast | Long-context workloads |
| GPT-4.1 Mini | $0.40 | $1.60 | 1M+ | Very Fast | Budget long-context |
| GPT-4.1 Nano | $0.10 | $0.40 | 1M+ | Very Fast | Cheapest long-context |
| GPT-4o | $2.50 | $10.00 | 128K | Fast | Legacy multimodal |
| GPT-4o Mini | $0.15 | $0.60 | 128K | Very Fast | Legacy budget option |
| o1 | $15.00 | $60.00 | 200K | Slow | Deep reasoning |
| o1 Mini | $1.10 | $4.40 | 128K | Medium | Budget reasoning |
| o3 | $2.00 | $8.00 | 200K | Slow | Current-gen reasoning |
| o3 Mini | $0.50 | $2.00 | 200K | Medium | Lightweight reasoning |
| o4 Mini | $1.10 | $4.40 | 200K | Medium | Latest reasoning (small) |
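To turn the table into dollar figures for your own workload, a small helper is enough. This is a minimal sketch using a subset of the prices above; the model-name strings are illustrative keys, not necessarily the exact API identifiers.

```python
# USD per 1M tokens, (input, output), taken from the pricing table above.
PRICES = {
    "gpt-5.2":      (1.75, 14.00),
    "gpt-5-mini":   (0.25, 2.00),
    "gpt-5-nano":   (0.05, 0.40),
    "gpt-4.1":      (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-4o":       (2.50, 10.00),
    "o1":           (15.00, 60.00),
    "o3":           (2.00, 8.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for raw token counts (tokens, not millions)."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# 8M input + 4M output tokens on GPT-5 Mini:
print(cost_usd("gpt-5-mini", 8_000_000, 4_000_000))  # 10.0
```

The same two-line formula (input tokens times input rate, plus output tokens times output rate, divided by one million) underlies every cost table in this post.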

Understanding the Model Families

OpenAI's lineup breaks into three distinct families, each with a different pricing philosophy.

GPT-5.x: The Workhorse Family

The GPT-5 family is where most production workloads should land. GPT-5.2 is the current flagship at $1.75/$14, but the real story is the tiering below it.

GPT-5 Mini at $0.25/$2 offers roughly GPT-4o-level quality at a fraction of the cost. GPT-5 Nano at $0.05/$0.40 is absurdly cheap and handles classification, extraction, and simple generation tasks well enough for most pipelines.

All GPT-5 models share a 400K context window, which is generous for the price tier.

GPT-4.1: The Long-Context Specialists

The GPT-4.1 family has a unique selling point: a 1M+ token context window. That is 8x larger than GPT-4o's 128K window.

If you are doing document analysis, code review across large repositories, or processing lengthy transcripts, GPT-4.1 models are the practical choice. Against GPT-5.2, GPT-4.1 costs slightly more on input ($2.00 vs $1.75) but notably less on output ($8 vs $14), and the Nano variant at $0.10/$0.40 makes long-context work accessible.

o-series: Reasoning Models

The o-series models use chain-of-thought reasoning, which means they think before they answer. This makes them slower and more expensive, but substantially better at math, logic, and multi-step problems.

o1 at $15/$60 is the most expensive model in the lineup and should only be used when you genuinely need deep reasoning. o3 at $2/$8 offers a much better price-to-reasoning ratio for most tasks. o3 Mini and o4 Mini at $0.50-1.10 input are reasonable for adding reasoning to cost-sensitive pipelines.

Real Cost Calculations

Let's put these numbers in context with three common use cases.

Use Case 1: Customer Support Chatbot

Assumptions: 10,000 conversations/month, average 800 tokens input (customer message + context), 400 tokens output (bot response). That is roughly 8M input tokens and 4M output tokens per month.

| Model | Monthly Cost | Quality Trade-off |
| --- | --- | --- |
| GPT-5.2 | $70.00 | Best quality, overkill for support |
| GPT-5 Mini | $10.00 | Good quality, best value |
| GPT-5 Nano | $2.00 | Adequate for scripted flows |
| GPT-4o | $60.00 | No reason to use over GPT-5 Mini |
| GPT-4o Mini | $3.60 | Legacy option, consider GPT-5 Nano |

The winner here is GPT-5 Mini at $10/month. It handles conversational tasks well, and the 400K context window means you can stuff plenty of knowledge base content into the system prompt.

Try this calculation in our pricing calculator
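The support-bot figures can be reproduced in a few lines. The sketch below hardcodes the article's assumptions (10,000 conversations, 800 input / 400 output tokens each) and the prices from the table above; the model-name strings are illustrative labels.

```python
# Reproduce the support-bot cost table: 10,000 conversations/month,
# 800 input tokens + 400 output tokens per conversation.
CONVOS = 10_000
IN_TOK, OUT_TOK = 800, 400

prices = {  # USD per 1M tokens: (input, output)
    "GPT-5.2":    (1.75, 14.00),
    "GPT-5 Mini": (0.25, 2.00),
    "GPT-5 Nano": (0.05, 0.40),
}

for model, (p_in, p_out) in prices.items():
    monthly = CONVOS * (IN_TOK * p_in + OUT_TOK * p_out) / 1_000_000
    print(f"{model}: ${monthly:.2f}/month")
```

Running this prints $70.00 for GPT-5.2, $10.00 for GPT-5 Mini, and $2.00 for GPT-5 Nano, matching the table.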

Use Case 2: Document Analysis Pipeline

Assumptions: Processing 500 documents/month, each averaging 15,000 tokens. Output is structured extraction averaging 2,000 tokens per document. That is 7.5M input tokens and 1M output tokens.

| Model | Monthly Cost | Notes |
| --- | --- | --- |
| GPT-4.1 | $23.00 | Best for very long documents (1M context) |
| GPT-4.1 Mini | $4.60 | Sweet spot for document work |
| GPT-5 Mini | $3.88 | Good if docs fit in 400K context |
| GPT-4.1 Nano | $1.15 | If extraction is simple/structured |

For document analysis, GPT-4.1 Mini is the practical choice. The 1M+ context window means you never need to chunk documents, and at $4.60/month it is hard to justify more expensive options unless quality demands it.

Try this calculation

Use Case 3: Code Review Agent

Assumptions: 200 pull requests/month, each containing ~5,000 tokens of diff + 3,000 tokens of context. Agent generates ~2,000 tokens of review comments. That is 1.6M input and 0.4M output tokens.

| Model | Monthly Cost | Notes |
| --- | --- | --- |
| o3 | $6.40 | Best reasoning for complex logic |
| o3 Mini | $1.60 | Good reasoning at lower cost |
| GPT-5.2 | $8.40 | Strong, but reasoning models are better for code review |
| GPT-4.1 | $6.40 | Good if reviewing large files |

For code review, o3 at $6.40/month is worth it. Reasoning models catch logical errors that standard models miss. If budget is tight, o3 Mini at $1.60 is a reasonable compromise.

Try this calculation

Batch API: 50% Off

OpenAI offers a Batch API that processes requests asynchronously (results within 24 hours) at 50% of the standard price. If your workload is not latency-sensitive (nightly data processing, bulk classification, content generation pipelines), this is free money.

For example, that document analysis pipeline drops from $4.60/month to $2.30/month with GPT-4.1 Mini on the Batch API.
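The batch discount is a straight multiplier, so it is easy to fold into a cost function. A minimal sketch, assuming the GPT-4.1 Mini rates from the table and the flat 50% discount described above:

```python
# Hypothetical GPT-4.1 Mini rates, USD per 1M tokens.
INPUT_PRICE = 0.40
OUTPUT_PRICE = 1.60
BATCH_DISCOUNT = 0.5  # Batch API: 50% off both input and output

def monthly_cost(input_m: float, output_m: float, batch: bool = False) -> float:
    """Cost in USD for input_m / output_m million tokens per month."""
    cost = input_m * INPUT_PRICE + output_m * OUTPUT_PRICE
    return cost * (BATCH_DISCOUNT if batch else 1.0)

print(round(monthly_cost(7.5, 1.0), 2))              # 4.6  (standard)
print(round(monthly_cost(7.5, 1.0, batch=True), 2))  # 2.3  (batch)
```

The trade-off is purely latency: the same requests, same quality, half the price, delivered within a day instead of seconds.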

Cached Input Tokens

OpenAI automatically caches prompt prefixes that are reused across requests. Cached tokens cost 50% less. If your system prompt is 2,000 tokens and you send 10,000 requests, that is 20M tokens that get cached pricing instead of full price.

This matters most for high-volume applications with large system prompts. A support bot with a 3,000-token system prompt processing 50,000 requests/month saves roughly 30-40% on input costs.
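To see where the 30-40% figure comes from, you can model a request as a cached prefix plus fresh tokens. A sketch under the article's assumptions (50% discount on cached input, GPT-5 Mini rates); the numbers and function are illustrative:

```python
INPUT_PRICE = 0.25    # GPT-5 Mini, USD per 1M input tokens
CACHE_DISCOUNT = 0.5  # cached input tokens cost 50% of the standard rate

def input_cost(requests: int, prompt_tokens: int, cached_prefix: int) -> float:
    """Monthly input cost when a shared prompt prefix is served from cache."""
    fresh = prompt_tokens - cached_prefix
    per_request = fresh * INPUT_PRICE + cached_prefix * INPUT_PRICE * CACHE_DISCOUNT
    return requests * per_request / 1_000_000

# Support bot: 50,000 requests/month, 4,000-token prompts,
# 3,000 of which are a reused (cached) system prompt.
full = input_cost(50_000, 4_000, 0)
cached = input_cost(50_000, 4_000, 3_000)
print(f"${full:.2f} -> ${cached:.2f} ({1 - cached/full:.1%} saved)")
```

With three quarters of each prompt cached at half price, input spend drops from $50.00 to $31.25, a 37.5% saving, squarely in the 30-40% range quoted above.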

Which Model Should You Actually Use?

Here is the decision framework we use with clients:

  • Start with GPT-5 Mini ($0.25/$2). It handles 80% of production use cases at a price point that makes cost optimization unnecessary.
  • Move to GPT-5.2 ($1.75/$14) only if you measure a quality gap that affects business metrics. Not vibes, metrics.
  • Use GPT-4.1 variants when you need the 1M+ context window. Do not pay for long context if you do not need it.
  • Use o3 or o3 Mini for tasks that require multi-step reasoning: math, logic, code analysis, planning. Standard models are cheaper but worse at these tasks.
  • Avoid o1 ($15/$60) unless you have a specific, validated need for its reasoning depth. o3 covers most reasoning use cases at 87% less cost.
  • Avoid GPT-4o ($2.50/$10) for new projects. GPT-5 Mini is cheaper and generally better. GPT-4o is a legacy model at this point.

The Hidden Costs

Token pricing is not the whole story. Watch for:

  • Rate limits: Cheaper models generally come with higher rate limits than flagships. If you hit rate limits on GPT-5.2 and need to add retry logic or queuing, factor in the engineering time.
  • Latency: o-series models are slow. If your UX requires fast responses, the reasoning models may not work even if the quality is better.
  • Output tokens are expensive: Output tokens cost 4-8x more than input tokens across all models. Design your prompts to get concise outputs. A prompt that says "respond in under 100 words" can cut your output costs in half.
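The output-token point is worth quantifying. A sketch at GPT-5 Mini rates (8x output-to-input ratio), with illustrative response lengths:

```python
# GPT-5 Mini rates, USD per 1M tokens: output costs 8x input.
P_IN, P_OUT = 0.25, 2.00

def request_cost(in_tok: int, out_tok: int) -> float:
    return (in_tok * P_IN + out_tok * P_OUT) / 1_000_000

verbose = request_cost(800, 600)  # a rambling answer
concise = request_cost(800, 150)  # "respond in under 100 words"
print(f"verbose costs {verbose / concise:.1f}x the concise version")
```

Even though the input is identical, trimming the response from 600 to 150 tokens cuts the per-request cost by almost two thirds, because output dominates the bill.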

Comparing OpenAI to Alternatives

OpenAI is not the only game in town. Here is how the key models stack up:

  • GPT-5 Mini ($0.25/$2) vs Claude Sonnet 4 ($3/$15): GPT-5 Mini is 12x cheaper on input. Sonnet 4 is better at nuanced writing and instruction-following, but for most API use cases, the price difference is hard to justify.
  • GPT-5.2 ($1.75/$14) vs Gemini 2.5 Pro ($1.25/$10): Similar pricing, but Gemini offers a 1M token context window vs GPT-5.2's 400K. If you need long context without paying GPT-4.1 prices, Gemini is worth evaluating.
  • o3 ($2/$8) vs DeepSeek V3.2 Reasoner ($0.28/$0.42): DeepSeek is roughly 7x cheaper for reasoning tasks. Quality varies by task, but for cost-sensitive reasoning pipelines, it is worth benchmarking.

Use our LLM Pricing Calculator to compare models side-by-side with your actual usage numbers.

How to Estimate Your Token Usage

Before you can calculate costs, you need to know how many tokens your application will use. Here are practical rules of thumb:

  • 1 token is approximately 0.75 English words (or 4 characters)
  • A typical chatbot message from a user is 50-200 tokens
  • A system prompt ranges from 200 tokens (simple) to 4,000+ tokens (complex agent)
  • A page of text is roughly 500 tokens
  • A 10-page PDF is roughly 5,000-8,000 tokens

The most common mistake teams make is underestimating output tokens. If your model generates verbose responses, output costs dominate the bill. On GPT-5.2, a 500-token response costs 8x more than 500 tokens of input ($14 vs $1.75 per million tokens).

Measure your actual token usage for a week before committing to cost projections. The difference between estimated and actual usage is often 2-3x.
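The rules of thumb above can be wrapped into a back-of-envelope estimator. This is a planning aid only; for real numbers, run your actual prompts through a tokenizer (such as OpenAI's tiktoken library) and measure.

```python
# Rough token estimate from the rules of thumb above:
# ~0.75 English words per token, or ~4 characters per token.
def estimate_tokens(text: str) -> int:
    by_words = len(text.split()) / 0.75
    by_chars = len(text) / 4
    return round((by_words + by_chars) / 2)  # average the two heuristics

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # ~12
```

Expect real tokenizer counts to differ, especially for code, non-English text, and unusual formatting, which is exactly why a week of measured usage beats any estimate.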

Where Prices Are Headed

OpenAI has consistently dropped prices over time. GPT-4o launched at higher prices than what GPT-5 Mini costs today for similar capability. The pattern is clear: flagship models get expensive, then get undercut by the next generation's mid-tier model within 6-12 months.

The practical takeaway: do not over-optimize your model choice. Pick the cheapest model that meets your quality bar, and expect that model to get cheaper or be replaced by something better within a year.

---

Prices verified against [OpenAI's official pricing page](https://developers.openai.com/api/docs/pricing) as of February 2026. Use our [LLM Pricing Calculator](/tools/llm-pricing-calculator) to estimate costs for your specific workload.

Need help building AI into your product?

We design, build, and integrate production AI systems. Talk directly with the engineers who'll build your solution.

Get in touch

Written by

Aniket

Aniket Kulkarni is the founder of Curlscape, an AI consulting firm that helps companies build and ship production AI systems. With experience spanning voice agents, LLM evaluation harnesses, and bespoke AI solutions, he works at the intersection of engineering and applied machine learning. He writes about practical AI implementation, model selection, and the tools shaping the AI ecosystem.

View all posts →

Frequently Asked Questions

What is the cheapest OpenAI API model in 2026?

GPT-5 Nano is the cheapest OpenAI model at $0.05 per million input tokens and $0.40 per million output tokens. For legacy models, GPT-4o Mini at $0.15/$0.60 is also very affordable. GPT-5 Nano is best suited for high-volume simple tasks like classification and entity extraction.

Should I use GPT-4o or GPT-5 Mini for new projects?

GPT-5 Mini ($0.25/$2) is the better choice for new projects. It is cheaper than GPT-4o ($2.50/$10) on input tokens and offers a larger 400K context window compared to GPT-4o's 128K. GPT-4o is effectively a legacy model at this point.

How much does it cost to run a chatbot on OpenAI's API?

A typical customer support chatbot processing 10,000 conversations per month costs roughly $10/month on GPT-5 Mini, $2/month on GPT-5 Nano, or $70/month on GPT-5.2. The exact cost depends on conversation length and complexity. Use our LLM Pricing Calculator to estimate costs for your specific volume.

What is the difference between GPT-5.2 and o3?

GPT-5.2 ($1.75/$14) is a general-purpose model optimized for broad tasks including text, vision, and reasoning. o3 ($2/$8) is a dedicated reasoning model that uses chain-of-thought processing, making it slower but better at math, logic, and multi-step problems. Use GPT-5.2 for general tasks and o3 when reasoning quality matters.
