What happened
OpenAI released GPT-5.4 mini and GPT-5.4 nano on March 17, 2026. The mini model runs more than 2x faster than GPT-5 mini, approaches GPT-5.4 on benchmarks including SWE-Bench Pro, and is priced at $0.75 per million input tokens. The nano model targets maximum throughput at $0.20 per million input tokens and is available exclusively via API.
Why it matters
These models make GPT-5.4-class capabilities accessible for high-volume production workloads where cost and latency matter more than peak reasoning. The nano tier at $0.20/M input tokens undercuts most open-source inference hosting costs, pressuring the economics of self-hosted models for classification, extraction, and lightweight generation tasks.
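To make the economics concrete, here is a back-of-envelope cost sketch. The per-token prices come from the announcement above; the request volume and tokens-per-request figures are illustrative assumptions, and output-token costs (not stated in the announcement) are excluded.

```python
# Input-token prices from the announcement, converted to USD per token.
NANO_INPUT_PRICE = 0.20 / 1_000_000   # GPT-5.4 nano: $0.20 per 1M input tokens
MINI_INPUT_PRICE = 0.75 / 1_000_000   # GPT-5.4 mini: $0.75 per 1M input tokens

def monthly_input_cost(requests_per_day: int, tokens_per_request: int,
                       price_per_token: float, days: int = 30) -> float:
    """Input-token cost only; output tokens would add to this."""
    return requests_per_day * tokens_per_request * price_per_token * days

# Hypothetical workload: 1M classification requests/day, ~500 input tokens each.
nano = monthly_input_cost(1_000_000, 500, NANO_INPUT_PRICE)
mini = monthly_input_cost(1_000_000, 500, MINI_INPUT_PRICE)
print(f"nano: ${nano:,.0f}/month, mini: ${mini:,.0f}/month")
# → nano: $3,000/month, mini: $11,250/month
```

At that volume, nano's input bill lands around $3,000/month, which is the scale at which it starts competing with the fixed GPU cost of self-hosting a small open-source model for the same classification traffic.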
Who should pay attention
- Developers running high-volume API workloads (classification, extraction, summarization) who want frontier-adjacent quality at commodity pricing
- Teams evaluating build-vs-buy for lightweight LLM tasks
- Mobile and edge developers who need fast, cheap inference for user-facing features