What happened
OpenAI released GPT-5.4 mini and GPT-5.4 nano on March 17, 2026. The mini model runs more than 2x faster than GPT-5 mini, approaches GPT-5.4 on benchmarks including SWE-Bench Pro, and is priced at $0.75 per million input tokens. The nano model targets maximum throughput at $0.20 per million input tokens and is available exclusively via API.
Why it matters
These models make GPT-5.4-class capabilities accessible for high-volume production workloads where cost and latency matter more than peak reasoning. The nano tier at $0.20/M input tokens undercuts most open-source inference hosting costs, pressuring the economics of self-hosted models for classification, extraction, and lightweight generation tasks.
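To make the economics concrete, here is a back-of-envelope cost sketch. The per-token prices come from the announcement above; the request volume and tokens-per-request figures are illustrative assumptions, and output-token costs (not stated in the announcement) are excluded.

```python
# Input-token prices from the announcement, converted to USD per token.
NANO_INPUT_PRICE = 0.20 / 1_000_000   # GPT-5.4 nano: $0.20 per 1M input tokens
MINI_INPUT_PRICE = 0.75 / 1_000_000   # GPT-5.4 mini: $0.75 per 1M input tokens

def monthly_input_cost(requests_per_day: int, tokens_per_request: int,
                       price_per_token: float, days: int = 30) -> float:
    """Input-token cost only; output tokens would add to this."""
    return requests_per_day * tokens_per_request * price_per_token * days

# Hypothetical workload: 1M classification requests/day, ~500 input tokens each.
nano = monthly_input_cost(1_000_000, 500, NANO_INPUT_PRICE)
mini = monthly_input_cost(1_000_000, 500, MINI_INPUT_PRICE)
print(f"nano: ${nano:,.0f}/month, mini: ${mini:,.0f}/month")
# → nano: $3,000/month, mini: $11,250/month
```

At that volume, nano's input bill lands around $3,000/month, which is the scale at which it starts competing with the fixed GPU cost of self-hosting a small open-source model for the same classification traffic.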
Who should pay attention
- Developers running high-volume API workloads (classification, extraction, summarization) who want frontier-adjacent quality at commodity pricing
- Teams evaluating build-vs-buy for lightweight LLM tasks
- Mobile and edge developers who need fast, cheap inference for user-facing features