GPT‑5.4 mini and nano are new, faster, and more efficient small models optimized for coding and subagents, bringing many strengths of GPT‑5.4 to high-volume workloads. GPT‑5.4 mini improves significantly over GPT‑5 mini in coding, reasoning, multimodal understanding, and tool use, running more than twice as fast and nearing GPT‑5.4's performance on benchmarks like SWE-Bench Pro and OSWorld-Verified. GPT‑5.4 nano is the smallest, cheapest GPT‑5.4 variant, suited for speed- and cost-sensitive tasks such as classification, data extraction, ranking, and simpler coding subagents.
These models target workloads where latency impacts user experience, such as responsive coding assistants, fast subagents, computer-using systems interpreting screenshots, and real-time multimodal reasoning. The best model in these cases balances speed, tool reliability, and professional task performance rather than size alone.
## Performance and Use Cases
- GPT‑5.4 mini excels in systems combining models of different sizes, like Codex, where a larger GPT‑5.4 handles planning and judgment, while GPT‑5.4 mini subagents execute narrower subtasks in parallel (e.g., codebase search, file review).
- It is strong on multimodal computer-use tasks, such as quickly interpreting dense UI screenshots.
- On OSWorld-Verified, GPT‑5.4 mini approaches GPT‑5.4 performance and outperforms GPT‑5 mini substantially.
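The planner/subagent split described above can be sketched as a simple fan-out: a larger model produces a plan, and several GPT‑5.4 mini subagents execute narrow subtasks in parallel. This is a minimal illustration, not the Codex implementation; `call_model` is a hypothetical stand-in for a real API call, and the hard-coded subtasks stand in for a plan the larger model would generate.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a model call; in practice this would invoke
# the API with the given model name and return its text output.
def call_model(model: str, task: str) -> str:
    return f"[{model}] result for: {task}"

def plan_and_fan_out(goal: str) -> list[str]:
    # In a real system, a larger model (e.g. gpt-5.4) would produce this
    # plan; the subtasks are hard-coded here to keep the sketch runnable.
    subtasks = [
        f"search codebase for symbols related to: {goal}",
        f"review files touched by: {goal}",
        f"summarize test coverage for: {goal}",
    ]
    # Fan the narrower subtasks out to gpt-5.4-mini subagents in parallel.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        return list(pool.map(lambda t: call_model("gpt-5.4-mini", t), subtasks))
```

Because each subtask is independent, the slow step is the longest single subagent call rather than the sum of all calls, which is where the mini model's speed pays off.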
## Benchmark Scores (xhigh reasoning effort)
| Benchmark | GPT‑5.4 | GPT‑5.4 mini | GPT‑5.4 nano | GPT‑5 mini |
|---|---|---|---|---|
| SWE-Bench Pro | 57.7% | 54.4% | 52.4% | 45.7% |
| Terminal-Bench 2.0 | 75.1% | 60.0% | 46.3% | 38.2% |
| Toolathlon | 54.6% | 42.9% | 35.5% | 26.9% |
| GPQA Diamond | 93.0% | 88.0% | 82.8% | 81.6% |
| OSWorld-Verified | 75.0% | 72.1% | 39.0% | 42.0% |
## Availability and Pricing
- GPT‑5.4 mini is available in API, Codex, and ChatGPT.
- API supports text/image inputs, tool use, function calling, web/file search, computer use, and skills.
- It has a 400k context window.
- Pricing: $0.75 per 1M input tokens, $4.50 per 1M output tokens.
- In Codex, GPT‑5.4 mini uses 30% of GPT‑5.4 quota, enabling cheaper, faster handling of simpler coding tasks and subagent delegation.
- In ChatGPT, it is available to Free and Go users via the “Thinking” feature; for other tiers, it serves as a rate-limit fallback for GPT‑5.4 Thinking.
- GPT‑5.4 nano is API-only, costing $0.20 per 1M input tokens and $1.25 per 1M output tokens.
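As a quick sanity check on the per-token prices quoted above, a small helper can estimate the API cost of a workload. The token counts in the example are illustrative, not from the source.

```python
# Per-1M-token prices quoted above (USD).
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a given token volume."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 2M input + 500k output tokens on each model.
mini = estimate_cost("gpt-5.4-mini", 2_000_000, 500_000)  # 1.50 + 2.25 = 3.75
nano = estimate_cost("gpt-5.4-nano", 2_000_000, 500_000)  # 0.40 + 0.625 = 1.025
```

At this volume, nano costs roughly a quarter of mini, which is why it suits high-volume classification and extraction tasks where mini's extra capability isn't needed.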
Latency estimates consider tool call duration, token sampling, and input tokens, but real-world latency varies. Costs are based on current API pricing and may change.
Overall, GPT‑5.4 mini and nano provide scalable, cost-effective options for workloads requiring fast, reliable responses, especially in coding, subagent workflows, and multimodal computer use.