OpenAI Launches GPT-5.4 with Advanced Reasoning, Coding, and Computer Use

GPT‑5.4 is OpenAI's latest frontier model, released across ChatGPT (as GPT‑5.4 Thinking), the API, and Codex, and designed for professional work. GPT‑5.4 Pro is also available for users who need maximum performance on complex tasks. The model brings together recent advances in reasoning, coding, and agentic workflows: it combines the coding strengths of GPT‑5.3‑Codex with improved handling of tools, software environments, and professional tasks such as spreadsheets, presentations, and documents. The aim is accurate, efficient results with less back-and-forth.

Key Features and Improvements

ChatGPT Enhancements: GPT‑5.4 Thinking provides an upfront plan of its reasoning, allowing users to adjust the course mid-response for more aligned final outputs without extra turns. It improves deep web research for highly specific queries and maintains better context for longer, complex questions, resulting in faster, higher-quality, and more relevant answers.
Codex and API Capabilities: GPT‑5.4 is the first general-purpose model with native, state-of-the-art computer-use abilities, enabling agents to operate computers and manage complex workflows across applications. It supports up to 1 million tokens of context, allowing agents to plan, execute, and verify long-horizon tasks. Tool search helps the model work across large tool ecosystems more efficiently without losing intelligence. It is also OpenAI's most token-efficient reasoning model to date, solving problems with fewer tokens than GPT‑5.2, which speeds up processing.

Performance Benchmarks

Benchmark	GPT-5.4	GPT-5.3-Codex	GPT-5.2
GDPval (wins/ties)	83.0%	70.9%	70.9%
SWE-Bench Pro	57.7%	56.8%	55.6%
OSWorld-Verified	75.0%	74.0%*	47.3%
Toolathlon	54.6%	51.9%	46.3%
BrowseComp	82.7%	77.3%	65.8%

*GPT‑5.3‑Codex’s OSWorld-Verified score improved due to a new API parameter preserving image resolution.
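
The gaps in the table can be read off with a few lines of arithmetic. The following is a minimal sketch that copies the scores from the table above and computes GPT‑5.4's lead over GPT‑5.2 in percentage points. The variable and function names are illustrative only and do not come from OpenAI.

```python
# Scores copied from the benchmark table above (percent).
# Tuple order: GPT-5.4, GPT-5.3-Codex, GPT-5.2.
SCORES = {
    "GDPval (wins/ties)": (83.0, 70.9, 70.9),
    "SWE-Bench Pro": (57.7, 56.8, 55.6),
    "OSWorld-Verified": (75.0, 74.0, 47.3),
    "Toolathlon": (54.6, 51.9, 46.3),
    "BrowseComp": (82.7, 77.3, 65.8),
}


def lead_over_gpt52(scores):
    """Return GPT-5.4's lead over GPT-5.2 in percentage points, per benchmark."""
    # Round to one decimal to hide floating-point noise in the subtraction.
    return {name: round(row[0] - row[2], 1) for name, row in scores.items()}


if __name__ == "__main__":
    for name, delta in lead_over_gpt52(SCORES).items():
        print(f"{name}: +{delta} points")
```

On these figures the largest gap is on OSWorld-Verified (27.7 points) and the smallest is on SWE-Bench Pro (2.1 points).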

Knowledge Work

GPT‑5.4 builds on GPT‑5.2’s reasoning with more consistent and polished results across real-world professional tasks. On GDPval, which tests knowledge work across 44 occupations from top U.S. industries, GPT‑5.4 matches or exceeds industry professionals in 83.0% of comparisons, versus 70.9% for GPT‑5.2. The gains show up in spreadsheets, presentations, and factual accuracy:

Spreadsheet modeling tasks scored 87.3% vs. 68.4% for GPT‑5.2.
Human raters preferred GPT‑5.4-generated presentations 68% of the time for aesthetics, visual variety, and image generation.
GPT‑5.4 reduces hallucinations and factual errors: individual claims are 33% less likely to be false, and full responses are 18% less likely to contain errors compared to GPT‑5.2.

Computer Use and Vision

GPT‑5.4 is the first general-purpose model with native computer-use capabilities, excelling at operating computers through code (e.g., Playwright) and UI interactions via mouse and keyboard commands from screenshots. It is steerable via developer messages and configurable for safety levels.

Achieves 75.0% success on OSWorld-Verified (desktop navigation), surpassing GPT‑5.2’s 47.3% and human performance at 72.4%.
Leads browser use benchmarks with 67.3% success on WebArena-Verified and 92.8% on Online-Mind2Web.
Improved visual perception: 81.2% success on MMMU-Pro (visual understanding) and better document parsing with a lower error rate on OmniDocBench.
Supports high-fidelity image inputs up to 10.24 million pixels, enhancing localization, understanding, and click accuracy.
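
To make the pixel ceiling concrete, here is a rough sketch that checks whether an image fits a 10.24-million-pixel budget and, if it does not, computes smaller dimensions that do. It assumes the limit applies to width times height. The function name and the downscaling policy are hypothetical and do not describe how the API itself treats oversized images.

```python
import math

MAX_PIXELS = 10_240_000  # 10.24 million pixels, the figure quoted above


def fit_to_pixel_budget(width, height, max_pixels=MAX_PIXELS):
    """Return (width, height) scaled down, if needed, to fit max_pixels.

    Aspect ratio is preserved approximately: dimensions are floored, and
    clamped for extreme aspect ratios, so the result never exceeds the
    budget. Images already within budget are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width * height <= max_pixels:
        return width, height

    # Uniform scale factor that brings the area down to the budget.
    scale = math.sqrt(max_pixels / (width * height))
    new_w = max(1, min(math.floor(width * scale), max_pixels))
    # Cap the height so the product cannot exceed the budget even when
    # rounding or the clamp above distorts the ratio.
    new_h = max(1, min(math.floor(height * scale), max_pixels // new_w))
    return new_w, new_h
```

For example, a 6400 by 6400 image has four times the allowed area, so this sketch would halve each side to 3200 by 3200, which sits exactly at the budget.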

Coding

GPT‑5.4 combines GPT‑5.3‑Codex’s coding strengths with enhanced knowledge work and computer-use capabilities, excelling in longer, tool-assisted tasks with less manual intervention. It matches or outperforms GPT‑5.3‑Codex on SWE-Bench Pro with lower latency.

/fast mode delivers up to 1.5x faster token velocity without sacrificing intelligence.
Excels at complex frontend tasks with more aesthetic and functional results.
Introduces “Playwright (Interactive),” an experimental Codex skill for visually debugging and testing web and Electron apps during development.

Demonstration Example

A theme park simulation game was created using GPT‑5.4 with Playwright Interactive and image generation. The game features tile-based path placement, ride and scenery construction, guest pathfinding, queueing, and ride cycles. Metrics like money, guest count, happiness, cleanliness, and rating dynamically respond to park layout and guest behavior. Playwright automated browser playtests verified smooth navigation, guest reactions, and UI stability over multiple rounds of play.

GPT‑5.4 represents a significant advance in professional AI capabilities, combining improved reasoning, coding, computer use, and visual perception to deliver faster, more accurate, and contextually aware outputs across a wide range of real-world tasks.
