Claude Sonnet 4.5 is the leading coding model worldwide, excelling in building complex agents, computer use, reasoning, and math. It supports modern work by using software tools effectively and solving difficult problems. The release includes major product upgrades: checkpoints in Claude Code for saving and rolling back progress, a refreshed terminal interface, a native VS Code extension, and new context editing and memory tools in the Claude API for handling longer and more complex tasks. Claude apps now support code execution and file creation within conversations, and the Claude for Chrome extension is available to Max users.
Developers receive the Claude Agent SDK, the same infrastructure powering Claude Code, allowing them to build advanced agents for a variety of tasks beyond coding. Claude Sonnet 4.5 is Anthropic's most aligned frontier model to date, showing significant improvements in reducing misaligned behaviors like deception, sycophancy, and power-seeking, with enhanced defenses against prompt injection attacks. It operates under AI Safety Level 3 protections, including classifiers that detect potentially dangerous inputs related to chemical, biological, radiological, and nuclear threats, while minimizing false positives.
The model leads in software coding ability on the SWE-bench Verified evaluation and shows a major leap in real-world computer task performance on OSWorld, improving accuracy from 42.2% to 61.4%. It also demonstrates better reasoning, math skills, and domain-specific knowledge in finance, law, medicine, and STEM compared to previous models.
The Claude Agent SDK, developed over six months, addresses challenges in memory management, permission systems, and subagent coordination, and is now publicly available for building custom AI agents. Additionally, a temporary research preview called "Imagine with Claude" showcases real-time software generation by Claude Sonnet 4.5 and is available to Max subscribers for five days.
Frontier Intelligence and Performance
- State-of-the-art on SWE-bench Verified for software coding.
- Maintains focus for over 30 hours on complex, multi-step tasks.
- Leads OSWorld benchmark at 61.4%, up from 42.2% four months prior.
- Demonstrates improved reasoning and math across various benchmarks.
- Excels in domain-specific knowledge for finance, law, medicine, and STEM.
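The OSWorld figures above can be read two ways: as an absolute gain in percentage points or as a relative gain over the earlier score. The short sketch below works through both using only the two numbers quoted in the text; the variable names are illustrative and not part of any benchmark tooling.

```python
# Scores quoted in the text (OSWorld accuracy, in percent).
previous_score = 42.2
current_score = 61.4

# Absolute gain, measured in percentage points.
absolute_gain = current_score - previous_score

# Relative gain, measured as a fraction of the earlier score.
relative_gain = absolute_gain / previous_score

print(f"Absolute gain: {absolute_gain:.1f} percentage points")
print(f"Relative gain: {relative_gain:.1%}")
```

This prints an absolute gain of 19.2 percentage points and a relative gain of 45.5%.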
Product Upgrades
- Checkpoints in Claude Code for progress saving and rollback.
- Refreshed terminal interface and native VS Code extension.
- New context editing and memory tools in Claude API.
- Code execution and file creation integrated into Claude apps.
- Claude for Chrome extension available to Max users.
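To make the idea of saving and rolling back progress concrete, here is a minimal conceptual sketch of a checkpoint stack. It is not Claude Code's implementation, which the text does not describe; the `CheckpointStore` class and its methods are hypothetical names, and the workspace is assumed to be a simple in-memory mapping of file paths to contents.

```python
import copy


class CheckpointStore:
    """Hypothetical in-memory checkpoint stack for a workspace of files."""

    def __init__(self, files=None):
        self.files = dict(files or {})  # path -> contents
        self._snapshots = []

    def save(self):
        """Record the current state and return its checkpoint index."""
        self._snapshots.append(copy.deepcopy(self.files))
        return len(self._snapshots) - 1

    def rollback(self, index):
        """Restore the state saved at `index` and drop later checkpoints."""
        self.files = copy.deepcopy(self._snapshots[index])
        del self._snapshots[index + 1:]


workspace = CheckpointStore({"app.py": "print('v1')"})
checkpoint = workspace.save()

# Make some changes after the checkpoint.
workspace.files["app.py"] = "print('v2')"
workspace.files["notes.txt"] = "scratch"

# Undo them by returning to the saved state.
workspace.rollback(checkpoint)
```

After the rollback, the workspace again contains only the original `app.py`. A real tool would presumably snapshot files on disk rather than a dictionary, but the save-then-restore pattern is the same.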
Safety and Alignment
- Most aligned frontier model released by Anthropic.
- Reduced misaligned behaviors such as deception and power-seeking.
- Improved defenses against prompt injection attacks.
- AI Safety Level 3 protections with classifiers for dangerous content.
- Significant reduction in false positives for content filtering.
Claude Agent SDK
- Infrastructure behind Claude Code now available for developers.
- Supports long-running tasks, permission management, and subagent coordination.
- Enables building capable agents for diverse applications beyond coding.
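The following toy sketch illustrates what permission management and subagent coordination mean in general terms. It does not use or reproduce the Claude Agent SDK's API; `Subagent`, `Coordinator`, and the tool names are hypothetical, and subagents are assumed to be plain Python callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Subagent:
    """Hypothetical subagent: a name, a handler, and the tools it may use."""
    name: str
    handler: Callable[[str], str]
    allowed_tools: Set[str] = field(default_factory=set)


class Coordinator:
    """Hypothetical coordinator that routes tasks and enforces permissions."""

    def __init__(self) -> None:
        self.subagents: Dict[str, Subagent] = {}
        self.log: List[str] = []

    def register(self, agent: Subagent) -> None:
        self.subagents[agent.name] = agent

    def dispatch(self, agent_name: str, tool: str, task: str) -> str:
        agent = self.subagents[agent_name]
        # Permission check: refuse any tool not on the subagent's allow-list.
        if tool not in agent.allowed_tools:
            self.log.append(f"denied {agent_name}:{tool}")
            raise PermissionError(f"{agent_name} may not use {tool}")
        self.log.append(f"allowed {agent_name}:{tool}")
        return agent.handler(task)


coordinator = Coordinator()
coordinator.register(Subagent("reader", lambda t: f"read:{t}", {"read_file"}))
result = coordinator.dispatch("reader", "read_file", "README.md")
```

Here the `reader` subagent may call `read_file` but a request to use any other tool raises `PermissionError` and is recorded in the coordinator's log. Keeping the check in the coordinator, rather than in each subagent, is one simple way to keep permissions in a single place.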
Research Preview: Imagine with Claude
- Real-time software generation with no prewritten code.
- Demonstrates Claude Sonnet 4.5's adaptability and creativity.
- Available to Max subscribers for a limited time.