Claude Sonnet 4.6 is Anthropic's most advanced Sonnet model, featuring significant upgrades in coding, computer use, long-context reasoning, agent planning, knowledge work, and design, along with a beta 1M token context window. It is now the default model for Free and Pro plans on claude.ai and Claude Cowork, with pricing unchanged from Sonnet 4.5.
Key Improvements and Performance
Sonnet 4.6 offers markedly improved coding skills: early users preferred it over Sonnet 4.5 about 70% of the time, and over the more advanced Claude Opus 4.5 model 59% of the time. It is more consistent and follows instructions better, with less overengineering, fewer hallucinations, and fewer false claims of success. The model performs well on real-world office tasks that previously required Opus-class models and shows major advances in computer use.
Computer Use Capabilities
Sonnet 4.6 can operate software that lacks a modern API by using the computer interface as a person would, clicking and typing on a virtual device. Since computer use was introduced in October 2024, Sonnet models have steadily improved on the OSWorld benchmark, which tests how well an AI interacts with real software on a simulated computer. Sonnet 4.6 users report near human-level performance on complex tasks such as spreadsheet navigation and multi-step web form completion. The model still trails expert humans, but the rapid progress makes computer use increasingly practical for diverse tasks.
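The click-and-type interaction described above can be pictured as an observe-act loop: the model looks at the screen, picks one action, a harness carries it out, and the cycle repeats. The toy sketch below illustrates that loop on a made-up web form. Everything in it is hypothetical: VirtualScreen, scripted_policy, and run_episode are invented for this example, and the scripted policy merely stands in for the model. It is not Anthropic's computer use API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualScreen:
    """Toy stand-in for a GUI form (hypothetical, for illustration only)."""
    fields: dict = field(default_factory=dict)
    focused: Optional[str] = None
    submitted: bool = False

    def click(self, target: str) -> None:
        # Clicking "submit" submits the form; clicking anything else focuses that field.
        if target == "submit":
            self.submitted = True
        else:
            self.focused = target
            self.fields.setdefault(target, "")

    def type_text(self, text: str) -> None:
        # Typing appends to whichever field currently has focus.
        if self.focused is not None:
            self.fields[self.focused] += text


def scripted_policy(screen: VirtualScreen, goal: dict) -> tuple:
    """Placeholder for the model: choose the next action from the screen state."""
    for name, value in goal.items():
        if screen.fields.get(name) != value:
            if screen.focused != name:
                return ("click", name)
            return ("type", value)
    return ("click", "submit")


def run_episode(screen: VirtualScreen, goal: dict, max_steps: int = 20) -> VirtualScreen:
    """Observe the screen, take one action, and repeat until the form is submitted."""
    for _ in range(max_steps):
        if screen.submitted:
            break
        kind, arg = scripted_policy(screen, goal)
        if kind == "click":
            screen.click(arg)
        else:
            screen.type_text(arg)
    return screen
```

Calling run_episode(VirtualScreen(), {"name": "Ada", "email": "ada@example.com"}) fills both fields and then submits the form. The max_steps cap is a design choice for the sketch so that a policy that gets stuck cannot loop forever.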
Safety and Security
Extensive safety evaluations show Sonnet 4.6 is as safe as or safer than previous Claude models, with a warm, honest, prosocial character, strong safety behaviors, and no major misalignment concerns. The model also has improved resistance to prompt injection attacks, a common security risk in which malicious instructions are hidden in websites, and performs comparably to Opus 4.6 in this area.
Long-Context Reasoning and Planning
The 1M token context window enables Sonnet 4.6 to handle entire codebases, lengthy contracts, or multiple research papers in one request, enhancing its long-horizon planning abilities. In the Vending-Bench Arena evaluation, which simulates business management and competition, Sonnet 4.6 demonstrated a strategic approach by investing heavily early on and then shifting focus to profitability, outperforming competitors.
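As a rough illustration of what fitting material into one request involves, the sketch below estimates whether a set of documents fits within a 1M token window. The ratio of 4 characters per token is a common rule of thumb for English text, not the model's actual tokenizer, and the function names and the reserved output budget are assumptions made for this example.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # window size described above
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of text from its character length."""
    # Ceiling division, so any non-empty string counts as at least one token.
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserved_for_output: int = 16_000) -> bool:
    """Return True if all documents plus a reserved output budget fit the window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= CONTEXT_WINDOW_TOKENS
```

An estimate like this is only a first check; real token counts vary with language and content, so a margin below the limit is prudent.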
User Feedback and Applications
Early customers noted broad improvements, especially in frontend coding and financial analysis. Visual outputs from Sonnet 4.6 are described as more polished, with better layouts, animations, and design sensibility, requiring fewer iterations to reach production quality. Additionally, Sonnet 4.6 matches Opus 4.6 performance on OfficeQA, a benchmark assessing enterprise document comprehension, making it a significant upgrade for document-related workloads.
Overall, Claude Sonnet 4.6 offers near Opus-level intelligence at a more accessible price, with enhanced safety, computer use, coding, and long-context reasoning capabilities, making it suitable for a wide range of economically valuable tasks.


