Anthropic Upgrades Claude 3.5 Sonnet and Haiku with New Computer Control Feature

October 22, 2024 at 5:29:31 PM

TL;DR Anthropic announced an upgraded Claude 3.5 Sonnet and a new model, Claude 3.5 Haiku. The upgraded Claude 3.5 Sonnet shows significant improvements in coding and tool use, while Claude 3.5 Haiku offers state-of-the-art performance at the same cost and speed as its predecessor. A new capability, computer use, lets Claude interact with computers the way people do and is currently in public beta. Early adopters include Asana, Canva, and Replit. The upgraded Claude 3.5 Sonnet is available through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI, with Claude 3.5 Haiku to follow later this month.

Anthropic has announced an upgraded version of Claude 3.5 Sonnet and a new model, Claude 3.5 Haiku. The upgraded Claude 3.5 Sonnet shows significant improvements, especially in coding, and arrives with a groundbreaking capability in public beta: computer use. This allows Claude to interact with computers the way people do, performing tasks such as moving a cursor, clicking buttons, and typing text.

Claude 3.5 Sonnet

The upgraded Claude 3.5 Sonnet demonstrates wide-ranging improvements on industry benchmarks, particularly in agentic coding and tool use tasks. Key performance metrics include:

  • SWE-bench Verified: Improved from 33.4% to 49.0%.
  • TAU-bench: Improved from 62.6% to 69.2% in retail and from 36.0% to 46.0% in the airline domain.
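
To put these figures on a common footing, the short Python sketch below computes the gain for each benchmark in percentage points and as a relative change. It uses only the scores quoted above; the names SCORES, gains, and summarize are illustrative and not part of any Anthropic tooling.

```python
# Scores quoted in the article, as (before, after) percentages.
SCORES = {
    "SWE-bench Verified": (33.4, 49.0),
    "TAU-bench retail": (62.6, 69.2),
    "TAU-bench airline": (36.0, 46.0),
}


def gains(before, after):
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    points = after - before
    relative = points / before * 100
    return round(points, 1), round(relative, 1)


def summarize(scores):
    """Map each benchmark name to its (points, relative) gain."""
    return {name: gains(before, after) for name, (before, after) in scores.items()}


if __name__ == "__main__":
    for name, (points, relative) in summarize(SCORES).items():
        print(f"{name}: +{points} points ({relative}% relative)")
```

A percentage-point gain and a relative gain answer different questions: the airline score rises by fewer points than the SWE-bench Verified score, but from a lower starting value.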

Early feedback indicates significant advancements in AI-powered coding, with companies like GitLab, Cognition, and The Browser Company reporting substantial improvements. Joint pre-deployment testing was conducted by the US AI Safety Institute (US AISI) and the UK AI Safety Institute (UK AISI), confirming the model's safety.

Claude 3.5 Haiku

Claude 3.5 Haiku, the next generation of Anthropic's fastest model, offers improvements across all skill sets at the same cost and speed as its predecessor. It outperforms Claude 3 Opus, Anthropic's previous largest model, on many intelligence benchmarks, particularly in coding tasks:

  • SWE-bench Verified: Scores 40.6%.

Claude 3.5 Haiku is suitable for user-facing products, specialized sub-agent tasks, and generating personalized experiences from large data volumes. It will be available later this month on the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI.

Computer Use Capability

The new computer use capability allows Claude to perform tasks by interacting with computer interfaces. This includes automating repetitive processes, building and testing software, and conducting open-ended tasks like research. Key results on the OSWorld benchmark:

  • Screenshot-only category: Scored 14.9%, compared with 7.8% for the next-best AI system.
  • With more steps allowed to complete each task: Scored 22.0%.

While the capability is still experimental and imperfect, it is expected to improve rapidly. Safety measures include new classifiers to identify misuse and prevent harm.
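
As a rough illustration of what "interacting with computer interfaces" means in practice, the Python sketch below replays a list of cursor, click, and typing actions against a simulated screen. Everything here is hypothetical: FakeScreen, apply_action, run, and the action format are assumptions made for this example and do not represent Anthropic's API. A real integration would receive actions from the model and execute them on an actual desktop.

```python
from dataclasses import dataclass, field


@dataclass
class FakeScreen:
    """Hypothetical stand-in for a desktop; it only records what was done to it."""
    cursor: tuple = (0, 0)
    typed: str = ""
    clicks: list = field(default_factory=list)


def apply_action(screen, action):
    """Apply one action dict (illustrative format) to the simulated screen."""
    kind = action["action"]
    if kind == "mouse_move":
        screen.cursor = tuple(action["coordinate"])
    elif kind == "left_click":
        # A click lands wherever the cursor currently is.
        screen.clicks.append(screen.cursor)
    elif kind == "type":
        screen.typed += action["text"]
    else:
        raise ValueError(f"unsupported action: {kind}")


def run(actions):
    """Replay a sequence of actions on a fresh screen and return the final state."""
    screen = FakeScreen()
    for action in actions:
        apply_action(screen, action)
    return screen


if __name__ == "__main__":
    final = run([
        {"action": "mouse_move", "coordinate": [120, 40]},
        {"action": "left_click"},
        {"action": "type", "text": "hello"},
    ])
    print(final)
```

Rejecting unknown action types with an error, rather than ignoring them, is one simple way to keep such a loop from silently doing the wrong thing while the capability is still experimental.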

Anthropic aims to learn from initial deployments to better understand the potential and implications of increasingly capable AI systems. The company encourages developers to explore the new models and provide feedback to help refine these capabilities.
