Anthropic Launches Prompt Caching for Claude, Reduces Costs by Up to 90%

August 14, 2024 at 6:55:48 PM - Trending πŸ”₯

TL;DR Anthropic's new prompt caching feature for Claude models allows users to fine-tune responses with extensive prompts, reducing costs by up to 90% and latency by up to 85%. Available in beta, it supports conversational assistants, document processing, and more. Pricing varies by model, with cached prompts being significantly cheaper. Notion is already using this feature to enhance its AI assistant, Notion AI.

Anthropic Launches Prompt Caching for Claude, Reduces Costs by Up to 90%

Prompt caching with Claude is now available in beta on the Anthropic API, offering significant cost and latency reductions. This feature allows developers to reuse extensive context across multiple API requests, reducing costs by up to 90% and latency by up to 85% for long prompts. Prompt caching is currently available for Claude 3.5 Sonnet and Claude 3 Haiku, with support for Claude 3 Opus coming soon.

Use Cases

Prompt caching is beneficial in various scenarios:

  • Conversational Agents: Reduces costs and latency for extended conversations with long instructions or uploaded documents.
  • Coding Assistants: Enhances autocomplete and codebase Q&A by maintaining a summarized version of the codebase.
  • Large Document Processing: Incorporates long-form material, including images, without increasing response latency.
  • Detailed Instruction Sets: Allows sharing extensive lists of instructions and examples to fine-tune responses.
  • Agentic Search and Tool Use: Improves performance for tasks involving multiple rounds of tool calls and iterative changes.
  • Long-Form Content Interaction: Enables users to interact with books, papers, documentation, and podcast transcripts by embedding entire documents into the prompt.

Performance Improvements

Early adopters have reported substantial improvements:

  • Chat with a Book: 79% reduction in latency and 90% cost reduction for a 100,000 token cached prompt.
  • Many-Shot Prompting: 31% reduction in latency and 86% cost reduction for a 10,000 token prompt.
  • Multi-Turn Conversation: 75% reduction in latency and 53% cost reduction for a 10-turn conversation with a long system prompt.

Pricing

Cached prompts are priced based on the number of input tokens cached and usage frequency:

  • Claude 3.5 Sonnet: Cache write costs $3.75/MTok, cache read costs $0.30/MTok.
  • Claude 3 Opus: Cache write costs $18.75/MTok, cache read costs $1.50/MTok (coming soon).
  • Claude 3 Haiku: Cache write costs $0.30/MTok, cache read costs $0.03/MTok.

To start using prompt caching, explore the documentation and pricing page on the Anthropic API.

Have more questions on this topic? Ask our AI assistant for in-depth insights.

Read more from sources πŸ‘‡

The Only Digital Marketing Feed You'll Ever Need.

Stay informed your way. Tailored updates when and how you want them. 100% Free.

10,000+ Users

500+ Sources

1000+ Tools

Or

Related Posts

Audit your GA4 account in Minutes

Audit your GA4 account in Minutes

Sponsored
GA4 Auditor
GA4 Auditor

Verified Sponsor

Verified Sponsor

GA4 Auditor is a Verified Sponsor. Want to get featured here? Contact us.

Verified Sponsor
Claudei Launches Analysis Tool for Real-Time Data Insights and Code Execution

Claudei Launches Analysis Tool for Real-Time Data Insights and Code Execution

Anthropic
Anthropic

Official Source

Official Source

Anthropic is a Official Source. The source has been verified by Swipe Insight team.

Official Source
Anthropic Upgrades Claude 3.5 Sonnet and Haiku with New Computer Control Feature Trending ️‍πŸ”₯

Anthropic Upgrades Claude 3.5 Sonnet and Haiku with New Computer Control Feature

Anthropic
Anthropic

Official Source

Official Source

Anthropic is a Official Source. The source has been verified by Swipe Insight team.

Official Source
Anthropic Introduces Claude Enterprise

Anthropic Introduces Claude Enterprise

Anthropic
Anthropic

Official Source

Official Source

Anthropic is a Official Source. The source has been verified by Swipe Insight team.

Official Source
BigQuery ML Integrates Anthropic Claude AI for Generative Text

BigQuery ML Integrates Anthropic Claude AI for Generative Text

Google Cloud
Google Cloud

Official Source

Official Source

Google Cloud is a Official Source. The source has been verified by Swipe Insight team.

Official Source
UK Antitrust Regulator Probes Google's Investment in AI Rival Anthropic

UK Antitrust Regulator Probes Google's Investment in AI Rival Anthropic

Anthropic Launches Claude Android App

Anthropic Launches Claude Android App

Anthropic
Anthropic

Official Source

Official Source

Anthropic is a Official Source. The source has been verified by Swipe Insight team.

Official Source
Claude Introduces Sharing and Remixing for Artifacts

Claude Introduces Sharing and Remixing for Artifacts

Related Tools

GA4 Auditor logo

GA4 Auditor

Verified Tool

Verified Tool

GA4 Auditor is a Verified Tool. Want to get this badge? Contact us.

Verified Tool

Automated GA4 audits with actionable insights

Get Featured Here

Showcase your tool in this list.

Contact Us