Applebot-Extended: Web Publishers Can Opt Out of AI Training via Robots.txt

June 14, 2024 at 10:18:02 PM

TL;DR Apple's new documentation explains how web publishers can block Applebot-Extended via robots.txt to prevent their content from training Apple's AI models. Apple relies on licensed and public data, not private data. Applebot supports meta tags to control indexing and snippets. Applebot-Extended offers additional control but does not crawl pages. Allowing Applebot-Extended can enhance AI model quality.

Apple has released new documentation regarding the ability to block Applebot-Extended, allowing web publishers to opt out of having their website content used to train Apple’s foundation models for generative AI features. Apple emphasizes that it does not use private user data or interactions for training, relying instead on licensed materials and publicly available data.

Customizing Indexing Rules for Applebot

Applebot supports various robots meta tags in HTML documents to control indexing:

noindex: Prevents the page from being indexed.
nosnippet: Prevents generating a description or web answer for the page.
nofollow: Prevents following any links on the page.
none: Combines noindex, nosnippet, and nofollow.
all: Allows indexing, snippet generation, and link following.

Multiple directives can be combined in a single meta tag as a comma-separated list, or given in separate meta tags, one directive per tag.
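
As a rough illustration of how these directives combine, the sketch below collects the restrictions declared by robots meta tags in a page. It is not Apple's implementation. The names RobotsMetaParser and robots_restrictions are hypothetical, and the rule that restrictions accumulate across tags (so "all" never cancels an explicit "noindex") is an assumption made for this example.

```python
from html.parser import HTMLParser

# The three restrictive directives listed above.
RESTRICTIONS = {"noindex", "nosnippet", "nofollow"}


class RobotsMetaParser(HTMLParser):
    """Collects restrictions from <meta name="robots"> tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.restrictions = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # A content value may hold several comma-separated directives.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token in RESTRICTIONS:
                self.restrictions.add(token)
            elif token == "none":
                # "none" is shorthand for all three restrictions.
                self.restrictions |= RESTRICTIONS
            # "all" adds no restriction. Assumption: it does not remove
            # restrictions declared elsewhere on the page.


def robots_restrictions(html):
    """Return the set of restrictions declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.restrictions
```

Under these assumptions, a page with one tag containing "nosnippet, noindex" and a page with two separate tags for the same directives yield the same set of restrictions.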

Controlling Data Usage

Apple provides an additional user agent, Applebot-Extended, which gives web publishers more control over how their content is used. To opt out, add a rule for it in robots.txt. The example below covers only URLs under /private/; to opt out an entire site, use Disallow: / instead:

User-agent: Applebot-Extended
Disallow: /private/

Applebot-Extended does not crawl webpages but determines how the data crawled by Applebot is used. Allowing Applebot-Extended can help improve Apple’s generative AI models.
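
One way to sanity-check such a rule before deploying it is to run it through Python's standard urllib.robotparser. This is a minimal sketch: the is_allowed helper and the example.com URLs are hypothetical, and Python's matching logic may differ in details from how Apple evaluates robots.txt. Because Applebot-Extended does not crawl, the result only models which paths are opted out of training use, not any fetch behavior.

```python
from urllib.robotparser import RobotFileParser

# The rule from the article: opts out only paths under /private/.
PARTIAL_OPT_OUT = """\
User-agent: Applebot-Extended
Disallow: /private/
"""

# Standard robots.txt syntax for covering the whole site.
FULL_OPT_OUT = """\
User-agent: Applebot-Extended
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent for url (per Python's parser)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/private/report.html", "/blog/post.html"):
        url = "https://example.com" + path
        print(path,
              "partial:", is_allowed(PARTIAL_OPT_OUT, "Applebot-Extended", url),
              "full:", is_allowed(FULL_OPT_OUT, "Applebot-Extended", url))
```

In both files the rule group names only Applebot-Extended, so the regular Applebot crawler remains unaffected and pages can still appear in search results.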

About Search Rankings

Apple Search considers several factors for ranking web search results:

Aggregated user engagement
Relevancy and matching of search terms
Number and quality of links
User location-based signals
Webpage design characteristics

For more details, see Apple's Applebot documentation.

Q&A

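How to opt out of Applebot-Extended using robots.txt?

Add a rule for the Applebot-Extended user agent to robots.txt, for example:

User-agent: Applebot-Extended
Disallow: /private/

This tells Apple not to use content under /private/ to train the foundation models that power generative AI features across Apple products. To opt out the whole site, use Disallow: / instead. Because Applebot-Extended does not crawl, the rule does not affect whether Applebot indexes those pages.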

How to specify multiple directives in a single meta tag for Applebot?

Use a comma-separated list in the content attribute, or give each directive its own meta tag. The first line below uses a single tag; the two lines after it express the same directives as separate tags:

<meta name="robots" content="nosnippet, noindex">
<meta name="robots" content="noindex">
<meta name="robots" content="nosnippet">

Place these meta tags in the <head> section of your HTML document.
