When two rules in a robots.txt file conflict, the most specific rule wins. For example, if you disallow crawling of /blog/ but allow /blog/shopify-speed-optimizations/, Google will crawl the latter because the Allow rule matches a longer, more specific path. The same principle applies to user agents: a crawler follows only the group of rules written for the most specific user agent that matches it, so if all user agents are disallowed from a section but a separate group allows Googlebot, Googlebot can still crawl it. Use this behavior to make exceptions to broad rules, typically with the Allow command.
In a robots.txt file, the most specific rule takes precedence when commands conflict.
Example:
Disallow: /blog/
Allow: /blog/shopify-speed-optimizations/
Result: Google will crawl /blog/shopify-speed-optimizations/ because the Allow rule is more specific.
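For context, a complete rule group combining these two lines might look like the sketch below. The User-agent: * line is assumed here for illustration; it simply addresses the group to all crawlers.

User-agent: *
Disallow: /blog/
Allow: /blog/shopify-speed-optimizations/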
User-Agent Example:
- All user agents cannot crawl the blog.
- Googlebot can crawl the blog.
Result: Googlebot can crawl the blog, while other user agents cannot.
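Expressed as robots.txt groups, this setup might look like the following sketch (the /blog/ path is assumed for illustration). Because a crawler follows only the group that names it most specifically, Googlebot obeys its own group rather than the broad Disallow.

User-agent: *
Disallow: /blog/

User-agent: Googlebot
Allow: /blog/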
Key Takeaway: Use specific rules to make exceptions to broader rules, such as using the Allow command to permit crawling of particular content within a generally disallowed section.