Googlebot is not a single crawler but part of a centralized crawling platform used by various Google services, such as Google Search, Google Shopping, and AdSense. The name "Googlebot" is a holdover from the days when Google operated only one crawler. Each client using the platform sets its own fetch parameters, including user-agent strings and byte limits for fetched URLs.
Byte Limits and Fetching Behavior
Googlebot fetches up to 2MB per URL (excluding PDFs, which have a 64MB limit). Other crawlers have different limits, with a default of 15MB for those without specific settings. When a page exceeds 2MB, Googlebot fetches only the first 2MB, including HTTP headers, and ignores any remaining bytes. This partial fetch is treated as the complete file for indexing and rendering purposes. Resources referenced within the HTML (except media and fonts) are fetched separately, each with its own byte limit.
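The truncation behavior described above can be sketched in a few lines of Python. The helper name and the synthetic page are mine; only the 2MB figure comes from the episode:

```python
# Sketch: Googlebot-style truncation. When a response exceeds the limit,
# only the first `limit` bytes are kept and treated as the complete file.
GOOGLEBOT_HTML_LIMIT = 2 * 1024 * 1024  # 2MB, per the episode

def truncated_fetch(body: bytes, limit: int = GOOGLEBOT_HTML_LIMIT) -> bytes:
    """Return the portion of `body` a 2MB-limited fetcher would see."""
    return body[:limit]

# A 3MB page: everything past the 2MB mark is invisible to the crawler.
page = b"<html>" + b"x" * (3 * 1024 * 1024)
seen = truncated_fetch(page)
assert len(seen) == GOOGLEBOT_HTML_LIMIT
```

Everything past the cutoff simply never reaches the indexing pipeline, which is why the position of content within the file matters.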
Implications for Web Content
Most web pages are well below the 2MB limit, but pages with large inline base64 images, extensive inline CSS/JavaScript, or large menus risk pushing critical content beyond the cutoff. Content beyond the 2MB limit is not fetched, rendered, or indexed, effectively making it invisible to Googlebot.
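A rough way to check whether a page is at risk is to compare the byte offset of critical tags against the cutoff. This sketch (the oversized inline blob and the helper are synthetic, for illustration only) shows a late canonical link falling past the limit:

```python
LIMIT = 2 * 1024 * 1024  # 2MB cutoff from the text above

def first_offset(html: bytes, needle: bytes) -> int:
    """Byte offset of `needle` in `html`, or -1 if absent."""
    return html.find(needle)

# Pad the body with a large inline base64-style blob, then place the
# canonical link after it -- past the 2MB mark.
blob = b'<img src="data:image/png;base64,' + b"A" * (2 * 1024 * 1024) + b'">'
html = (b"<html><head><title>ok</title></head><body>"
        + blob
        + b'<link rel="canonical" href="/">'
        + b"</body></html>")

assert first_offset(html, b"<title") < LIMIT           # title survives
assert first_offset(html, b'rel="canonical"') > LIMIT  # canonical is cut off
```

The same check could be run against a real page's raw bytes to spot tags that a truncated fetch would drop.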
Rendering Process
After fetching, the Web Rendering Service (WRS) processes the retrieved bytes by executing JavaScript and CSS to understand the page’s final visual and textual state. WRS does not fetch images or videos and applies the 2MB limit per resource. It operates statelessly, clearing local storage and session data between requests, which can affect how dynamic JavaScript elements are interpreted.
Best Practices for Webmasters
- Keep HTML lean: Move heavy CSS and JavaScript to external files, as these are fetched separately.
- Order critical elements early: Place meta tags, <title>, <link> elements, canonical tags, and essential structured data near the top of the HTML so they are not cut off.
- Monitor server performance: Slow server responses cause fetchers to back off, reducing crawl frequency.
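As a rough diagnostic for the first point, one could total the bytes spent on inline <script> and <style> bodies. This regex-based sketch is my own and only approximate (a regex is not a full HTML parser), but it illustrates the idea:

```python
import re

# Matches inline <script>...</script> and <style>...</style> bodies.
# External scripts (<script src="...">) have empty bodies and add nothing.
_INLINE = re.compile(r"<(script|style)[^>]*>(.*?)</\1>", re.S | re.I)

def inline_weight(html: str) -> int:
    """Approximate bytes consumed by inline script and style bodies."""
    return sum(len(m.group(2).encode()) for m in _INLINE.finditer(html))

sample = "<style>" + "a" * 50 + "</style><script src='app.js'></script>"
assert inline_weight(sample) == 50
```

Pages where this number dominates the total size are good candidates for moving code into external files, which are fetched under their own byte limits.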
Additional Notes
The 2MB limit is not fixed and may evolve as web content changes. Crawling is a complex process that runs at very large scale, and understanding these byte limits helps ensure important content remains accessible to Googlebot.
The information is based on insights shared in episode 105 of the Search Off the Record podcast, posted by Gary.