When analyzing index coverage for large sites (100K+ pages), filtering by sitemap is an effective way to narrow the data. If every XML sitemap is submitted individually in Google Search Console (GSC), GSC reports on all the pages it has discovered through them. Segmenting by a specific sitemap then shows how many of that sitemap's pages are indexed vs. not indexed, along with the reasons for non-indexing.
Key Points
- Sitemap Submission: Ensure all XML sitemaps are submitted individually in GSC, not just the sitemap index.
- Content Organization: The method's effectiveness depends on how well content is organized within sitemaps. Clear grouping by sub-folder and content type is essential.
- Filtering Process:
  - Select 'Pages' in GSC.
  - Use the drop-down menu to choose the sitemap of interest.
  - Assess overview metrics and reasons for exclusion.
- Alternative Method: Verifying sub-folders as their own properties in GSC can also be useful, especially when working with the API, but sitemap filtering is generally quicker.
- Adding Sitemaps: If individual sitemaps haven't been submitted, add them to GSC. Note that historical data won't be available immediately.
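
To submit sitemaps individually, you first need the list of child sitemaps behind the sitemap index. The following is a minimal sketch, assuming the index follows the standard sitemaps.org format. The `child_sitemaps` function and the example.com URLs are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def child_sitemaps(index_xml: str) -> list[str]:
    """Return the <loc> value of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    locs = []
    for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS):
        # Skip empty <loc> elements rather than returning blank entries.
        if loc.text and loc.text.strip():
            locs.append(loc.text.strip())
    return locs


# Hypothetical sitemap index; example.com is a placeholder domain.
EXAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemaps/products.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemaps/blog.xml</loc></sitemap>
</sitemapindex>"""

if __name__ == "__main__":
    # Each printed URL is a candidate to add individually in GSC.
    for url in child_sitemaps(EXAMPLE_INDEX):
        print(url)
```

Each URL returned can then be added in GSC one by one, so that it appears as its own option in the sitemap drop-down.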
Sitemap filtering gives a quick read on which sections of a large site have indexing problems, which makes it a practical starting point for SEO analysis.
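
As a rough illustration of the content-organization point above, the sketch below groups a flat list of URLs by top-level sub-folder, so that each group could become its own sitemap. The function names, the `root` bucket, and the `sitemap-<subfolder>.xml` naming pattern are assumptions made for this example, not a convention required by GSC.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_subfolder(urls):
    """Map each top-level path segment to the URLs that sit beneath it."""
    groups = defaultdict(list)
    for url in urls:
        path = urlsplit(url).path
        segments = [s for s in path.split("/") if s]
        # A URL counts as "inside" a sub-folder if it has a deeper segment
        # or ends with a slash (e.g. /blog/ is the blog folder's own page).
        in_folder = len(segments) > 1 or (segments and path.endswith("/"))
        # Top-level pages such as "/" or "/about" go into a "root" bucket.
        key = segments[0] if in_folder else "root"
        groups[key].append(url)
    return dict(groups)


def sitemap_filename(subfolder: str) -> str:
    """Hypothetical naming pattern for the per-folder sitemap file."""
    return f"sitemap-{subfolder}.xml"


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    sample = [
        "https://example.com/",
        "https://example.com/about",
        "https://example.com/blog/",
        "https://example.com/blog/post-1",
        "https://example.com/products/shoes/red",
    ]
    for folder, members in group_by_subfolder(sample).items():
        print(sitemap_filename(folder), len(members))
```

The sitemaps.org protocol limits a single sitemap file to 50,000 URLs, so any group larger than that would need to be split further, for example by content type within the sub-folder.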