Why does Swiftype crawl/index fewer pages than are on my site?


We have noticed that the number of crawled results in the Contents tab is lower than the actual number of pages on the site. How can we crawl all of our content?

Here are some of the most common reasons Swiftype might not be able to crawl all of your site’s content:

  • The missing page content isn’t linked to from other known parts of the site, nor included in the domain’s sitemap.xml file.

  • There are path rules configured in the Domain settings page that restrict the crawler to specific section(s) of the site.

  • Your site has a robots.txt file that disallows Swiftbot or all search agents from specific paths in the domain.

  • Pages on the site have robots meta tags set to noindex and/or nofollow.

  • The site template uses canonical tags that are configured to point at a URL different from the one you expect to be indexed.

Troubleshooting and resolving any of the issues above will help Swiftype crawl more of your site, and it will also help other search engines, such as Google, crawl it.
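
If you want to check a single page for the robots.txt, robots meta tag, and canonical tag issues listed above, a small script can help. The following is a minimal sketch in Python using only the standard library. It is not how Swiftype's crawler works internally; the function name diagnose_page, the helper class, and the example.com URLs are hypothetical, and it assumes you have already saved the page's HTML and the site's robots.txt as strings.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class HeadTagParser(HTMLParser):
    """Collects robots meta directives and the canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots_directives = set()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            # content looks like "noindex, nofollow"
            for directive in attrs.get("content", "").split(","):
                if directive.strip():
                    self.robots_directives.add(directive.strip().lower())
        elif tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href") or None


def diagnose_page(url, html, robots_txt, user_agent="Swiftbot"):
    """Return a list of likely reasons the page at `url` is not indexed.

    Hypothetical helper: it only inspects the strings passed in and
    makes no network requests.
    """
    problems = []

    # 1. Is the user agent disallowed from this path in robots.txt?
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        problems.append("blocked by robots.txt")

    # 2. Does the page carry noindex / nofollow robots meta tags?
    parser = HeadTagParser()
    parser.feed(html)
    for directive in ("noindex", "nofollow"):
        if directive in parser.robots_directives:
            problems.append(f"robots meta tag sets {directive}")

    # 3. Does the canonical tag point at a different URL?
    #    Assumption: a plain string comparison after resolving relative hrefs.
    if parser.canonical:
        canonical_url = urljoin(url, parser.canonical)
        if canonical_url != url:
            problems.append(f"canonical points at {canonical_url}")

    return problems


if __name__ == "__main__":
    robots_txt = "User-agent: Swiftbot\nDisallow: /private/\n"
    html = (
        '<head><meta name="robots" content="noindex, nofollow">'
        '<link rel="canonical" href="https://example.com/other"></head>'
    )
    for problem in diagnose_page("https://example.com/private/page", html, robots_txt):
        print(problem)
```

An empty list means none of these three checks found a problem for that page. The sketch does not cover the first two causes in the list: whether the page is linked from elsewhere or listed in sitemap.xml, and whether path rules in the Domain settings page exclude it.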