Why does Swiftype crawl/index fewer pages than are on my site?

system · December 21, 2015, 6:04pm

We have noticed that the number of crawled results in the Contents tab is fewer than the actual number of pages on the site. How can we crawl all of our content?

Here are some of the most common reasons Swiftype might not be able to crawl all of your site’s content:

The missing page content isn’t linked to from other known parts of the site, nor included in the domain’s sitemap.xml file.
There are path rules configured in the Domain settings page that restrict the crawler to specific section(s) of the site.
Your site has a robots.txt file that disallows Swiftbot or all search agents from specific paths in the domain.
Pages on the site have robots meta tags set to noindex and/or nofollow.
The site template uses canonical tags that are configured to point at a URL different from the one you expect to be indexed.
The missing content was added to your site after the last full recrawl. In this case, requesting a recrawl from the Domains section of the dashboard should correct the issue.
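To rule out the robots.txt cause, it can help to test a few of the missing URLs against your rules locally. The following is a minimal sketch using Python's standard `urllib.robotparser`; the sample rules, the `example.com` URLs, and the `blocked_urls` helper are made up for illustration, and the parser's matching may differ in edge cases from how Swiftbot itself interprets the file.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Swiftbot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # Parse the rules from a string so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt content; paste in your own file's text instead.
SAMPLE_ROBOTS = """\
User-agent: Swiftbot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""

# Hypothetical URLs that are missing from the index.
SAMPLE_URLS = [
    "https://example.com/",
    "https://example.com/private/report.html",
    "https://example.com/tmp/cache.html",
]

if __name__ == "__main__":
    for url in blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS):
        print("Disallowed for Swiftbot:", url)
```

Note that in this sample the `/tmp/` rule does not apply to Swiftbot, because a crawler with its own `User-agent` group follows that group rather than the `*` group.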

Troubleshooting and addressing any of the above will help your Swiftype crawl be more successful, and it will also help with other search engines such as Google.
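For the page-level causes (robots meta tags and canonical tags), a quick way to inspect a single page is to parse its HTML and report what it declares. The sketch below uses only Python's standard `html.parser`; the `IndexingHints` class, the `inspect_page` helper, and the sample markup are hypothetical, and the canonical check is a plain string comparison that does not normalize trailing slashes or relative URLs.

```python
from html.parser import HTMLParser


class IndexingHints(HTMLParser):
    """Collects robots meta directives and the canonical link from a page."""

    def __init__(self):
        super().__init__()
        self.robots = []       # e.g. ["noindex", "nofollow"]
        self.canonical = None  # href of <link rel="canonical">, if present

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attributes.get("name", "").lower() == "robots":
            content = attributes.get("content", "")
            self.robots += [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag == "link" and "canonical" in attributes.get("rel", "").lower().split():
            self.canonical = attributes.get("href") or None


def inspect_page(html, url):
    """Summarize the tags on a page that can keep it out of the index."""
    hints = IndexingHints()
    hints.feed(html)
    return {
        "noindex": "noindex" in hints.robots,
        "nofollow": "nofollow" in hints.robots,
        "canonical": hints.canonical,
        # Simplified check: exact string comparison only.
        "canonical_mismatch": hints.canonical is not None and hints.canonical != url,
    }


# Hypothetical page markup; replace with the HTML of a page that is missing.
SAMPLE_HTML = """
<html><head>
  <meta name="robots" content="noindex, nofollow">
  <link rel="canonical" href="https://example.com/products">
</head><body>Product list</body></html>
"""

if __name__ == "__main__":
    print(inspect_page(SAMPLE_HTML, "https://example.com/products?page=2"))
```

If `noindex` is true or `canonical_mismatch` is true for a page you expect to see in the Contents tab, that tag is a likely reason it is missing.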
