Can the crawler "go through" blacklisted pages?

I wanted to know whether the crawler would still pass through a blacklisted page, even though the blacklisted page itself is not indexed.

My use case is a summary page that lists other pages. The summary page itself should not be indexed, but the linked pages should be. So if I blacklist the summary page, will that prevent the crawler from reaching the linked pages?

Summary page (blacklisted, should not be indexed)

  • linked page 1 (should be indexed)
  • linked page 2 (should be indexed)
  • linked page 3 (should be indexed)

Hi Valentin,

If a path is blacklisted, it is deemed invalid for crawling and the search agent will refrain from accessing it altogether.

Based on your description, it sounds as though you want to use robots meta tags instead. These let you tell the crawler to follow the links on a page without indexing the page itself. There’s more information here: Meta Tags | Swiftype Documentation
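
For example, assuming the crawler honors the standard robots directives described in that documentation, adding this to the <head> of the summary page keeps it out of the index while still allowing its links to be followed:

    <!-- Summary page: do not index this page, but do follow its links -->
    <meta name="robots" content="noindex, follow">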

Another possible solution is to ensure the content you do want to be indexed can be discovered via sitemap: Sitemap.xml Support | Swiftype Documentation
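
As a rough sketch, a minimal sitemap.xml that lists the linked pages directly (the URLs below are placeholders) looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- Placeholder URLs; list each page that should be indexed -->
      <url><loc>https://example.com/linked-page-1</loc></url>
      <url><loc>https://example.com/linked-page-2</loc></url>
      <url><loc>https://example.com/linked-page-3</loc></url>
    </urlset>

That way the crawler can discover the pages even if no crawlable path links to them.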

Hi Mike,
Thanks a lot for your answer, that makes perfect sense!
