How to index private pay-wall content

maurice · April 5, 2017, 5:56am

Hello,

I’m wondering whether the API supports private (non-public) content.

Our site has a paywall, so 99% of the content (thousands of pages) is not available to the public or to search crawlers. Is it possible to add this sort of content to the web crawler without exposing paywalled content publicly? Said another way, is it possible to provide non-public content/index to the crawler via a private

We also have thousands of private PDFs. I have seen that you support PDF crawling, but again, will this work with non-public PDFs? I can see it being possible to create a PDF search cache index and push it, as per point #3 above, to Swiftype privately via an API?

Thanks.

cpatton · April 14, 2017, 11:47pm

Hey Maurice,

Yes, it does!

To index private content using our Crawler, you’ll need to give our search agent access to the paywalled content by whitelisting either our search agent’s IP block or your Swiftype account-specific User Agent ID.

To best prevent public exposure of paywalled content in search results, you could index the paid content in its own engine, restrict public-facing search requests to the public engine, and then customize the SERP experience for logged-in members by querying both the public and premium content engines.
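As a rough illustration of the two-engine setup described above, here is a minimal Python sketch. The engine slugs (`public-content`, `premium-content`) and the `search_engine` callable are hypothetical placeholders, not Swiftype names; the only point is that the server decides which engines a visitor may query based on login status.

```python
from typing import Callable, Dict, List

# Hypothetical engine slugs; substitute the slugs of your own engines.
PUBLIC_ENGINE = "public-content"
PREMIUM_ENGINE = "premium-content"

# A callable that takes (engine_slug, query) and returns a list of hits.
# It stands in for whatever client code actually talks to the search API.
SearchFn = Callable[[str, str], List[Dict]]


def engines_for(is_member: bool) -> List[str]:
    """Anonymous visitors only ever reach the public engine."""
    if is_member:
        return [PUBLIC_ENGINE, PREMIUM_ENGINE]
    return [PUBLIC_ENGINE]


def search_site(query: str, is_member: bool, search_engine: SearchFn) -> List[Dict]:
    """Query every engine the visitor is allowed to see and tag each hit
    with the engine it came from, so the results page can label premium hits."""
    results: List[Dict] = []
    for engine in engines_for(is_member):
        for hit in search_engine(engine, query):
            results.append({**hit, "engine": engine})
    return results
```

Because the engine list is computed on the server, a visitor who is not logged in cannot reach premium results by tampering with the search form.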

For searching paywalled and/or private content, we recommend making the search requests server-side rather than client-side. This protects your engine’s authentication parameters and keeps the request calls from being publicly exposed.
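A server-side search request might look roughly like the sketch below, using only the Python standard library. The base URL, the `search.json` path, the `auth_token` and `q` parameter names, and the `SWIFTYPE_API_KEY` environment variable are all assumptions made for illustration; please check them against the current API documentation before relying on them.

```python
import json
import os
import urllib.parse
import urllib.request

# Assumed base URL and path layout; verify against the API documentation.
API_BASE = "https://api.swiftype.com/api/v1"


def build_search_request(engine: str, query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST search request for one engine."""
    url = "{}/engines/{}/search.json".format(API_BASE, urllib.parse.quote(engine, safe=""))
    # Assumed parameter names: the key travels in the request body, which
    # only ever exists on the server.
    body = json.dumps({"auth_token": api_key, "q": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(engine: str, query: str) -> dict:
    """Send the request from the server and return the decoded JSON response."""
    # The key is read from the server environment and never sent to the browser.
    api_key = os.environ["SWIFTYPE_API_KEY"]
    request = build_search_request(engine, query, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

The browser would call your own endpoint (for example, a search route on your site), which checks the member's session and then calls `search` on their behalf.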

Private PDFs can be indexed in the same way. If you are unable to whitelist our Crawler, you would need to index the content using our Developer API instead.
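If you go the Developer API route, pushing one document (for example, text you have already extracted from a private PDF) could be sketched as follows. The endpoint path, the `external_id`/`fields` payload shape, the field types, and every engine, document type, and field name here are assumptions for illustration only; the PDF text extraction step is left out entirely.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict

# Assumed base URL; verify against the Developer API documentation.
API_BASE = "https://api.swiftype.com/api/v1"


def build_document(external_id: str, title: str, url: str, body: str) -> Dict:
    """Describe one document. Field names and types are illustrative assumptions."""
    return {
        "external_id": external_id,
        "fields": [
            {"name": "title", "value": title, "type": "string"},
            {"name": "url", "value": url, "type": "enum"},
            {"name": "body", "value": body, "type": "text"},
        ],
    }


def build_index_request(engine: str, document_type: str, document: Dict,
                        api_key: str) -> urllib.request.Request:
    """Build (but do not send) a create-or-update request for one document."""
    # Assumed path layout for the create-or-update endpoint.
    path = "/engines/{}/document_types/{}/documents/create_or_update.json".format(
        urllib.parse.quote(engine, safe=""),
        urllib.parse.quote(document_type, safe=""),
    )
    payload = json.dumps({"auth_token": api_key, "document": document}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(...)` from your own server means the private content never has to be reachable by the Crawler.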

Hope that helps!
