Further control over crawler, sections field

crawler

#1

Hello,

Is there a way to have more site-wide control over how the crawler decides what information gets assigned to certain fields? Specifically, the docs mention that the crawler populates the sections field by grabbing the h1-h6 tags. Is there a way we can tell the crawler to grab only the first instance of an h1 tag on each page (we know it's standard to only have one, but our site is legacy and we are actively working to clean this up) and populate that content into a custom field?

Let me know. Thanks.


#2

It is possible to use our body-embedded metatag syntax to index that content into a custom field: https://swiftype.com/documentation/meta_tags2#embedded

Example:

<h1 data-swiftype-name="first-h1" data-swiftype-type="string">H1 to rule them all. </h1>

This approach requires the ability to make template-level changes. If you’re on a legacy system that doesn’t let you add custom attributes to a particular instance of an h1 or other element, this might not be viable.
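If the templates themselves are hard to change but the rendered HTML passes through a build or publishing step, one possible workaround is to add the attributes there. The following is only a minimal sketch under that assumption: `tag_first_h1` and its `field_name` parameter are hypothetical names, not part of Swiftype, and a regex is a fragile way to edit HTML, so treat it as a starting point rather than a finished tool.

```python
import re

# Hypothetical helper, not part of Swiftype: assumes rendered HTML can be
# post-processed before it is published and crawled.
_H1_OPEN = re.compile(r"<h1\b([^>]*)>", re.IGNORECASE)


def tag_first_h1(html: str, field_name: str = "first-h1") -> str:
    """Add the embedded meta tag attributes to the first <h1> only."""

    def add_attrs(match: re.Match) -> str:
        existing = match.group(1)
        if "data-swiftype-name" in existing:
            # Already tagged by the template; leave it untouched.
            return match.group(0)
        return (
            f'<h1{existing} data-swiftype-name="{field_name}" '
            f'data-swiftype-type="string">'
        )

    # count=1 limits the substitution to the first opening <h1> tag.
    return _H1_OPEN.sub(add_attrs, html, count=1)
```

Calling `tag_first_h1(page_html)` on a page with several h1 elements changes only the first opening tag, so later h1 elements keep their default handling.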

If that’s the case, hit us up via support and we can discuss another potential option.


#3

Our h1-h6 headings are captured in the sections field as well as in the body field. Is there any way to have that content written only to sections and not to any other field, such as body? We are trying to limit repeated content on the search results page and to simplify translation of these strings (English to German). We want to avoid inaccurate translations caused by perceived context when h1-h6 headings appear inline with narrative content.

Hope that makes sense.

Thank you.