K13 spurious content in search highlight

Aaron Macdonald asked on January 11, 2023 09:34

Hello

We have a basic K13 Azure Cognitive Search implementation with a custom ISearchCrawlerContentProcessor that restricts which page builder content is indexed on included pages.

There is a consistent problem when a search executes: the first highlight value in the sys_content field begins with spurious content ahead of the expected ISearchCrawlerContentProcessor output.

The returned highlight value has the following pattern:

My page title     My-page-title My Page title [expected ISearchCrawlerContentProcessor output here]

We have confirmed that the ISearchCrawlerContentProcessor output does not include this spurious content.

The spurious content looks to be titles and a page alias, but we're unable to identify what's causing it to be indexed.

We have checked search field configuration and that does not seem to be the cause.

We'd appreciate it if anyone can offer advice on the problem.

Correct Answer

Arjan van Hugten answered on January 11, 2023 12:43

I think the 'content' checkbox is checked for those fields, which causes them to be added to the 'sys_content' field.

These could be the default Kentico search index settings. You can check them under 'modules', in the 'pages' module, for the 'page' class.
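
To illustrate why the highlight starts with titles and an alias, here is a rough conceptual model in Python. It is not Kentico code and does not use any Kentico API: FieldSetting, build_sys_content and the field names are hypothetical, and the assumption that content-flagged field values are joined ahead of the crawler output is inferred only from the highlight pattern reported in the question.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FieldSetting:
    """Hypothetical stand-in for one row of a class's search field settings."""
    name: str
    value: str
    content: bool  # models the 'content' checkbox for this field


def build_sys_content(fields: list[FieldSetting], crawler_output: str) -> str:
    """Assumed behaviour: values of content-flagged fields precede the crawler output."""
    parts = [f.value for f in fields if f.content]
    parts.append(crawler_output)
    return " ".join(parts)


# Hypothetical field names; the values mirror the pattern from the question.
fields = [
    FieldSetting("title_field", "My page title", content=True),
    FieldSetting("alias_field", "My-page-title", content=True),
    FieldSetting("page_title_field", "My Page title", content=True),
]

# With 'content' checked, the field values end up in front of the processor output.
print(build_sys_content(fields, "[processor output]"))

# With 'content' unchecked, only the processor output remains.
for f in fields:
    f.content = False
print(build_sys_content(fields, "[processor output]"))
```

If the model holds, unchecking 'content' for those fields and rebuilding the index should leave only the ISearchCrawlerContentProcessor output in sys_content.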



Recent Answers


Aaron Macdonald answered on January 12, 2023 02:17

Many thanks, Arjan, that has solved the problem.

We suspected it was a case of default 'content' field indexing but weren't sure how or where that was controlled.

CC Kentico: this should be in the search documentation.
