Is it possible to control which web part properties are indexed?

Tom Troughton asked on May 13, 2016 13:52

It looks as though Kentico's Pages index type automatically indexes the content of all text fields for web parts added to an indexed page. In some cases text fields are used to store data that isn't content (for example a Uni Selector configured to return a list of node GUIDs). I'm just wondering if there is any way to exclude certain web part fields from being indexed in this way?

By the time the data hits the DocumentEvents.GetContent.Execute event it's just one long string so it's not possible to exclude specific fields.
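One possible workaround at that stage is to strip tokens with a recognizable shape, such as GUIDs, from the combined string. This is sketched in Python purely for illustration; a real handler would be .NET code. The function name `strip_guids`, the separator characters and the sample text are all hypothetical, and this only helps when the unwanted values follow a detectable pattern.

```python
import re

# Matches GUIDs in the common 8-4-4-4-12 hexadecimal form, plus one optional
# trailing list separator (assumed here to be ';', ',' or '|').
GUID_PATTERN = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b[;,|]?"
)


def strip_guids(content: str) -> str:
    """Remove GUID-like tokens and collapse the whitespace left behind."""
    without_guids = GUID_PATTERN.sub(" ", content)
    return " ".join(without_guids.split())


# Hypothetical content string as it might look once everything is concatenated.
sample = (
    "Welcome 3f2504e0-4f89-11d3-9a0c-0305e82c3301;"
    "9b2e1c7a-0d4f-4c1e-8a55-1234567890ab to our site"
)
cleaned = strip_guids(sample)
```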

Recent Answers


Chetan Sharma answered on May 13, 2016 14:00

Go to the particular page type and, in the panel on the left-hand side, go to Search fields. You will find all the page type fields there, each with three properties: Index, Content, Tokenized.

By checking and unchecking the various check boxes you can control what you want to index.

By the way, these are not web part fields but fields associated with the page type that you are indexing.

You can exclude a field within a page type by unchecking its check boxes.

Screen shot attached

Tom Troughton answered on May 13, 2016 14:31

Thanks but read again. I'm talking about web part properties.

Brenden Kehren answered on May 13, 2016 14:44 (last edited on May 13, 2016 14:57)

Webpart properties are not indexed because webparts are not content. Webparts are responsible for displaying content, which comes from page types or document content (editable text, editable images, etc.).

If you want to exclude specific fields, you do that at the page type level. So go to the specific page type you're working with (events, articles, etc.) and click on the Search fields tab. There you can remove the Content or Searchable flag for a field, and it will no longer show up in the search index.

Chetan Sharma answered on May 13, 2016 14:49

Hi Nat, Are you sure you are talking about Smart Search for Web parts and not page types?

Are you saying that Kentico, via Lucene indexing, indexes the data of all text fields for web parts added to an indexed page?

Just want to understand what you mean by "automatically indexes the content of all text fields for web parts added to an indexed page".

Indexing is associated with a page, which is a child of some page type. Web parts are added to a template.

So I'm a bit confused by "text fields for web parts". What are these? Can you cite an example here?

Thanks, Chetan

Tom Troughton answered on May 13, 2016 14:54

Hi Brenden, that's what I understood to be the case. But in my testing I've created a custom CMSAbstractWebPart web part with a long text field associated with a rich text form control. I've added this to a page which is part of an index. When I debug an event handler for DocumentEvents.GetContent.Execute I see that the indexed content contains the text from my custom web part. If I examine the generated index using Luke I can also see this content.

So clearly web part properties are indexed. Which is good because I'd argue against your assertion that web parts are not content. They aren't necessarily content, but take an accordion layout for example. The panel titles are obviously content and therefore should be included in any index.

Brenden Kehren answered on May 13, 2016 15:11 (last edited on May 13, 2016 15:12)

Technically speaking, webpart properties are NOT indexed. The content from that webpart field is stored at the document level, not in the webpart. So it is the document's content that is indexed. This is why I rarely use editable text, static text, etc. webparts. You have little to no control over what is indexed with them. Use a page type for this type of scenario.

To your example of the accordion webpart: again, these fields (accordion panes) store the content in the document, not in the webpart. Take a look at the actual database table. You'll find that the 'cms_document.documentcontent' field holds a reference to the webpart ID and the value which was entered. Again, even though it looks like the webpart is storing the content, it is not; the document stores the content and uses the webpart ID to populate the webpart, both when editing and when displaying on the live site.
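Because each value is stored against a webpart ID, a custom index could in principle choose which entries contribute to the indexed text. The following is a minimal Python sketch of that idea, not Kentico code. The XML shape (`<content>` wrapping `<webpart id="...">` elements), the IDs and the function name `build_index_text` are assumptions made for illustration; check the real column contents before relying on them.

```python
import xml.etree.ElementTree as ET


def build_index_text(document_content_xml: str, excluded_ids: set[str]) -> str:
    """Join the stored values, skipping entries whose webpart ID is excluded."""
    root = ET.fromstring(document_content_xml)
    parts = []
    for element in root.iter("webpart"):
        webpart_id = element.get("id", "").lower()
        if webpart_id in excluded_ids:
            continue  # e.g. a selector field that only holds node GUIDs
        text = (element.text or "").strip()
        if text:
            parts.append(text)
    return " ".join(parts)


# Hypothetical stored content: one real title and one GUID-only selector value.
sample_xml = (
    "<content>"
    '<webpart id="accordion;panetitle"><![CDATA[Opening hours]]></webpart>'
    '<webpart id="relatedpages;nodeguids">'
    "<![CDATA[3f2504e0-4f89-11d3-9a0c-0305e82c3301]]></webpart>"
    "</content>"
)
index_text = build_index_text(sample_xml, {"relatedpages;nodeguids"})
```

An explicit exclusion list keeps the default behaviour (index everything) intact, so only fields known to hold non-content data are dropped.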

Also, if you're using a Page Crawler search index, it is much harder to filter out content, because the crawler indexes all the rendered content whether or not it is hidden.

Tom Troughton answered on May 13, 2016 16:33

@Brenden Thanks for bearing with me. I do understand how documents store content. So I think perhaps I'm not being clear. When I say 'web part properties' I'm talking about the content entered into those properties when used on a page. So I'm talking about document content entered in web parts. So going back to my original question, I was wondering if it was possible to restrict which of those properties would cause their values to be indexed when contributing to document content.

@Chetan Hopefully this also answers your question. I'm talking about document content added by web part rather than as a page type field. I'm not talking about indexing web parts themselves, I'm talking about getting more control over how the Lucene _content field is populated in a Pages index.

Chetan Sharma answered on May 13, 2016 20:03 (last edited on May 13, 2016 20:03)

Hi Nat,

As I understand it, Kentico doesn't index webpart field properties. Any property added through a Static HTML, Editable Text, Static Text or Editable Image webpart is not indexed for search purposes. That is why I prefer to keep my content in page types and avoid these. I am 100% sure that Kentico does not index them. Please send me your index if possible; I can also analyze it using Luke to see what you are describing. What type of index are you creating?

Now, coming back to your question: the answer is no if you are creating an index of either page types or custom tables. I am not sure what Kentico does for Page Crawler indexes; I have not worked with them. The only way to have more control over the _content field is to create your own custom index, which gives you more granular control. Let me know if you need any starter code for it.

Thanks, Chetan

Chetan Sharma answered on May 13, 2016 20:07

BTW, somebody else has asked for a feature like this. However, this kind of control is not available as of now.

Kentico Product ideas

Brenden Kehren answered on May 13, 2016 21:05

No, you're clear, Nat. I'm stating the facts, and it sounds as though we are both on the same page. The problem is that those properties store "content" in the document and you can't really control that using OOTB tools, especially if you're using a Page Crawler index. You have the most control by using a page type index or creating your own custom search index.

Sven Schaetzl answered on June 7, 2016 10:58

Hello everybody,

we have to use a Page Crawler index and are really missing the possibility to weight the information stored in the index. Please provide a blog post or similar about how to get started with a custom search index in Kentico (e.g. where I could decide that words in the URL, h1, h2, etc. have more weight than other random text on a page).
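To make the kind of weighting meant here concrete, this is a rough Python sketch, not Kentico or Lucene code. It splits a rendered page into URL, h1, h2 and body terms and scores a search term with a per-field weight. The weights, class name and function name are hypothetical; in a Lucene-based index the comparable mechanism would normally be per-field boosts.

```python
import re
from html.parser import HTMLParser

# Hypothetical weights: matches in the URL and headings count more than body text.
FIELD_WEIGHTS = {"url": 3.0, "h1": 2.5, "h2": 1.5, "body": 1.0}


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FieldExtractor(HTMLParser):
    """Collects terms into 'h1', 'h2' or 'body' depending on the open tag."""

    def __init__(self):
        super().__init__()
        self.fields = {"h1": [], "h2": [], "body": []}
        self._current = "body"

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag in ("h1", "h2"):
            self._current = "body"

    def handle_data(self, data):
        self.fields[self._current].extend(tokenize(data))


def score(term: str, url: str, html: str) -> float:
    """Sum the weighted number of occurrences of the term across all fields."""
    extractor = FieldExtractor()
    extractor.feed(html)
    extractor.close()
    fields = dict(extractor.fields)
    fields["url"] = tokenize(url)
    term = term.lower()
    return sum(FIELD_WEIGHTS[name] * terms.count(term) for name, terms in fields.items())


page = "<h1>Search tips</h1><p>Use the search box.</p>"
search_score = score("search", "/help/search-tips", page)
```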

Thanks, Sven
