Indexing of page builder widgets

Nicolas Huguet-Latour asked on November 7, 2019 15:39

Hi,

We are starting to work on a brand new website for one of our clients and I would like to know if there is any (semi)legitimate way we could index page builder content. The fact that this very user-friendly feature is incompatible with content search is extremely disappointing for us.

From what I can see, page builder content is stored in the DocumentPageBuilderWidgets column of the CMS_Document table and can be fetched through the DocumentHelper class. This returns a JSON document that we could parse for content. The only thing stopping me from doing this is that it seems extremely fragile and could be broken with any Kentico update.
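For illustration, the parsing idea above could be sketched roughly as follows. This is a minimal, hypothetical example in Python, not Kentico's API: it assumes the JSON has already been read from DocumentPageBuilderWidgets, and that widget values sit under keys named "properties", which is an assumption about the format rather than a documented contract. The function names and the sample document are invented for the sketch.

```python
import html
import json
import re
from typing import Any, Iterator

# Crude tag matcher; enough to reduce rich-text markup to plain text.
TAG_RE = re.compile(r"<[^>]+>")


def iter_property_strings(node: Any, inside_properties: bool = False) -> Iterator[str]:
    """Yield every string value found anywhere beneath a "properties" key.

    The walk is deliberately generic: it does not rely on the exact nesting
    of areas, sections, zones, and widgets, only on the (assumed) key name.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_property_strings(
                value, inside_properties or key == "properties"
            )
    elif isinstance(node, list):
        for item in node:
            yield from iter_property_strings(item, inside_properties)
    elif isinstance(node, str) and inside_properties:
        yield node


def extract_widget_text(raw_json: str) -> str:
    """Return the searchable plain text of one page builder JSON document."""
    pieces = []
    for value in iter_property_strings(json.loads(raw_json)):
        # Replace tags with spaces, decode entities, collapse whitespace.
        text = " ".join(html.unescape(TAG_RE.sub(" ", value)).split())
        if text:
            pieces.append(text)
    return " ".join(pieces)


if __name__ == "__main__":
    # Hypothetical sample; real documents may be shaped differently.
    sample = json.dumps({
        "editableAreas": [{
            "identifier": "main",
            "sections": [{
                "zones": [{
                    "widgets": [{
                        "type": "Acme.TextWidget",
                        "variants": [{
                            "properties": {"text": "<p>Hello <b>world</b></p>"}
                        }],
                    }]
                }]
            }],
        }]
    })
    print(extract_widget_text(sample))
```

Because the walk keys on a single property name instead of the full nesting, it would survive some structural changes, but it also picks up non-content strings (identifiers or paths stored as widget properties), so the fragility concern still applies.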

Are there any plans in the works to give developers a legitimate way to index content added through page builder widgets?

Thanks.

Recent Answers


Dmitry Bastron answered on November 8, 2019 10:13

Hi Nicolas,

A Pages crawler index includes the rendered HTML output of the page, so you can try using that. The only problem is that it works with local indexes only; you can't create one in Azure.


Juraj Ondrus answered on November 8, 2019 10:40

I was thinking about the crawler too, but there is this note in the documentation: "We do not recommend using crawler indexes on MVC content-only sites. The crawler only selects pages from the site's content tree in Kentico, which may not match the actual structure of the website (in many cases, content-only pages only store data and do not represent pages on the live site)."
So I would probably go with a custom index and build a custom crawler service that gets the data from the front-end (live site) side.
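As a rough illustration of the text-extraction half of such a crawler service, here is a minimal Python sketch using only the standard library. It is not Kentico code; the class and function names are hypothetical, the list of skipped tags is an assumption about a typical layout, and deciding which URLs to crawl and pushing the text into a custom or Azure Search index are left out.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, ignoring scripts, styles, and page chrome."""

    # Assumed set of non-content elements; adjust to the site's markup.
    SKIPPED = {"script", "style", "nav", "header", "footer"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text outside skipped elements, with whitespace collapsed.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(" ".join(data.split()))

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(markup: str) -> str:
    """Return the visible text of an HTML document as a single string."""
    parser = VisibleTextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


def fetch_page_text(url: str, timeout: float = 10.0) -> str:
    """Download a live-site page and return its visible text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))


if __name__ == "__main__":
    page = (
        "<html><head><style>p{color:red}</style></head><body>"
        "<nav>Menu</nav><main><h1>Title</h1><p>Some <b>bold</b> text.</p></main>"
        "<script>var x = 1;</script></body></html>"
    )
    print(html_to_text(page))
```

Since this reads the rendered output, widget content is captured regardless of how the page builder stores it; the cost is that the crawler has to be rerun whenever pages change.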


Dmitry Bastron answered on November 8, 2019 11:40

Juraj, thanks for your comment, but this limitation can be worked around quite easily. There are a few points to consider:

  • You can still restrict the index to relevant searchable content through its "Indexed content" setting
  • In this setting you can select only those page types that are actual "pages" (for example, those with "Use Page Tab" enabled) and specify the root path ("/%" for all pages)

To be honest, I can't think of any case where some items of a page type would be pages while others of the same type would not.


Nicolas Huguet-Latour answered on November 8, 2019 16:28

Hi Dmitry and Juraj,

Thanks for your answers. Unfortunately, we do need to use Azure Search for its faceting and suggestion features... At this point, I understand that my "options" are:

  1. Use the Page Builder only where the widgets don't contain searchable content (images and the like).
  2. Implement my own crawler and indexing service.
  3. Hack something from the JSON content in DocumentPageBuilderWidgets.

Option 3 is out since it's too fragile a solution. Option 2 is too much work and would require way too much tweaking. I'm going to have to work with the design team to try to work around this problem.

Thanks anyway!

