Smart search: Issues with "-" in some queries

Deb Apprille asked on June 3, 2019 19:19

Settings:

Smart search type = Local index

Smart search analyzer type = Subset

Smart search box: search options = None


Summary:

Some of our products' model numbers have hyphens. We are having issues returning some of them in smart search. Below are several different search results (with pics) so you can see what happens with different searches. YES means the search works as expected. NO means it does not.

  1. ZH 32 YES.
  2. ZH-32 NO.
  3. IEC 61000-4-2 YES.
  4. 61000-4-2 NO.
  5. Sub-Zero YES.
  6. PT42-5KW YES.

The only patterns I can find here are:

  1. Search queries with only numbers (and a hyphen) do not work.
  2. Search queries with a hyphen as the third character do not work.

Feel free to play around with the search -- here is the link.

Our site: www.atecorp.com

Can someone offer a suggestion on how to fix this issue? Or is this just a bug?

Thank you,

Deb A.

Correct Answer

Brenden Kehren answered on June 4, 2019 16:30

Did you rebuild your search index after each change?

Also, are the fields storing the data you're searching set as searchable content fields in the page type?

Lastly, the analyzer type does make a big difference because some types don't handle symbols or special characters very well, if at all. I suggest using the Lucene tool called Luke to see what is in your index and what's being returned on your queries. Not gonna lie, it's not an easy tool to use but it will help. Shameless plug on a blog post I wrote regarding Luke and troubleshooting dates in Smart Search.


Recent Answers


Brenden Kehren answered on June 4, 2019 03:26

I believe you need to change the analyzer type to "Simple / Stop words / White space with stemming" because subset won't work properly.

https://docs.kentico.com/k11/configuring-kentico/setting-up-search-on-your-website/using-locally-stored-search-indexes/creating-local-search-indexes#Creatinglocalsearchindexes-Reference-Searchindexproperties


Deb Apprille answered on June 4, 2019 15:35 (last edited on June 4, 2019 16:04)

Hi Brenden, thanks for your reply. I just tried those analyzer types and none of them work. I am now going through and trying all of the other types because I need a solution. I am doubtful any of them will work any better.

Another weird thing I noticed: if I search for ZH-2 I get results, but if I search for ZH-32 I get nothing.


Deb Apprille answered on June 4, 2019 16:22

Update: None of the analyzer types worked. I am very puzzled why some search queries are returned without issue (ZH-2) and others don't match anything (ZH-32).


Deb Apprille answered on June 4, 2019 16:57 (last edited on June 4, 2019 16:58)

Hi Brenden, thanks for your reply (again!). To answer your questions: yes, I rebuilt the indexes after each change. Yes, the fields are searchable and work perfectly 99% of the time, except for the cases I mentioned.

I appreciate your telling me about Luke, I will definitely check it out. In the meantime, I just learned something after some more experimentation. The reason ZH-32 was not coming back is that it was not ever mentioned in isolation on the page (it was always referred to as ZH-32-2-2). Adding a space so it became ZH-32 -2-2 fixed the issue, and now it is returning the proper result.

The analyzer type I am using, Subset, does support partial matching, and I am not sure why the hyphen is breaking the partial match. A simple experiment is a manufacturer name, Sub-Zero. If you search sub you get a partial match. But if you search sub- or sub-z you get nothing. Search the full string sub-zero and it works fine again.
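
A toy model may make the ZH-32 observation concrete. The sketch below is plain Python, not Kentico or Lucene code. It assumes, purely for illustration, an index that splits text on whitespace only and matches whole terms exactly. The names `tokenize` and `matches` and the sample strings are hypothetical.

```python
def tokenize(text):
    """Toy tokenizer: lowercase, split on whitespace only (an assumption)."""
    return text.lower().split()


def matches(query, page_text):
    """True if every query token appears as a whole token in the page."""
    page_tokens = set(tokenize(page_text))
    return all(tok in page_tokens for tok in tokenize(query))


before = "Model ZH-32-2-2"   # hyphenated string stays one token: 'zh-32-2-2'
after = "Model ZH-32 -2-2"   # space added, as described in the post above

print(matches("ZH-32", before))  # False
print(matches("ZH-32", after))   # True
```

Under this assumption, "ZH-32" only matches once it exists in the page text as a separate token. The real analyzer may tokenize differently, so inspecting the stored terms with Luke remains the way to confirm what the index actually contains.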


Brenden Kehren answered on June 4, 2019 17:33

Again, it's all based on your analyzer. Even though you're seeing the results you expect quite a bit of the time, the fact that you can't see "sub-" or "sub-z" tells me the analyzer type is failing due to not matching the WHOLE string AND there is an issue with the special character.

Play around with Luke and test out what content is in your actual index. You'll only waste more time fooling around especially if you can't see what data is in your index.


Brenden Kehren answered on June 5, 2019 15:39

What was the issue you found Deb?


Deb Apprille answered on June 5, 2019 15:44

The issue is that Kentico treats a hyphen the same as a space. There is nothing I can do, short of a custom analyzer, to correct this. Thanks for following up!
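
To illustrate what "treats a hyphen the same as a space" would imply, here is a minimal Python sketch. It is a toy model under stated assumptions (hyphens become token boundaries, every query token must match a whole indexed token), not the actual Kentico or Lucene analyzer, and it does not attempt to reproduce every result reported in this thread. The function names are hypothetical.

```python
import re


def tokenize(text):
    """Toy tokenizer: lowercase, then treat hyphens like spaces (an assumption)."""
    return re.sub(r"-+", " ", text.lower()).split()


def matches(query, page_text):
    """True if every query token is a whole token of the page (toy AND search)."""
    page_tokens = set(tokenize(page_text))
    query_tokens = tokenize(query)
    return bool(query_tokens) and all(tok in page_tokens for tok in query_tokens)


page = "Sub-Zero"
print(tokenize(page))               # ['sub', 'zero']
print(matches("sub-zero", page))    # True
print(matches("sub-z", page))       # False: 'z' is not a whole indexed token
```

In this model the hyphen never reaches the index, so a query fragment such as "sub-z" is really a search for the two separate terms "sub" and "z".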


Brenden Kehren answered on June 5, 2019 17:09

I'd challenge that answer a bit more, for two reasons:

  • Kentico's search is built off of Lucene, so it's a "limitation" of Lucene, not Kentico (just to be clear).
  • I don't believe this is a limitation at all, mainly because you can search any kind of text; how it's analyzed depends on the analyzer type. I have a site which uses "$" as a filtering mechanism on page data using a Smart Search index. I'm using the Standard analyzer and the field is set up as "searchable", "content" and "tokenized". The filter wraps the value in double quotes, so when it performs a filter (which is very similar to what you're doing), the value $ is sent as "$".

I'm not saying you're wrong, but I would NOT create a custom analyzer because of this. If you haven't, do check what value is stored in your search index and try different queries using the Luke tool. It exposes all your indexed data, gives you a raw look at what is stored, and lets you build queries to get the right data.
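
The quoting approach described above can be sketched as follows. This is a minimal Python illustration of building a Lucene-style quoted phrase, not the code from the site mentioned in the post. The helper names and the field name `ModelNumber` are hypothetical. In Lucene query syntax a bare `-` acts as an operator, which is one reason to wrap such values in double quotes.

```python
def quote_phrase(value):
    """Wrap a value in double quotes for a Lucene-style phrase query.

    Backslashes and embedded double quotes are escaped with a backslash so
    the value cannot end the phrase early.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def field_filter(field, value):
    """Build a field:"value" clause; the field name is a placeholder."""
    return f"{field}:{quote_phrase(value)}"


print(quote_phrase("$"))                     # "$"
print(field_filter("ModelNumber", "ZH-32"))  # ModelNumber:"ZH-32"
```

Whether the quoted value then matches still depends on how the analyzer tokenized the field when the index was built, so this complements checking the index in Luke rather than replacing it.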


Deb Apprille answered on June 5, 2019 17:57

Hi Brenden,

The "filtering mechanism" (as you call it) definitely sounds like a good solution. I am curious if there is a way to set it from within the Smart Search node in Kentico or if it has to be built inside the project. I don't have access to the .NET project but I do have a dev colleague that I could recruit. I'm just not sure the extra effort is worthwhile when I could just edit the text on a page to match the search query on an as-needed basis.

I found this article on custom indexes ... is this what you are talking about? If not, can you point me to a link on Kentico's website that outlines what you are suggesting?

Thanks again for your continued help!

-Deb

