Version 7.x > API
Certified Developer 8
nrinat-ecentricarts - 5/8/2013 1:26:04 PM
   
Searching a smart search custom index returns lowercase custom content
Hello

I have a custom search index that searches PDF files and stores the keywords in the index.
This is how the custom content (for search results excerpt) is saved:
SearchHelper.AddField(doc, SearchHelper.CUSTOM_CONTENT, TextHelper.LimitLength(fileContent, 200), true, false);

My problem is with searching the index. The results are always in lowercase. Both the name of the document and the custom content are returned in lowercase even though they are not in lowercase when processed by the index.

I hope you can advise! Thank you!

Norm

Kentico Consulting
Kentico_RichardS - 5/9/2013 1:20:13 AM
Hi,

Have you created your search index through this tutorial?

If so, it contains this line:
text = text.ToLower();

Could you please comment it out or remove it?

Does this help?

Kind regards,
Richard Sustek

Certified Developer 8
nrinat-ecentricarts - 5/9/2013 8:08:36 AM
Thank you for the quick answer!

I do not use the tutorial index; I use my own. These are the steps I take in the index's Rebuild event:

I create a MediaFileInfo object with the information returned from the database and extract the content:
MediaFileInfo mfi = new MediaFileInfo(dr);
string fileContent = MediaSearchHelper.GetContent(mfi);

I create a Lucene.NET document:
Document doc = SearchHelper.CreateDocument(SearchHelper.CUSTOM_SEARCH_INDEX, mfi.FileID.ToString(), SearchHelper.INVARIANT_FIELD_VALUE, mfi.FileCreatedWhen, SearchHelper.INVARIANT_FIELD_VALUE);

I prepare the content field:
StringBuilder sb = new StringBuilder();
sb.Append(mfi.FileTitle);
sb.Append(" ");
sb.Append(mfi.FileDescription);
sb.Append(" ");
sb.Append(mfi.FileCustomData);
sb.Append(" ");
sb.Append(mfi.FileExtension);
sb.Append(" ");
sb.Append(mfi.FileMimeType);
sb.Append(" ");
sb.Append(mfi.FileName);
sb.Append(" ");
sb.Append(fileContent);

Then I add it to the document (the text is not in lowercase at this stage):
SearchHelper.AddField(doc, SearchHelper.CONTENT_FIELD, SearchHelper.HtmlToPlainText(sb.ToString()), false, true);

Finally I add the document to the index writer:
iw.AddDocument(doc);

When I export the index to XML using Luke I can see the information is stored in lowercase:
<doc id='1'>
<field name='_created' norm='1.0' flags='I-S--------'>
<val>20130508162047</val>
</field>
<field name='_culture' norm='1.0' flags='I-S--------'>
<val>invariantifieldivaluei</val>
</field>
<field name='_customcointent' norm='1.0' flags='I-S--------'>
<val>what should i use as a first aid symbol? international and canadian organizations responsible for standards recommend a white cross on a green background to identify a first aid kit or supplies or ...</val>
</field>
<field name='_customdate' norm='1.0' flags='I-S--------'>
<val>20130508162047</val>
</field>
<field name='_customtitle' norm='1.0' flags='I-S--------'>
<val>emblem title</val>
</field>
<field name='_customurl' norm='1.0' flags='I-S--------'>
<val>~/crc/mynewlibrary/emblem.pdf</val>
</field>
<field name='_id' norm='1.0' flags='I-S--------'>
<val>1196</val>
</field>
<field name='_myfiletype' norm='1.0' flags='I-S--------'>
<val>pdf</val>
</field>
<field name='_myid' norm='1.0' flags='I-S--------'>
<val>1196</val>
</field>
<field name='_site' norm='1.0' flags='I-S--------'>
<val>invariantifieldivaluei</val>
</field>
<field name='_type' norm='1.0' flags='I-S--------'>
<val>CUSTOM_SEARCH_INDEX</val>
</field>
</doc>

I would appreciate any help! Thank you!

Norm

Kentico Consulting
Kentico_RichardS - 5/10/2013 1:42:05 AM
Hi,

Unfortunately, I'm not familiar with Lucene.NET and its implementation, so we are not able to help you with this.

However, I have to ask: if you want to search through files, why not use our document types and their indexing and searching capabilities?

Kind regards,
Richard Sustek

Certified Developer 9
niels - 6/13/2013 8:08:39 AM
Hi,

I implemented a custom search index a few months ago and ran into the same problem.

The SearchHelper.AddField() method applies ToLowerCSafe() to the string it writes to the index.
The solution is to add the field to the Lucene document yourself; see the example below.
using Lucene.Net.Index;
using Lucene.Net.Documents;

// Helper method: creates the Lucene field directly instead of going through
// SearchHelper.AddField(), so the value keeps its original casing
public static void AddField(Document document, string name, object value, Field.Store store, Field.Index index)
{
    if (value != null)
    {
        // Value2String is not shown in this post: convert any date or number
        // fields here, so you can read them back
        string strValue = Value2String(value);
        Field field = new Field(name, strValue, store, index);
        document.Add(field);
    }
}

// Lucene Document
Document document = SearchHelper.CreateDocument(...);
// Add your field
AddField(document, "tags", documentIndex.Tags, Field.Store.YES, Field.Index.TOKENIZED);
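
The Value2String helper called above is not included in the post. A minimal sketch of what it might look like follows, assuming dates should be written as sortable yyyyMMddHHmmss strings (the layout the _created and _customdate values in the Luke export appear to follow) and numbers should be formatted culture-independently. The class name IndexValueConverter and the String2Date companion method are hypothetical and used only for illustration; they are not part of Kentico or Lucene.NET.

```csharp
using System;
using System.Globalization;

public static class IndexValueConverter
{
    // Assumed timestamp layout; it sorts correctly as plain text.
    private const string DateFormat = "yyyyMMddHHmmss";

    // Hypothetical stand-in for the Value2String helper from the post.
    public static string Value2String(object value)
    {
        if (value == null)
        {
            return null;
        }

        if (value is DateTime)
        {
            return ((DateTime)value).ToString(DateFormat, CultureInfo.InvariantCulture);
        }

        // Numbers and other IFormattable types: avoid culture-specific separators.
        IFormattable formattable = value as IFormattable;
        if (formattable != null)
        {
            return formattable.ToString(null, CultureInfo.InvariantCulture);
        }

        // Strings pass through unchanged, so their original casing is kept.
        return value.ToString();
    }

    // Reads a date back from the string form produced by Value2String.
    public static DateTime String2Date(string stored)
    {
        return DateTime.ParseExact(stored, DateFormat, CultureInfo.InvariantCulture);
    }
}
```

With this sketch, Value2String(new DateTime(2013, 5, 8, 16, 20, 47)) returns "20130508162047", and a string such as "Emblem Title" is returned as-is, without lowercasing.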

Certified Developer 8
nrinat-ecentricarts - 9/18/2013 8:22:52 AM
Thank you! We have implemented this in a new project successfully.
I would advise Kentico to be aware of this problem and offer this solution as part of the API.