php - Solr compound word splitting - how to get more relevant results


I am struggling with Solr and how to deal with compound words for our German site. We mainly deal with clothing and accessories, so our search terms are usually words related to wearable items. I have managed to configure the DictionaryCompoundWordTokenFilter filter so that it mostly splits the compound search terms we might encounter (for example: Schwarzkleid => Schwarz kleid).

However, the search returns irrelevant results: it returns items that contain only the word "schwarz" and items that contain only the word "kleid". So instead of seeing only black dresses (schwarzkleid = black dress), I am seeing dresses of different colors and other black items.

Essentially, Solr returns matches for either of the split tokens, that is, any item that contains any of the keywords.

My full query is: q=keyword:schwarzkleid AND deleted:0 (where deleted:0 indicates that the product is not yet sold). The debug output for this query looks like this:

  "debug": {
    "rawquerystring": "keyword:schwarzkleid AND deleted:0",
    "querystring": "keyword:schwarzkleid AND deleted:0",
    "parsedquery": "+((keyword:schwarzkleid keyword:schwarz keyword:kleid)/no_coord) +deleted:`\b\u0000\u0000\u0000\u0000",
    "parsedquery_toString": "+(keyword:schwarzkleid keyword:schwarz keyword:kleid) +deleted:`\b\u0000\u0000\u0000\u0000",

This returns 24000+ results in total. If I query Solr directly for keyword:schwarz AND keyword:kleid, I get around 10,000 results, which is what I want. I'm using Solr 4.7 and a Solr PHP library to interact with my web application.
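For comparison, the stricter query mentioned above can be assembled by hand once the split tokens are known. The following is only a minimal sketch: the helper name `build_and_query` is made up for illustration, and the token list is written out manually rather than taken from Solr's analysis chain.

```python
def build_and_query(field, tokens, extra=None):
    """Join the split tokens with AND so that every token must match.

    `field`, `tokens` and `extra` are illustrative inputs, not part of any Solr API.
    """
    clauses = [f"{field}:{token}" for token in tokens]
    if extra:
        clauses.append(extra)
    return " AND ".join(clauses)


# Assumed result of splitting "schwarzkleid" (hard-coded here for the example).
tokens = ["schwarz", "kleid"]
query = build_and_query("keyword", tokens, extra="deleted:0")
print(query)  # keyword:schwarz AND keyword:kleid AND deleted:0
```

This only shows the shape of the query that gives the smaller result set. It does not make Solr apply AND semantics to the tokens the splitter produces at query time.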

How should I formulate my query so that I get only relevant results?

Here is the field type:

  <!-- German -->
  <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
    <analyzer type="index">
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/>
      <filter class="solr.GermanNormalizationFilterFactory"/>
      <filter class="org.apache.lucene.analysis.de.compounds.GermanCompoundSplitterTokenFilterFactory" compileDict="true" dataDir="/home/ali/download/solr-4.7.0/example/solr/findemode-dev/conf/wordlist/"/>
      <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
    <analyzer type="query">
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/>
      <filter class="solr.GermanNormalizationFilterFactory"/>
      <filter class="org.apache.lucene.analysis.de.compounds.GermanCompoundSplitterTokenFilterFactory" compileDict="false" dataDir="/home/ali/download/solr-4.7.0/example/solr/findemode-dev/conf/wordlist/"/>
      <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
  </fieldType>

I have managed to solve it (in a fairly hacky way) using filter queries and the edismax query parser.

I have added the following parameters to my solrconfig.xml:

  <str name="defType">edismax</str>
  <str name="mm">75%</str>
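To get a feel for what mm=75% asks for, here is a rough sketch of the arithmetic. It assumes the documented behavior that a positive percentage is rounded down to a whole number of clauses. The function name `min_should_match` is made up for illustration, and how many optional clauses Solr actually counts for a split compound depends on the parser and the analysis chain.

```python
def min_should_match(optional_clauses: int, percent: int) -> int:
    """Clauses that must match for a positive percentage mm (rounded down).

    Illustrative only: negative percentages and conditional mm specs
    are not handled here.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return (optional_clauses * percent) // 100


for n in (2, 3, 4):
    print(n, "optional clauses ->", min_should_match(n, 75), "must match")
```

Under this assumption, two of three optional clauses would have to match, which is stricter than the default behavior where any single clause is enough.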

Then, when searching for more than one keyword (for example: schwarzkleid wenz, where wenz is a German brand name), I use the first keyword as the query and add everything after it as a filter query. So my final query looks something like this:

  fl=id&sort=popular+desc&indent=on&q=keyword:'schwarzkleid'+&wt=json&fq={…

The compound splitter filter splits the first keyword correctly, and it is parsed by edismax with mm=75%. The terms added to the filter query are also parsed as edismax against the keyword field. The returned results are all black dresses from 'wenz'.
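The split between query and filter query could be sketched roughly as follows. This is an assumption-heavy illustration, not the exact request used above: the helper `build_params` is hypothetical, the `{!edismax}` local-params prefix on the filter query is a guess at how the filter was parsed as edismax, and defType and mm are assumed to come from solrconfig.xml rather than from the request.

```python
from urllib.parse import urlencode


def build_params(keywords, field="keyword"):
    """First keyword becomes q, every further keyword becomes its own fq.

    Hypothetical helper; the field name and the {!edismax} prefix are assumptions.
    """
    if not keywords:
        raise ValueError("at least one keyword is required")
    first, rest = keywords[0], keywords[1:]
    params = [
        ("q", f"{field}:{first}"),
        ("fl", "id"),
        ("sort", "popular desc"),
        ("wt", "json"),
        ("indent", "on"),
    ]
    for word in rest:
        # Each filter query restricts the result set without affecting scoring.
        params.append(("fq", f"{{!edismax}}{field}:{word}"))
    return params


params = build_params(["schwarzkleid", "wenz"])
print(urlencode(params))  # URL-encoded query string to append to the select URL
```

Keeping the brand in a filter query means it narrows the results without influencing relevance scoring, which is one reason this arrangement can work even though it feels like a workaround.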

If anyone has a better solution, I would be more than happy to read it, because I am fairly new to Solr and, to be honest, I think my approach is a bit complicated.

Thank you.
