java - Updating Apache Lucene indexed files


I am using the Apache Lucene library to build a search feature for my website. The website gets all of its content from SharePoint RSS feeds, so each time I go through the RSS feed URLs and read the content. To make the search faster, I have created a scheduled task that indexes every hour:

    <bean id="rssIndexerService" class="com.lloydsbanking.webmi.service.RSSIndexService" />

    <task:scheduled-tasks>
        <task:scheduled ref="rssIndexerService" method="indexUrls" cron="0 0 * * * MON-FRI" />
    </task:scheduled-tasks>
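(The same schedule could also be expressed with annotations instead of XML, assuming annotation-driven scheduling is enabled with <task:annotation-driven/> or @EnableScheduling; the wrapper method below is just an illustrative sketch that delegates to the index() method shown further down.)

    import java.io.IOException;

    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Service;

    @Service
    public class RSSIndexerService extends RSSReader {

        // Top of every hour, Monday to Friday - same intent as the XML trigger above.
        @Scheduled(cron = "0 0 * * * MON-FRI")
        public void indexUrls() {
            try {
                index();
            } catch (IOException e) {
                // Handle the checked exception here rather than letting it escape the scheduled call.
                e.printStackTrace();
            }
        }
    }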

The problem is that when I add new content, the search does not show it while the server is running, even after the scheduled task has run. Likewise, if I delete an entry, the search still returns it from the existing index files. Here is the indexing code:

    @Service
    public class RSSIndexerService extends RSSReader {

        @Autowired
        private RSSFeedUrl rssFeedUrl;

        private IndexWriter indexWriter = null;
        private String indexPath = "C:\\MI\\index";

        Logger log = Logger.getLogger(RSSIndexerService.class.getName());

        public void index() throws IOException {
            Date start = new Date();
            IndexWriter writer = getIndexWriter();
            log.info("Reading all the URLs in SharePoint");
            Iterator<Entry<String, String>> entries = rssFeedUrl.getUrlMap().entrySet().iterator();
            try {
                while (entries.hasNext()) {
                    Entry<String, String> mapEntry = entries.next();
                    String url = mapEntry.getValue();
                    SyndFeed feed = rssReader(url);
                    for (Object entry : feed.getEntries()) {
                        SyndEntry syndEntry = (SyndEntry) entry;
                        SyndContent desc = syndEntry.getDescription();
                        if (desc != null) {
                            String text = desc.getValue();
                            if ("text/html".equals(desc.getType())) {
                                Document doc = new Document();
                                text = extractText(text);
                                Field fieldTitle = new StringField("title", syndEntry.getTitle(), Field.Store.YES);
                                doc.add(fieldTitle);
                                Field pathField = new StringField("path", url, Field.Store.YES);
                                doc.add(pathField);
                                doc.add(new TextField("content", text, Field.Store.YES));
                                // New index, so we just add documents (no old document can be there):
                                writer.addDocument(doc);
                            }
                        }
                    }
                }
            } finally {
                // closeIndexWriter();
            }
            Date end = new Date();
            log.info(end.getTime() - start.getTime() + " total milliseconds");
        }

        public IndexWriter getIndexWriter() throws IOException {
            if (indexWriter == null) {
                Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
                log.info("Indexing to directory '" + indexPath + "'...");
                Directory dir = FSDirectory.open(new File(indexPath));
                IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, analyzer);
                config.setOpenMode(OpenMode.CREATE_OR_APPEND);
                indexWriter = new IndexWriter(dir, config);
            }
            return indexWriter;
        }

        @PreDestroy
        public void closeIndexWriter() throws IOException {
            if (indexWriter != null) {
                System.out.println("Done with indexing...");
                indexWriter.close();
            }
        }
    }
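One thing the class above does not show is how the search side gets its IndexReader/IndexSearcher. As far as I understand Lucene, a reader that is already open keeps serving the snapshot it was opened on: new documents only become visible after the writer commits (or closes) and the reader is reopened. The helper below is only a sketch of that missing piece (the class name and wiring are made up, not taken from my code); it assumes writer.commit() is called at the end of each index() run:

    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;

    // Hypothetical helper for the search side: reopen the reader when the index has changed.
    public class SearcherRefresher {

        private DirectoryReader reader;

        public SearcherRefresher(String indexPath) throws IOException {
            this.reader = DirectoryReader.open(FSDirectory.open(new File(indexPath)));
        }

        // Call before each search (or after each scheduled index run) so committed documents become visible.
        public synchronized IndexSearcher refreshAndGetSearcher() throws IOException {
            DirectoryReader newReader = DirectoryReader.openIfChanged(reader);
            if (newReader != null) { // null means nothing changed since the last open
                reader.close();
                reader = newReader;
            }
            return new IndexSearcher(reader);
        }
    }

Lucene also provides SearcherManager, which wraps this reopen-on-demand pattern and might be cleaner than managing the reader by hand.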

I suspect this problem is caused by config.setOpenMode(OpenMode.CREATE_OR_APPEND), but I do not know how to solve it.
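As far as I can tell, with CREATE_OR_APPEND and plain addDocument(), every scheduled run appends another copy of each entry and deleted entries are never removed. One alternative to rebuilding the index would be to update or delete documents by a key term; the helpers below are only a sketch and assume the "path" field can serve as that key (in my code it stores the feed URL, so it may not uniquely identify a single entry):

    import java.io.IOException;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;

    // Hypothetical helpers illustrating update/delete by key term.
    public class IndexUpdateSketch {

        // Replaces the existing document whose "path" term matches, or adds it if none exists yet.
        static void addOrReplace(IndexWriter writer, String key, Document doc) throws IOException {
            writer.updateDocument(new Term("path", key), doc);
        }

        // Removes a document whose entry no longer exists in the feed.
        static void remove(IndexWriter writer, String key) throws IOException {
            writer.deleteDocuments(new Term("path", key));
        }
    }

Either way, the changes only become visible to readers opened after writer.commit() (or close()).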

Well, I came up with the idea of checking whether the index directory is empty. If it is not, I delete the previously indexed files and then index everything again each time, using OpenMode.CREATE:

    File path = new File(System.getProperty("java.io.tmpdir") + "\\index");
    Directory dir = FSDirectory.open(path);
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, analyzer);
    if (path.list() != null) {
        log.info("Deleting the previous index...");
        FileUtils.cleanDirectory(path);
    }
    config.setOpenMode(OpenMode.CREATE);
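Two things I noticed about this, to the best of my understanding of the Lucene API: OpenMode.CREATE already creates a new index and overwrites an existing one when the IndexWriter is opened, so the manual FileUtils.cleanDirectory() is probably redundant, and the open mode only takes effect at open time, so a writer cached the way getIndexWriter() caches it above would only truncate the index on the first run. A sketch of opening a fresh CREATE-mode writer on every scheduled run (the method name is made up):

    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexWriterConfig.OpenMode;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class RebuildSketch {

        // Opens a new writer for each run; OpenMode.CREATE discards the previous
        // index contents at open time, so no manual directory cleanup is needed.
        static IndexWriter openFreshWriter(File indexDir) throws IOException {
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
            IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, analyzer);
            config.setOpenMode(OpenMode.CREATE);
            return new IndexWriter(FSDirectory.open(indexDir), config);
        }
    }

The writer then has to be closed (or at least committed) at the end of each run so that searches pick up the rebuilt index.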

Then I just use addDocument() as before:

  if ("text / html" .equals (desc.getType ())) ... // new index, so we just add documents (can not be an old document): indexWriter.addDocument (doc); }    
