I have an issue while building my Solr index (Lucene & Solr 3.4.0 on Apache Tomcat 6.0.33).
The data for the documents to index comes from an Oracle database. Since I have to handle lots of CLOBs, I split the data import into several requestHandlers to improve performance while fetching the data from the database (a simulation of multithreading). These requestHandlers are configured in my solrconfig.xml as follows:
segment-#.xml
To build the index, I start the first DataImportHandler with the clean=true option and then start the full-import of all other segments. When all segments have finished, the status pages (http://host/solr/segment-#) tell me that for each segment the correct number of rows (according to the SELECT COUNT(*) statement in the database) was fetched and processed. Fine so far.
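For reference, the import sequence described above could be scripted roughly as follows. This is only a sketch: it assumes the handlers are named segment-1 through segment-17 (the numbering is a guess based on the segment-# pattern) and it only builds the request URLs, it does not send them. One detail that matters here is that the DataImportHandler treats clean as true by default for a full-import, so the sketch passes clean=false explicitly for every segment after the first.

```python
from urllib.parse import urlencode

BASE_URL = "http://host/solr"  # placeholder host, as written in the question
SEGMENT_COUNT = 17             # number of DIH request handlers in the question


def import_url(segment, clean):
    """Build the full-import URL for one segment handler.

    The handler name 'segment-<n>' is an assumption derived from the
    'segment-#' pattern; adjust it to the names used in solrconfig.xml.
    """
    params = urlencode({
        "command": "full-import",
        # DIH defaults clean to true, so always state it explicitly.
        "clean": "true" if clean else "false",
    })
    return "{}/segment-{}?{}".format(BASE_URL, segment, params)


def build_import_urls(segment_count=SEGMENT_COUNT):
    """Only the first segment clears the index; all others keep it."""
    return [import_url(n, clean=(n == 1)) for n in range(1, segment_count + 1)]


if __name__ == "__main__":
    for url in build_import_urls():
        print(url)
```

If the later segments are started without clean=false, each of them may try to clear the index when it starts, which would be worth ruling out before looking elsewhere.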
But if I now call the status page of the core (http://host/solr/admin/core), numDocs is not the sum of all segments. Some documents are always missing. I ran the index build several times, and the size of the gap varied each time. In total there should be 8.3 million documents in the index, but roughly 100,000 entries are always missing. The numDocs value matches the count I get from a *:* query via the Solr admin interface.
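One possible explanation worth ruling out is key collisions: when a document is added with a uniqueKey value that already exists in the index, Solr replaces the earlier document, so the per-segment row counts can all be correct while numDocs ends up lower than their sum. The sketch below illustrates that arithmetic on toy data. The function names and the sample keys are hypothetical, and the real key lists would have to come from the database queries behind each segment.

```python
from collections import Counter


def duplicate_keys(keys_per_segment):
    """Return {key: occurrences} for every key that appears more than once,
    whether inside a single segment or across segments."""
    counts = Counter(key for keys in keys_per_segment for key in keys)
    return {key: n for key, n in counts.items() if n > 1}


def expected_num_docs(keys_per_segment):
    """Documents left in the index if duplicate keys overwrite each other."""
    return len({key for keys in keys_per_segment for key in keys})


if __name__ == "__main__":
    # Toy data only: three segments, with key "b" and key "d" repeated.
    segments = [["a", "b"], ["b", "c", "d"], ["d", "e"]]
    processed = sum(len(keys) for keys in segments)
    print("rows processed:", processed)
    print("expected numDocs:", expected_num_docs(segments))
    print("duplicates:", duplicate_keys(segments))
```

If the difference between the rows processed and the expected numDocs matches the gap seen on the core status page, overlapping keys between segments would be a likely cause; if not, this check at least excludes it.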
I turned on the infoStream and looked at the log entries as well as the Tomcat logs, but did not find a clue. What am I doing wrong?
I am using 17 requestHandlers, and my index settings are configured as follows:
false
17
32
50000
2000000
1000
10000
native
Any help is much appreciated. Thank you very much in advance!