I have an issue while building my Solr index (Lucene & Solr 3.4.0 on Apache Tomcat 6.0.33). The data for the documents to index comes from an Oracle database. Since I have to handle a lot of CLOBs, I split the data import into several requestHandlers to speed up fetching the data from the database (a simulation of multithreading). These requestHandlers are configured in my solrconfig.xml as follows: segment-#.xml

To build the index, I start the first DataImportHandler with the clean=true option and then start the full-import of all other segments. When all segments are through, the status pages (http://host/solr/segment-#) tell me that each segment fetched and processed the correct number of rows (according to the SELECT COUNT(*) statement in the database). Fine so far.

But if I now call the status page of the core (http://host/solr/admin/core), numDocs is not the sum of all segments. There are always some documents missing. I ran the index build several times, and the difference varied each time. In total there should be 8.3 million documents in the index, but roughly 100,000 entries are always missing. numDocs is the same number that I get from a *:* query via the Solr admin interface.

I turned on the infostream and looked at the log entries and the Tomcat logs, but did not find a clue. What am I doing wrong?

I am using 17 requestHandlers and my index settings are configured as follows: false 17 32 50000 2000000 1000 10000 native

Help is very much appreciated. Thank you very much in advance!

1 Answer

I found the problem, I just had to RTFM. I tricked myself: the default value of the clean option is TRUE, but I thought it was FALSE. So I only called the first URL with &clean=true and did not call all the other URLs with &clean=false. As a result, every URL call cleaned the whole index. If I call the other URLs with &clean=false, the sum of all documents is correct.
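
For illustration, here is a minimal sketch of the corrected call sequence. It only builds the URLs and does not send any request. The handler names segment-1 to segment-17 and the helper name full_import_urls are assumptions made for this example (the question only shows segment-#); command and clean are the DataImportHandler request parameters discussed above.

```python
from urllib.parse import urlencode


def full_import_urls(base_url, segment_count):
    """Build one full-import URL per segment handler.

    Only the first call cleans the index. Every later call passes
    clean=false explicitly, because the DataImportHandler default is true.
    Handler names of the form segment-N are an assumption for this sketch.
    """
    urls = []
    for n in range(1, segment_count + 1):
        params = {
            "command": "full-import",
            "clean": "true" if n == 1 else "false",
        }
        urls.append(f"{base_url}/segment-{n}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    # 17 request handlers, as in the question.
    for url in full_import_urls("http://host/solr", 17):
        print(url)
```

Setting clean on every call, rather than relying on the default, keeps the behaviour obvious even if a handler is later started on its own.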
