I am reading an article on Lucene.NET. I found some sample code for creating an index with Lucene.NET, and a few lines of it are not clear to me. Here is the code:

    protected void btnCreateIndex_Click(object sender, EventArgs e)
    {
        IndexWriter writer = new IndexWriter(MapPath("~/searchlucene/"), new StandardAnalyzer(), false);

        IndexDocument(writer, "About Hockey", "hockey", "Hockey is a cool sport which I really like, bla bla");
        IndexDocument(writer, "Some great players", "hockey", "Some of the great players from Sweden - well Peter Forsberg, Mats Sunding, Henrik Zetterberg");
        IndexDocument(writer, "Soccer info", "soccer", "Soccer might not be as fun as hockey but it's also pretty fun");
        IndexDocument(writer, "Players", "soccer", "From Sweden we have Zlatan Ibrahimovic and Henrik Larsson. They are the most well known soccer players");
        IndexDocument(writer, "1994", "soccer", "I remember World Cup 1994 when Sweden took the bronze. we had great players. players , bla bla");
        IndexDocument(writer, "BBA-header", "BBA-321type", "Hello BBA");

        writer.Optimize();
        writer.Close();
    }

    private void IndexDocument(IndexWriter writer, string sHeader, string sType, string sContent)
    {
        Document doc = new Document();

        doc.Add(new Field("header", sHeader, Field.Store.YES, Field.Index.TOKENIZED));
        doc.Add(new Field("type", sType, Field.Store.YES, Field.Index.TOKENIZED));
        doc.Add(new Field("content", sContent, Field.Store.YES, Field.Index.TOKENIZED));

        writer.AddDocument(doc);
    }

I have a couple of questions:

1) What does the line doc.Add(new Field("header", sHeader, Field.Store.YES, Field.Index.TOKENIZED)); mean? In particular, what are Field.Index.TOKENIZED and UN_TOKENIZED? When I search for a keyword that was passed in the type argument, nothing comes back. I do not understand this behaviour.

Here is the search sample, where I specify a keyword that was indexed as type:

    ListBox1.Items.Clear();

    var searcher = new Lucene.Net.Search.IndexSearcher(MapPath("~/searchlucene/"));
    var oParser = new Lucene.Net.QueryParsers.QueryParser("content", new StandardAnalyzer());

    string sHeader = " OR (header:" + TextBox1.Text + ")";
    string sType = " OR (type:" + TextBox1.Text + ")";
    string sSearchQuery = "(" + TextBox1.Text + sHeader + sType + ")";

    var oHitColl = searcher.Search(oParser.Parse(sSearchQuery));

    for (int i = 0; i < oHitColl.Length(); i++)
    {
        Document oDoc = oHitColl.Doc(i);
        ListBox1.Items.Add(new ListItem(oDoc.Get("header") + oDoc.Get("type") + oDoc.Get("content")));
    }

    searcher.Close();

Could someone please help me clear up my confusion? Thanks.


1 Answer

I just tested your code, and it works fine with Lucene 2.9.4.

Field.Index.TOKENIZED means the Analyzer will break your text into tokens, which makes the field searchable as full text. You would use UN_TOKENIZED for fields you don't want analyzed, like product IDs; such a field is indexed as a single term and only matches its exact value.
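
To make the difference concrete, here is a minimal sketch that does not use Lucene at all. The class TokenizeSketch and its methods are hypothetical names, and Analyze is only a simplified stand-in for an analyzer: it lowercases the text and splits on anything that is not a letter or digit, whereas the real StandardAnalyzer applies more elaborate rules (for example, it also drops common stop words).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TokenizeSketch
{
    // Hypothetical stand-in for an analyzer (TOKENIZED): lowercase the text
    // and split it on every character that is not a letter or digit.
    // This is NOT the StandardAnalyzer algorithm, only an illustration.
    public static List<string> Analyze(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(char.ToLowerInvariant(c));
            }
            else if (current.Length > 0)
            {
                tokens.Add(current.ToString());
                current.Clear();
            }
        }
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
        }
        return tokens;
    }

    // UN_TOKENIZED: the whole value is kept as one term, exactly as supplied.
    public static List<string> KeepWhole(string text)
    {
        return new List<string> { text };
    }
}
```

With this sketch, Analyze("Hockey is a cool sport") yields the five terms hockey, is, a, cool and sport, so a search for any one of those words can match. KeepWhole("SKU-1234") yields the single term SKU-1234, so only a search for that exact value can match.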

Note: you should use Field.Index.ANALYZED and Field.Index.NOT_ANALYZED, which replace the deprecated TOKENIZED and UN_TOKENIZED.
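
As a rough sketch of how the newer constants might be used with Lucene.NET 2.9.x: the class FieldIndexSketch and its method names are hypothetical, and keeping "type" un-analyzed is an assumption made for illustration, not something your code requires.

```csharp
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class FieldIndexSketch
{
    // Builds a document like the one in the question, but indexes "type"
    // as a single un-analyzed term (an assumption for illustration).
    public static Document BuildDocument(string header, string type, string content)
    {
        var doc = new Document();
        doc.Add(new Field("header", header, Field.Store.YES, Field.Index.ANALYZED));
        doc.Add(new Field("type", type, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));
        return doc;
    }

    // A NOT_ANALYZED field only matches its exact indexed value, so it is
    // queried here with a TermQuery instead of an analyzing QueryParser.
    public static Query ExactTypeQuery(string type)
    {
        return new TermQuery(new Term("type", type));
    }
}
```

The trade-off is that a NOT_ANALYZED field is case-sensitive and must be matched in full, so a query string that goes through StandardAnalyzer (which lowercases terms) may not find a value such as BBA-321type stored that way.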

To see the difference between analyzed and not-analyzed fields, index the same data both ways and inspect the resulting indexes with Luke. That will probably give you a good idea of how it works.

http://code.google.com/p/luke/
