jjeong

Field options for indexing

The options for indexing (Field.Index.*) control how the text in the field is made searchable via the inverted index. Here are the choices:
 
  - Index.ANALYZED—Use the analyzer to break the field’s value into a stream of separate tokens and make each token searchable. This option is useful for normal text fields (body, title, abstract, etc.).
  - Index.NOT_ANALYZED—Do index the field, but don’t analyze the String value. Instead, treat the Field’s entire value as a single token and make that token searchable. This option is useful for fields that you’d like to search on but that shouldn’t be broken up, such as URLs, file system paths, dates, personal names, Social Security numbers, and telephone numbers. This option is especially useful for enabling “exact match” searching. We indexed the id field in listings 2.1 and 2.3 using this option.
  - Index.ANALYZED_NO_NORMS—A variant of Index.ANALYZED that doesn’t store norms information in the index. Norms record index-time boost information in the index but can be memory consuming when you’re searching. Section 2.5.3 describes norms in detail.
  - Index.NOT_ANALYZED_NO_NORMS—Just like Index.NOT_ANALYZED, but also doesn’t store norms. This option is frequently used to save index space and memory usage during searching, because single-token fields don’t need the norms information unless they’re boosted.
  - Index.NO—Don’t make this field’s value available for searching.
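
The difference between the analyzed and not-analyzed choices can be sketched in plain Java without Lucene on the classpath. Everything below is an assumption made for illustration: `IndexOptionSketch`, `IndexMode`, and `searchableTokens` are hypothetical names, the norms variants are left out, and the toy analyzer (split on non-alphanumeric characters, then lowercase) is much simpler than a real Lucene analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class IndexOptionSketch {

    // Hypothetical stand-in for the Field.Index choices (norms variants omitted).
    public enum IndexMode { ANALYZED, NOT_ANALYZED, NO }

    // Returns the tokens that would become searchable for one field value.
    public static List<String> searchableTokens(String value, IndexMode mode) {
        List<String> tokens = new ArrayList<>();
        switch (mode) {
            case ANALYZED:
                // Toy analyzer: split on runs of non-letter, non-digit characters
                // and lowercase each piece. Real analyzers do considerably more.
                for (String part : value.split("[^\\p{L}\\p{N}]+")) {
                    if (!part.isEmpty()) {
                        tokens.add(part.toLowerCase(Locale.ROOT));
                    }
                }
                break;
            case NOT_ANALYZED:
                // The entire value is kept verbatim as a single token.
                tokens.add(value);
                break;
            case NO:
                // Nothing reaches the inverted index, so nothing is searchable.
                break;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String url = "http://example.com/Docs/index.html";
        System.out.println(searchableTokens(url, IndexMode.ANALYZED));
        System.out.println(searchableTokens(url, IndexMode.NOT_ANALYZED));
        System.out.println(searchableTokens(url, IndexMode.NO));
    }
}
```

With the sample URL, the analyzed mode yields six lowercase tokens, the not-analyzed mode yields the URL unchanged as one token (which is what makes exact-match lookups possible), and the last mode yields an empty list.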

Field options for storing fields

The options for stored fields (Field.Store.*) determine whether the field’s exact value should be stored away so that you can later retrieve it during searching:
 
  - Store.YES—Stores the value. When the value is stored, the original String in its entirety is recorded in the index and may be retrieved by an IndexReader. This option is useful for fields that you’d like to use when displaying the search results (such as a URL, title, or database primary key). Try not to store very large fields, if index size is a concern, as stored fields consume space in the index.
  - Store.NO—Doesn’t store the value. This option is often used along with Index.ANALYZED to index a large text field that doesn’t need to be retrieved in its original form, such as bodies of web pages, or any other type of text document.
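
Storing and indexing are independent decisions, which a small in-memory model can make concrete. This is only a sketch under stated assumptions: `StoreOptionSketch` and its methods are hypothetical, the two boolean flags stand in for Store.YES/Store.NO and for an analyzed versus unindexed field, and tokenization is a plain whitespace split rather than a real analyzer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class StoreOptionSketch {

    // "field:term" -> ids of documents containing that term (toy inverted index).
    private final Map<String, Set<Integer>> inverted = new HashMap<>();
    // doc id -> (field name -> original value), filled only for stored fields.
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();

    // store ~ Store.YES vs Store.NO; index ~ Index.ANALYZED vs Index.NO (assumed mapping).
    public void addField(int docId, String name, String value, boolean store, boolean index) {
        if (store) {
            stored.computeIfAbsent(docId, k -> new HashMap<>()).put(name, value);
        }
        if (index) {
            for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    inverted.computeIfAbsent(name + ":" + token, k -> new HashSet<>()).add(docId);
                }
            }
        }
    }

    // Finds documents by term; works whether or not the field was stored.
    public Set<Integer> search(String field, String term) {
        return inverted.getOrDefault(field + ":" + term, Collections.emptySet());
    }

    // Returns the original value, or null if the field was not stored.
    public String storedValue(int docId, String field) {
        Map<String, String> fields = stored.get(docId);
        return fields == null ? null : fields.get(field);
    }

    public static void main(String[] args) {
        StoreOptionSketch idx = new StoreOptionSketch();
        idx.addField(0, "title", "Field Options", true, true);                 // stored and indexed
        idx.addField(0, "body", "long page text about fields", false, true);   // indexed only
        System.out.println(idx.search("body", "fields"));    // document 0 is found
        System.out.println(idx.storedValue(0, "body"));      // null: body was not stored
        System.out.println(idx.storedValue(0, "title"));     // original title is retrievable
    }
}
```

The body field is still searchable even though its original text cannot be retrieved, which matches the common pairing of Store.NO with Index.ANALYZED for large text.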

Field options for term vectors

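The options for term vectors (Field.TermVector.*) control which per-document term information is recorded in the index:
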
  - TermVector.YES—Records the unique terms that occurred, and their counts, in each document, but doesn’t store any positions or offsets information 
  - TermVector.WITH_POSITIONS—Records the unique terms and their counts, and also the positions of each occurrence of every term, but no offsets
  - TermVector.WITH_OFFSETS—Records the unique terms and their counts, with the offsets (start and end character position) of each occurrence of every term, but no positions
  - TermVector.WITH_POSITIONS_OFFSETS—Stores unique terms and their counts, along with positions and offsets
  - TermVector.NO—Doesn’t store any term vector information
 
Note that you can’t index term vectors unless you’ve also turned on indexing for the field. Stated more directly: if Index.NO is specified for a field, you must also specify TermVector.NO.
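
To make the difference between counts, positions, and offsets concrete, here is a minimal plain-Java sketch that builds all three for one field value. It does not use the Lucene API: `TermVectorSketch`, `TermEntry`, `termVector`, and `checkOptions` are hypothetical names, tokens are simply whitespace-separated, and end offsets are treated as exclusive. The `checkOptions` helper only mirrors the rule stated above that an unindexed field cannot carry term vectors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TermVectorSketch {

    // One unique term of a toy term vector, with everything that may be recorded for it.
    public static final class TermEntry {
        public int count;                                          // what TermVector.YES keeps
        public final List<Integer> positions = new ArrayList<>();  // added by WITH_POSITIONS
        public final List<int[]> offsets = new ArrayList<>();      // added by WITH_OFFSETS: {start, end}
    }

    // Builds term -> entry for a single field value, using whitespace-separated tokens.
    public static Map<String, TermEntry> termVector(String text) {
        Map<String, TermEntry> vector = new LinkedHashMap<>();
        Matcher m = Pattern.compile("\\S+").matcher(text);
        int position = 0;
        while (m.find()) {
            TermEntry entry = vector.computeIfAbsent(m.group(), k -> new TermEntry());
            entry.count++;
            entry.positions.add(position++);                    // token index within the field
            entry.offsets.add(new int[] { m.start(), m.end() }); // character span, end exclusive
        }
        return vector;
    }

    // Mirrors the rule in the text: no term vectors without indexing.
    public static void checkOptions(boolean indexed, boolean termVectors) {
        if (!indexed && termVectors) {
            throw new IllegalArgumentException(
                "term vectors require the field to be indexed");
        }
    }

    public static void main(String[] args) {
        Map<String, TermEntry> tv = termVector("to be or not to be");
        for (Map.Entry<String, TermEntry> e : tv.entrySet()) {
            System.out.println(e.getKey() + " count=" + e.getValue().count
                + " positions=" + e.getValue().positions);
        }
        checkOptions(true, true);   // fine: indexed field with term vectors
    }
}
```

For the input "to be or not to be", the term "to" has count 2, positions 0 and 4, and character offsets 0 to 2 and 13 to 15. Positions count tokens while offsets count characters, which is why the two are recorded separately.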
