[lucene] field options for indexing - StringField.java

Elastic/Elasticsearch 2013. 5. 27. 10:52

This class does not exist in Lucene 3.6.x.

It first appeared in 4.x.

Here is a quick look at the code:

public final class StringField extends Field {

  /** Indexed, not tokenized, omits norms, indexes
   *  DOCS_ONLY, not stored. */
  public static final FieldType TYPE_NOT_STORED = new FieldType();

  /** Indexed, not tokenized, omits norms, indexes
   *  DOCS_ONLY, stored */
  public static final FieldType TYPE_STORED = new FieldType();

  static {
    TYPE_NOT_STORED.setIndexed(true);
    TYPE_NOT_STORED.setOmitNorms(true);
    TYPE_NOT_STORED.setIndexOptions(IndexOptions.DOCS_ONLY);
    TYPE_NOT_STORED.setTokenized(false);
    TYPE_NOT_STORED.freeze();

    TYPE_STORED.setIndexed(true);
    TYPE_STORED.setOmitNorms(true);
    TYPE_STORED.setIndexOptions(IndexOptions.DOCS_ONLY);
    TYPE_STORED.setStored(true);
    TYPE_STORED.setTokenized(false);
    TYPE_STORED.freeze();
  }

  /** Creates a new StringField.
   *  @param name field name
   *  @param value String value
   *  @param stored Store.YES if the content should also be stored
   *  @throws IllegalArgumentException if the field name or value is null.
   */
  public StringField(String name, String value, Store stored) {
    super(name, value, stored == Store.YES ? TYPE_STORED : TYPE_NOT_STORED);
  }
}


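The field is always indexed: both types call setIndexed(true), regardless of the Store argument.

That is a bit different from the old API.

If you only know the old options, this is an easy place to make a mistake, so I am posting a quick note about it.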
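
To make the point concrete, here is a minimal plain-Java sketch that mirrors the static initializer above without depending on Lucene. SketchFieldType and StringFieldSketch are hypothetical stand-ins invented for this note, not Lucene classes; they only reproduce the flag values shown in the excerpt. Whichever Store value is passed, the selected type is indexed.

```java
// Hypothetical stand-ins for illustration only; these are NOT Lucene classes.
final class SketchFieldType {
    final boolean indexed;
    final boolean tokenized;
    final boolean omitNorms;
    final boolean stored;

    SketchFieldType(boolean indexed, boolean tokenized, boolean omitNorms, boolean stored) {
        this.indexed = indexed;
        this.tokenized = tokenized;
        this.omitNorms = omitNorms;
        this.stored = stored;
    }
}

final class StringFieldSketch {
    // Same flag values as TYPE_NOT_STORED and TYPE_STORED in the excerpt above.
    static final SketchFieldType TYPE_NOT_STORED = new SketchFieldType(true, false, true, false);
    static final SketchFieldType TYPE_STORED = new SketchFieldType(true, false, true, true);

    // Mirrors the constructor's ternary: only the "stored" flag varies.
    static SketchFieldType typeFor(boolean stored) {
        return stored ? TYPE_STORED : TYPE_NOT_STORED;
    }

    public static void main(String[] args) {
        for (boolean stored : new boolean[] {true, false}) {
            SketchFieldType t = typeFor(stored);
            System.out.println("stored=" + t.stored + " indexed=" + t.indexed
                + " tokenized=" + t.tokenized + " omitNorms=" + t.omitNorms);
        }
    }
}
```

So StringField cannot express the old "store it but do not index it" combination. For a stored-only value in 4.x, StoredField is the class to look at instead.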


Basic Indexing Options for Lucene.

Elastic/Elasticsearch 2013. 1. 7. 14:30

The following is excerpted from Lucene in Action.


Field options for indexing
 
The options for indexing (Field.Index.*) control how the text in the field will be
made searchable via the inverted index. Here are the choices:
 
  - Index.ANALYZED—Use the analyzer to break the field’s value into a stream of separate tokens and make each token searchable. This option is useful for normal text fields (body, title, abstract, etc.).
  - Index.NOT_ANALYZED—Do index the field, but don’t analyze the String value. Instead, treat the Field’s entire value as a single token and make that token searchable. This option is useful for fields that you’d like to search on but that shouldn’t be broken up, such as URLs, file system paths, dates, personal names, Social Security numbers, and telephone numbers. This option is especially useful for enabling “exact match” searching. We indexed the id field in listings 2.1 and 2.3 using this option.
  - Index.ANALYZED_NO_NORMS—A variant of Index.ANALYZED that doesn’t store norms information in the index. Norms record index-time boost information in the index but can be memory consuming when you’re searching. Section 2.5.3 describes norms in detail.
  - Index.NOT_ANALYZED_NO_NORMS—Just like Index.NOT_ANALYZED, but also doesn’t store norms. This option is frequently used to save index space and memory usage during searching, because single-token fields don’t need the norms information unless they’re boosted.
  - Index.NO—Don’t make this field’s value available for searching.
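
The five choices above boil down to three yes/no flags, which can be sketched in plain Java as follows. LegacyIndexSketch is a hypothetical model written for this note, not the Lucene enum itself; the flag values are taken from the descriptions in the list. The helper method checks which choice lines up with the flags that StringField hard-codes in 4.x (indexed, not tokenized, norms omitted). It ignores the DOCS_ONLY setting, which the Field.Index options do not cover.

```java
// Hypothetical model of the Field.Index choices listed above; not the Lucene enum.
enum LegacyIndexSketch {
    ANALYZED(true, true, false),
    NOT_ANALYZED(true, false, false),
    ANALYZED_NO_NORMS(true, true, true),
    NOT_ANALYZED_NO_NORMS(true, false, true),
    NO(false, false, true); // nothing is indexed, so there are no norms either

    final boolean indexed;
    final boolean analyzed;
    final boolean omitNorms;

    LegacyIndexSketch(boolean indexed, boolean analyzed, boolean omitNorms) {
        this.indexed = indexed;
        this.analyzed = analyzed;
        this.omitNorms = omitNorms;
    }

    // True when the flags match what StringField fixes:
    // indexed, not tokenized, norms omitted.
    boolean sameFlagsAsStringField() {
        return indexed && !analyzed && omitNorms;
    }
}

final class LegacyIndexDemo {
    public static void main(String[] args) {
        for (LegacyIndexSketch opt : LegacyIndexSketch.values()) {
            System.out.println(opt + " -> indexed=" + opt.indexed
                + " analyzed=" + opt.analyzed
                + " omitNorms=" + opt.omitNorms
                + " sameFlagsAsStringField=" + opt.sameFlagsAsStringField());
        }
    }
}
```

Under this model only NOT_ANALYZED_NO_NORMS has the same flags as StringField, and Index.NO has no StringField counterpart at all.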

Field options for storing fields
 
The options for stored fields (Field.Store.*) determine whether the field’s exact value should be stored away so that you can later retrieve it during searching:
 
  - Store.YES—Stores the value. When the value is stored, the original String in its entirety is recorded in the index and may be retrieved by an IndexReader. This option is useful for fields that you’d like to use when displaying the search results (such as a URL, title, or database primary key). Try not to store very large fields, if index size is a concern, as stored fields consume space in the index.
  - Store.NO—Doesn’t store the value. This option is often used along with Index.ANALYZED to index a large text field that doesn’t need to be retrieved in its original form, such as bodies of web pages, or any other type of text document.

Field options for term vectors
 
  - TermVector.YES—Records the unique terms that occurred, and their counts, in each document, but doesn’t store any positions or offsets information
  - TermVector.WITH_POSITIONS—Records the unique terms and their counts, and also the positions of each occurrence of every term, but no offsets
  - TermVector.WITH_OFFSETS—Records the unique terms and their counts, with the offsets (start and end character position) of each occurrence of every term, but no positions
  - TermVector.WITH_POSITIONS_OFFSETS—Stores unique terms and their counts, along with positions and offsets
  - TermVector.NO—Doesn’t store any term vector information
 
Note that you can’t index term vectors unless you’ve also turned on indexing for the field. Stated more directly: if Index.NO is specified for a field, you must also specify TermVector.NO.
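
The rule in the note above can be sketched as a small check in plain Java. TermVectorSketch and TermVectorRule are hypothetical names made up for this illustration, not Lucene classes; the flags simply restate the five choices listed earlier, and the exception type is an assumption about how one might report the invalid combination.

```java
// Hypothetical model of the TermVector choices; not the Lucene enum.
enum TermVectorSketch {
    NO(false, false, false),
    YES(true, false, false),
    WITH_POSITIONS(true, true, false),
    WITH_OFFSETS(true, false, true),
    WITH_POSITIONS_OFFSETS(true, true, true);

    final boolean enabled;   // any term vector data recorded at all
    final boolean positions; // per-occurrence positions recorded
    final boolean offsets;   // start/end character offsets recorded

    TermVectorSketch(boolean enabled, boolean positions, boolean offsets) {
        this.enabled = enabled;
        this.positions = positions;
        this.offsets = offsets;
    }
}

final class TermVectorRule {
    // Rejects the combination the note rules out:
    // term vectors requested for a field that is not indexed.
    static void check(boolean indexed, TermVectorSketch tv) {
        if (!indexed && tv.enabled) {
            throw new IllegalArgumentException(
                "term vectors (" + tv + ") require an indexed field");
        }
    }

    public static void main(String[] args) {
        check(true, TermVectorSketch.WITH_POSITIONS_OFFSETS); // allowed
        check(false, TermVectorSketch.NO);                     // allowed
        try {
            check(false, TermVectorSketch.YES);                // not allowed
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```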