Posts tagged 'settings'

[Elasticsearch] Settings/Mappings Test Template

Elastic/Elasticsearch 2018. 12. 19. 09:46

I needed this for a quick test, so I'm posting it here to reuse.

(I could have just pushed it to git ㅡ.ㅡ;;)

http://localhost:9200/helloworld

{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "analysis": {
        "analyzer": {
          "custom_analyzer": {
            "tokenizer": "standard",
            "filter": ["lowercase", "trim", "custom_synonym"]
          },
          "cnori_analyzer": {
            "type": "custom",
            "tokenizer": "cnori_tokenizer"
          }
        },
        "tokenizer": {
          "cnori_tokenizer": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed"
          }
        },
        "filter": {
          "custom_synonym": {
            "type": "synonym_graph",
            "synonyms": []
          }
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "cnori_analyzer",
          "fielddata": true
        },
        "name": {
          "type": "keyword"
        },
        "age": {
          "type": "integer"
        },
        "created": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    }
  }
}
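A quick way to catch typos in a template like this before sending it is to check that every analyzer only references tokenizers and filters that are either defined in the same body or built in. The sketch below does that in Python. `missing_references` and the two `BUILTIN_*` sets are hypothetical helpers, and the built-in lists are a small illustrative subset rather than the full Elasticsearch catalog.

```python
# Illustrative subset of built-in names (an assumption, not the full Elasticsearch list).
BUILTIN_TOKENIZERS = {"standard", "whitespace", "keyword"}
BUILTIN_FILTERS = {"lowercase", "trim"}


def missing_references(body):
    """List analyzer components that are neither defined in the body nor assumed built-in."""
    analysis = body["settings"]["index"]["analysis"]
    tokenizers = set(analysis.get("tokenizer", {})) | BUILTIN_TOKENIZERS
    filters = set(analysis.get("filter", {})) | BUILTIN_FILTERS
    problems = []
    for name, spec in analysis.get("analyzer", {}).items():
        if spec.get("tokenizer") not in tokenizers:
            problems.append(f"{name}: tokenizer {spec.get('tokenizer')}")
        for flt in spec.get("filter", []):
            if flt not in filters:
                problems.append(f"{name}: filter {flt}")
    return problems


# Analysis part of the helloworld template above.
body = {
    "settings": {"index": {"analysis": {
        "analyzer": {
            "custom_analyzer": {
                "tokenizer": "standard",
                "filter": ["lowercase", "trim", "custom_synonym"],
            },
            "cnori_analyzer": {"type": "custom", "tokenizer": "cnori_tokenizer"},
        },
        "tokenizer": {
            "cnori_tokenizer": {"type": "nori_tokenizer", "decompound_mode": "mixed"}
        },
        "filter": {"custom_synonym": {"type": "synonym_graph", "synonyms": []}},
    }}}
}

print(missing_references(body))  # an empty list means every reference resolves
```

This only checks names inside the request body. Whether the nori plugin is actually installed on the node is something only Elasticsearch itself can tell you.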

Attribution, NonCommercial, NoDerivatives

jjeong, 실무 예제로 배우는 Elasticsearch 검색엔진 (Learning the Elasticsearch Search Engine Through Practical Examples)

[Elasticsearch] Plain settings, mappings, and template example

Elastic/Elasticsearch 2018. 7. 4. 14:05

[Settings]

"settings": {
  "number_of_shards": 1,
  "number_of_replicas": 0,
  "index.refresh_interval": "1h",
  "index.routing.allocation.require.box_type": "indexing",
  "index.similarity.default.type": "BM25",
  "index": {
    "analysis": {
      "analyzer": {
        "arirang_custom_analyzer": {
          "tokenizer": "arirang_tokenizer",
          "filter": ["lowercase", "trim", "arirang_synonym", "arirang_filter"]
        }
      },
      "filter": {
        "arirang_synonym": {
          "type": "synonym_graph",
          "synonyms": []
        }
      }
    }
  }
}

[Mappings]

"mappings": {
  "product": {
    "_source": {
      "enabled": true
    },
    "dynamic_templates": [
      {
        "strings": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "analyzer": "arirang_custom_analyzer",
            "fields": {
              "raw": {
                "type": "keyword",
                "ignore_above": 50
              }
            }
          }
        }
      }
    ]
  }
}
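To make the effect of the dynamic template concrete, here is a rough Python simulation of what happens when a previously unseen string field arrives. It is a simplified model, not Elasticsearch's real logic: `mapping_for_new_field` and `raw_is_indexed` are hypothetical names, and only `match_mapping_type` is considered.

```python
def mapping_for_new_field(dynamic_templates, value):
    """Return the mapping of the first template whose match_mapping_type fits the value."""
    detected = "string" if isinstance(value, str) else "other"  # only strings are modeled
    for template in dynamic_templates:
        for _name, rule in template.items():
            wanted = rule.get("match_mapping_type")
            if wanted is None or wanted == detected:
                return rule["mapping"]
    return None


def raw_is_indexed(mapping, value):
    """The keyword sub-field skips values longer than ignore_above characters."""
    return len(value) <= mapping["fields"]["raw"]["ignore_above"]


dynamic_templates = [{
    "strings": {
        "match_mapping_type": "string",
        "mapping": {
            "type": "text",
            "analyzer": "arirang_custom_analyzer",
            "fields": {"raw": {"type": "keyword", "ignore_above": 50}},
        },
    }
}]

mapping = mapping_for_new_field(dynamic_templates, "some product name")
print(mapping["type"], mapping["analyzer"])
print(raw_is_indexed(mapping, "x" * 51))  # longer than 50 characters, so not indexed in raw
```

Every new string field therefore becomes an analyzed text field with a raw keyword sub-field for exact matching and sorting, as long as the value is 50 characters or shorter.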


[Elasticsearch] Example Index Settings for Applying Synonyms

Elastic/Elasticsearch 2016. 3. 17. 18:34

Writing this down so I don't forget it again.

References)

https://www.elastic.co/guide/en/elasticsearch/guide/current/synonyms.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html

Example)

"index": {
  "analysis": {
    "analyzer": {
      "arirang_custom": {
        "type": "custom",
        "tokenizer": "arirang_tokenizer",
        "filter": ["lowercase", "trim", "arirang_filter"]
      },
      "arirang_custom_searcher": {
        "tokenizer": "arirang_tokenizer",
        "filter": ["lowercase", "trim", "arirang_filter", "meme_synonym"]
      }
    },
    "filter": {
      "meme_synonym": {
        "type": "synonym",
        "synonyms": [
          "henry,헨리,앙리"
        ]
      }
    }
  }
}

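A few points to watch out for:

1. When defining an analyzer that uses synonyms, either declare type as custom or do not declare type at all.

2. Define the synonym as a filter and assign it to the analyzer's filter list.

3. Decide whether to apply synonyms at index time or at query time by weighing the trade-offs against the characteristics of your service.

4. Prefer synonyms_path. (This is less a caveat than a matter of easier maintenance.)

5. With query-time synonyms, only match-type queries work, because they analyze the query text. If you want to use term-type queries, apply the synonyms at index time.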
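Point 5 is easier to see with a toy model. The Python sketch below imitates an inverted index with plain sets. `analyze`, `match_query`, and `term_query` are hypothetical stand-ins, not Elasticsearch APIs, and the analysis is reduced to lowercasing plus whitespace splitting.

```python
# One synonym group, taken from the meme_synonym filter above.
SYNONYMS = [{"henry", "헨리", "앙리"}]


def analyze(text, use_synonyms):
    """Lowercase, split on whitespace, and optionally add every synonym of each token."""
    terms = set()
    for token in text.lower().split():
        terms.add(token)
        if use_synonyms:
            for group in SYNONYMS:
                if token in group:
                    terms |= group
    return terms


def match_query(indexed_terms, text, search_synonyms):
    """match: the query text is analyzed before lookup."""
    return bool(indexed_terms & analyze(text, search_synonyms))


def term_query(indexed_terms, term):
    """term: the value is looked up exactly as given, with no analysis."""
    return term in indexed_terms


# Query-time synonyms: the index holds only the original token.
plain_index = analyze("Henry", use_synonyms=False)
print(match_query(plain_index, "앙리", search_synonyms=True))
print(term_query(plain_index, "앙리"))

# Index-time synonyms: the index already holds every variant.
expanded_index = analyze("Henry", use_synonyms=True)
print(term_query(expanded_index, "앙리"))
```

This prints True, False, True: the term query only finds the document once the synonyms were expanded at index time.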

So what does "not declaring type" in point 1 mean?

If you leave it out, the analyzer is simply created as custom.

For anyone who doesn't believe it, here is the source code.

[AnalysisModule.java]

String typeName = analyzerSettings.get("type");
Class<? extends AnalyzerProvider> type;
if (typeName == null) {
    if (analyzerSettings.get("tokenizer") != null) {
        // custom analyzer, need to add it
        type = CustomAnalyzerProvider.class;
    } else {
        throw new IllegalArgumentException("Analyzer [" + analyzerName + "] must have a type associated with it");
    }
} else if (typeName.equals("custom")) {
    type = CustomAnalyzerProvider.class;
} else {
    type = analyzersBindings.analyzers.get(typeName);
    if (type == null) {
        throw new IllegalArgumentException("Unknown Analyzer type [" + typeName + "] for [" + analyzerName + "]");
    }
}
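The same decision can be restated in a few lines of Python to make the branches easier to follow. This is a loose port for illustration only: `resolve_analyzer_type` is a hypothetical name, and `KNOWN_ANALYZER_TYPES` is a made-up stand-in for the registered analyzer bindings.

```python
# Stand-in for the registered analyzer types (an illustrative assumption).
KNOWN_ANALYZER_TYPES = {"standard", "whitespace"}


def resolve_analyzer_type(name, settings):
    """Mirror the branches of the Java snippet: no type plus a tokenizer means custom."""
    type_name = settings.get("type")
    if type_name is None:
        if settings.get("tokenizer") is not None:
            return "custom"
        raise ValueError(f"Analyzer [{name}] must have a type associated with it")
    if type_name == "custom":
        return "custom"
    if type_name not in KNOWN_ANALYZER_TYPES:
        raise ValueError(f"Unknown Analyzer type [{type_name}] for [{name}]")
    return type_name


# arirang_custom_searcher from the example declares no type, only a tokenizer.
print(resolve_analyzer_type("arirang_custom_searcher", {"tokenizer": "arirang_tokenizer"}))
```

With the settings from the example, the analyzer without a type resolves to custom, which is what point 1 claims.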


[Elasticsearch] Create Index Settings & Mappings Template.

Elastic/Elasticsearch 2016. 3. 2. 12:30

Writing this down as a refresher.

$ curl -XPUT "http://localhost:9200/INDEX_NAME" -d'
{
  "settings": {
    ...put your settings here...
  },
  "mappings": {
    ...put your mappings here...
  }
}'

Nothing much to it.

It's just a skeleton for the configuration you supply when creating an index through the REST API.

Type definitions go inside mappings.

These settings apply only to the INDEX_NAME you specify. If you want settings that apply globally, use the template feature.
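For scripted tests, the same skeleton can be assembled in Python with only the standard library. This is a sketch under assumptions: `create_index_request` is a hypothetical helper, the host and index name are placeholders, and the request is only built here, not sent. The explicit Content-Type header is included because Elasticsearch 6.0 and later reject request bodies without one.

```python
import json
import urllib.request


def create_index_request(base_url, index_name, settings, mappings):
    """Build (but do not send) a PUT request that creates an index."""
    body = json.dumps({"settings": settings, "mappings": mappings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = create_index_request(
    "http://localhost:9200",
    "helloworld",
    {"number_of_shards": 1, "number_of_replicas": 0},
    {"_doc": {"properties": {"name": {"type": "keyword"}}}},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen(request)` would send it to a running node. The shape of the mappings section depends on the Elasticsearch version, so treat the `_doc` wrapper as an example, not a rule.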

Reference links)

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html


[Elasticsearch - The Definitive Guide] Index Settings

Elastic/TheDefinitiveGuide 2015. 12. 3. 17:24

I came across a passage too good not to write down.. ^^

Original link)

https://www.elastic.co/guide/en/elasticsearch/guide/current/_index_settings.html

Original snippet)

Elasticsearch comes with good defaults. Don’t twiddle these knobs until you understand what they do and why you should change them.

If you're not sure what a setting does in Elasticsearch, sticking with the defaults is the best choice. :)


[elasticsearch] settings & mappings sample code...

Elastic/Elasticsearch 2014. 1. 7. 18:41

Posted purely for reference.

Each property should be tuned to the characteristics of your own service.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-put-mapping.html

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html

{
    "settings" : {
        "number_of_shards" : 5,
        "number_of_replicas" : 0,
        "index" : {
            "refresh_interval" : "1s",
            "merge" : {
                "policy" : { "segments_per_tier" : 5 }
            },
            "analysis" : {
                "analyzer" : {
                    "analyzer_standard" : {
                        "type" : "standard",
                        "tokenizer" : "whitespace",
                        "filter" : ["lowercase", "trim"]
                    },
                    "analyzer_pattern" : {
                        "type" : "custom",
                        "tokenizer" : "tokenizer_pattern",
                        "filter" : ["lowercase", "trim"]
                    },
                    "analyzer_ngram" : {
                        "type" : "custom",
                        "tokenizer" : "tokenizer_ngram",
                        "filter" : ["lowercase", "trim"]
                    }
                },
                "tokenizer" : {
                    "tokenizer_ngram" : {
                        "type" : "nGram",
                        "min_gram" : "2",
                        "max_gram" : "10",
                        "token_chars": [ "letter", "digit" ]
                    },
                    "tokenizer_pattern" : {
                        "type" : "pattern",
                        "pattern" : ","
                    }
                }
            },
            "store" : {
                "type" : "mmapfs",
                "compress" : {
                    "stored" : true,
                    "tv" : true
                }
            }
        }
    },
    "mappings" : {
        "INDICE_TYPE_NAME" : {
            "_id" : {
                "index" : "not_analyzed",
                "path" : "KEY_FIELD_NAME"
            },
            "_source" : {
                "enabled" : "true"
            },
            "_all" : {
                "enabled" : "false"
            },
            "_boost" : {
                "name" : "_boost",
                "null_value" : 1.0
            },
            "analyzer" : "analyzer_standard",
            "index_analyzer" : "analyzer_standard",
            "search_analyzer" : "analyzer_standard",
            "properties" : {
                "LONG_KEY_FIELD" : {"type" : "long", "store" : "no", "index" : "not_analyzed", "omit_norms" : true, "index_options" : "docs", "ignore_malformed" : true, "include_in_all" : false},
                "STRING_SEARCH_FIELD" : {"type" : "string", "store" : "no", "index" : "analyzed", "omit_norms" : false, "index_options" : "offsets", "term_vector" : "with_positions_offsets", "include_in_all" : false},
                "STRING_VIEW_FIELD" : {"type" : "string", "store" : "yes", "index" : "no", "include_in_all" : false},
                "INTEGER_KEY_FIELD" : {"type" : "integer", "store" : "no", "index" : "not_analyzed", "omit_norms" : true, "index_options" : "docs", "ignore_malformed" : true, "include_in_all" : false},
                "FLOAT_KEY_FIELD" : {"type" : "float", "store" : "no", "index" : "not_analyzed", "omit_norms" : true, "index_options" : "docs", "ignore_malformed" : true, "include_in_all" : false},
                "LONG_VIEW_FIELD" : {"type" : "long", "store" : "yes", "index" : "no", "ignore_malformed" : true, "include_in_all" : false},
                "STRING_KEY_FIELD" : {"type" : "string", "store" : "no", "index" : "not_analyzed", "omit_norms" : true, "index_options" : "docs", "include_in_all" : false},
                "NESTED_KEY_FIELD" : {"type" : "nested",
                "properties" : {
                    "STRING_KEY_FIELD" : {"type" : "string", "store" : "no", "index" : "not_analyzed", "omit_norms" : true, "index_options" : "docs", "include_in_all" : false},
                    "INTEGER_VIEW_FIELD" : {"type" : "integer", "store" : "yes", "index" : "no", "ignore_malformed" : true, "include_in_all" : false}
                    }
                },
                "BOOLEAN_VIEW_FIELD" : {"type" : "boolean", "store" : "yes", "include_in_all" : false},
                "BOOLEAN_KEY_FIELD" : {"type" : "boolean", "store" : "no", "index" : "not_analyzed", "omit_norms" : true, "index_options" : "docs", "include_in_all" : false},
                "OBJECT_VIEW_FIELD" : {"type" : "object", "dynamic" : true, "store" : "yes", "index" : "no", "include_in_all" : false}
            }
        }
    }
}
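To get a feel for what analyzer_ngram and analyzer_pattern emit, the two tokenizers can be approximated in Python. This is a rough imitation, not Lucene's implementation: `ngram_tokenize` and `analyze_pattern` are hypothetical names, the token order and the skipping of empty tokens are assumptions, and the lowercase and trim filters are only applied in the pattern example.

```python
import re


def ngram_tokenize(text, min_gram=2, max_gram=10):
    """Emit n-grams per run of letters/digits, like token_chars ["letter", "digit"]."""
    tokens = []
    for run in re.findall(r"[^\W_]+", text):  # any other character splits the input
        for start in range(len(run)):
            for size in range(min_gram, max_gram + 1):
                if start + size > len(run):
                    break
                tokens.append(run[start:start + size])
    return tokens


def analyze_pattern(text, pattern=","):
    """Split on the pattern, then apply the lowercase and trim filters."""
    tokens = [token for token in re.split(pattern, text) if token]
    return [token.lower().strip() for token in tokens]


print(ngram_tokenize("ab-cde"))
print(analyze_pattern("Red, Green,BLUE"))
```

With min_gram 2 and max_gram 10, even short values produce several terms each, so an n-gram field tends to grow the index noticeably. That is one reason to adjust these values per service.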


[elasticsearch] Java API : settings property.

Elastic/Elasticsearch 2013. 4. 16. 10:54

This document was written with reference to my own tests, elasticsearch.org, and the community, and is intended for information exchange.

Please point out anything that is wrong.

(The example code has not been verified for performance or security.)

[elasticsearch java api review]

Original links

http://www.elasticsearch.org/guide/reference/modules/

http://www.elasticsearch.org/guide/reference/index-modules/

http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings/

This post covers the settings-related values that are also exposed through the Java API.

There are also cluster.settings and index.settings, so be sure to check all of them.

Cluster and index settings are usually configured as global settings, so you can also use these values when writing elasticsearch.yml.

Let's look at the update settings and an example written in JSON.

[admin indices update settings]

Setting	Description
`index.number_of_replicas`	The number of replicas each shard has.
`index.auto_expand_replicas`	Set to an actual value (like `0-all`) or `false` to disable it.
`index.blocks.read_only`	Set to `true` to have the index read only. `false` to allow writes and metadata changes.
`index.blocks.read`	Set to `true` to disable read operations against the index.
`index.blocks.write`	Set to `true` to disable write operations against the index.
`index.blocks.metadata`	Set to `true` to disable metadata operations against the index.
`index.refresh_interval`	The async refresh interval of a shard.
`index.term_index_interval`	The Lucene index term interval. Only applies to newly created docs.
`index.term_index_divisor`	The Lucene reader term index divisor.
`index.translog.flush_threshold_ops`	When to flush based on operations.
`index.translog.flush_threshold_size`	When to flush based on translog (bytes) size.
`index.translog.flush_threshold_period`	When to flush based on a period of not flushing.
`index.translog.disable_flush`	Disables flushing. Note, should be set for a short interval and then enabled.
`index.cache.filter.max_size`	The maximum size of filter cache (per segment in shard). Set to `-1` to disable.
`index.cache.filter.expire`	The expire after access time for filter cache. Set to `-1` to disable.
`index.gateway.snapshot_interval`	The gateway snapshot interval (only applies to shared gateways).
merge policy	All the settings for the merge policy currently configured. A different merge policy can’t be set.
`index.routing.allocation.include.*`	A node matching any rule will be allowed to host shards from the index.
`index.routing.allocation.exclude.*`	A node matching any rule will NOT be allowed to host shards from the index.
`index.routing.allocation.require.*`	Only nodes matching all rules will be allowed to host shards from the index.
`index.routing.allocation.total_shards_per_node`	Controls the total number of shards allowed to be allocated on a single node. Defaults to unbounded.
`index.recovery.initial_shards`	When using local gateway a particular shard is recovered only if there can be allocated quorum shards in the cluster. It can be set to `quorum` (default), `quorum-1` (or `half`), `full` and `full-1`. Number values are also supported, e.g. `1`.
`index.gc_deletes`
`index.ttl.disable_purge`	Disables temporarily the purge of expired docs.

These settings need to be configured to suit the characteristics of your service.
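The dotted names in the table correspond to nested keys in a JSON settings body, and Elasticsearch accepts either form. A small Python sketch shows the correspondence. `flatten_settings` is a hypothetical helper written for this post, not part of any client library.

```python
def flatten_settings(nested, prefix=""):
    """Turn nested settings into the dotted form used in the table above."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_settings(value, prefix=f"{full_key}."))
        else:
            flat[full_key] = value
    return flat


nested = {
    "index": {
        "refresh_interval": "1s",
        "merge": {"policy": {"segments_per_tier": 20}},
    }
}
for name, value in flatten_settings(nested).items():
    print(name, "=", value)
```

This makes it easier to match a key found deep inside a JSON body against the table.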

The following is a reference example for the properties above.

[Sample JSON String]

curl -XPUT 'http://localhost:9200/test/' -d '{
  "settings" : {
    "number_of_shards" : 5,
    "number_of_replicas" : 1,
    "index" : {
      "analysis" : {
        "analyzer" : {
          "default" : {
            "type" : "standard",
            "tokenizer" : "standard",
            "filter" : ["lowercase", "trim"]
          },
          "default_index" : {
            "type" : "standard",
            "tokenizer" : "standard",
            "filter" : ["lowercase", "trim"]
          },
          "default_search" : {
            "type" : "standard",
            "tokenizer" : "standard",
            "filter" : ["lowercase", "trim"]
          },
          "my_analyzer1" : {
            "tokenizer" : "standard",
            "filter" : ["standard", "lowercase", "trim"]
          },
          "my_analyzer2" : {
            "type" : "custom",
            "tokenizer" : "tokenizer1",
            "filter" : ["filter1", "trim"]
          }
        },
        "tokenizer" : {
          "tokenizer1" : {
            "type" : "standard",
            "max_token_length" : 255
          }
        },
        "filter" : {
          "filter1" : {
            "type" : "lowercase",
            "language" : "greek"
          }
        }
      },
      "compound_format" : false,
      "merge" : {
        "policy" : {
          "max_merge_at_once" : 10,
          "segments_per_tier" : 20
        }
      },
      "refresh_interval" : "1s",
      "term_index_interval" : 1,
      "store" : {
        "type" : "mmapfs",
        "compress" : {
          "stored" : true,
          "tv" : true
        }
      }
    }
  }
}'

- Only settings that are not in the table above are described here.

number_of_shards	Number of shards for the index
index.analysis.analyzer .default .default_index .default_search .my_analyzer1 .my_analyzer2 .tokenizer .filter	Registers the analyzers used at index and search time. .default* register the default analyzers and act as the defaults for the index/type. .my_analyzer* are user-defined analyzers. .tokenizer and .filter define the tokenizers and filters used by the analyzers.
index.compound_format	When using a file-based store, set this to false for better performance
index.store.type
index.store.compress .stored .tv	Compression settings for the stored index. Compression is most effective for small documents of 64KB or less.


[elasticsearch] Java API : Index

Elastic/Elasticsearch 2013. 4. 8. 18:42

This document was written with reference to my own tests, elasticsearch.org, and the community, and is intended for information exchange.

Please point out anything that is wrong.

(The example code has not been verified for performance or security.)

[elasticsearch java api review]

Original link : http://www.elasticsearch.org/guide/reference/java-api/index_/

The page describes several ways to generate a JSON document.

There are different ways of generating a JSON document:

- Manually (aka do it yourself) using native byte[] or as a String

- Using Map that will be automatically converted to its JSON equivalent

- Using a third party library to serialize your beans such as Jackson

- Using built-in helpers XContentFactory.jsonBuilder()

Of the approaches above, this post tests the last one, the built-in elasticsearch helper.
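For comparison, the first two approaches in the list (a hand-written string and a map) produce the same document. The sketch below shows this in Python with the standard json module instead of the Java API, purely to illustrate the idea. The field values are the sample document used later in this post.

```python
import json

# 1) Written by hand as a string
manual = '{ "docid" : "henry", "title" : "elasticsearch test." }'

# 2) Built from a map (a dict here) and serialized automatically
from_map = json.dumps({"docid": "henry", "title": "elasticsearch test."})

# Both parse to the same document, so the choice is about convenience and safety.
print(json.loads(manual) == json.loads(from_map))  # True
```

Hand-written strings are fine for fixed test documents, while a map or a builder avoids escaping mistakes once values come from user input.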

First, let's define a simple index and index type.

curl -XPUT 'http://localhost:9200/facebook' -d '{
  "settings" : {
    "number_of_shards" : 5,
    "number_of_replicas" : 1
  },
  "mappings" : {
    "post" : {
      "properties" : {
        "docid" : { "type" : "string", "store" : "yes", "index" : "not_analyzed" },
        "title" : { "type" : "string", "store" : "yes", "index" : "analyzed", "term_vector" : "yes", "analyzer" : "standard" }
      }
    }
  }
}'

- The index is created as facebook.

- The index type is created as post.

- For details on settings and mappings, see the links below.

http://www.elasticsearch.org/guide/reference/api/admin-indices-create-index/

http://www.elasticsearch.org/guide/reference/index-modules/

http://www.elasticsearch.org/guide/reference/mapping/

http://www.elasticsearch.org/guide/reference/mapping/core-types/

With the index and index type created, let's index some documents.

import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.common.xcontent.XContentBuilder;

// Assume the documents to create look like this
// curl -XPUT 'http://localhost:9200/facebook/post/1' -d '{ "docid" : "henry", "title" : "This is the elasticsearch hadoop test." }'
// curl -XPUT 'http://localhost:9200/facebook/post/2' -d '{ "docid" : "henry", "title" : "elasticsearch test." }'
// curl -XPUT 'http://localhost:9200/facebook/post/3' -d '{ "docid" : "howook", "title" : "hadoop test." }'
// curl -XPUT 'http://localhost:9200/facebook/post/4' -d '{ "docid" : "howook", "title" : "test." }'

// client is an already connected org.elasticsearch.client.Client
IndexRequestBuilder requestBuilder;
IndexResponse response;

requestBuilder = client.prepareIndex("facebook", "post");

// Pass the source to setSource as a JSON string
requestBuilder.setId("1");
requestBuilder.setSource("{ \"docid\" : \"henry\", \"title\" : \"This is the elasticsearch hadoop test.\" }");
response = requestBuilder.execute().actionGet();

// Pass the source to setSource as an XContentBuilder
XContentBuilder jsonBuilderDocument = jsonBuilder().startObject();
jsonBuilderDocument.field("docid", "henry");
jsonBuilderDocument.field("title", "elasticsearch test.");
jsonBuilderDocument.endObject();

requestBuilder.setId("2");
requestBuilder.setSource(jsonBuilderDocument);
response = requestBuilder.execute().actionGet();

- The source of IndexRequestBuilder's setSource shows which argument types it accepts.

- For the various options available when indexing documents, see the link below.

http://www.elasticsearch.org/guide/reference/api/index_/

Below is example code for the settings and mappings needed when creating an index.

It is only a taste, for reference.

import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.client.IndicesAdminClient;

IndicesAdminClient indices = client.admin().indices();

CreateIndexRequest indexRequest = new CreateIndexRequest("INDEX_NAME");

indexRequest
    .settings(jsonBuilderIndexSetting)
    .mapping("INDEX_TYPE_NAME", jsonBuilderIndiceSetting);

indices.create(indexRequest).actionGet();

- INDEX_NAME is the index to create.

- INDEX_TYPE_NAME is the index type to create.

- jsonBuilderIndexSetting and jsonBuilderIndiceSetting are XContentBuilder objects.
