Posts tagged '_source'

[Elasticsearch] A quick summary of doc, _source, stored_fields, and script_fields

Elastic/Elasticsearch 2017. 7. 13. 14:42

§ Elasticsearch Reference 5.5.0

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-source-filtering.html

1. doc

The fastest option.

Values are cached in memory once they have been read.

Applies to single-term fields or not-analyzed fields.

2. _source

Very slow.

The source is parsed and loaded on every access.

3. stored_fields

Only fields that were mapped with store enabled can be used.

Also slow.
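As a rough illustration of the three access paths listed above, the sketch below builds one search request body that asks for the same field through each of them. The field name `field1` and the helper `build_field_access_body` are made up for this example. The keys `docvalue_fields`, `_source`, and `stored_fields` are the request-body keys as I understand them from the 5.x reference, so check them against the version you run.

```python
import json


def build_field_access_body(field_name):
    """Build a search body that requests one field via three access paths."""
    return {
        "query": {"match_all": {}},
        # 1. doc values: the fast path described in the summary
        "docvalue_fields": [field_name],
        # 2. _source filtering: the stored JSON is loaded and parsed per hit
        "_source": [field_name],
        # 3. stored fields: only useful if the mapping enables store
        "stored_fields": [field_name],
    }


# "field1" is a hypothetical field name used only for illustration.
body = build_field_access_body("field1")
print(json.dumps(body, indent=2))
```

The body is only constructed and printed here. Sending it to a cluster is left out on purpose, since the client and endpoint depend on your setup.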

[Elasticsearch] Working with field values through scripting.

Elastic/Elasticsearch 2015. 8. 17. 14:59

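Strictly speaking, this is less about manipulating field values and more about using the field values of stored documents.

The main reference document is below.

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting.html

As you will have seen in the documentation, groovy is the default scripting language.

For security reasons, however, it is disabled by default.

You need to change the settings as shown below before you can use it.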

[Scripting Enable]

script.inline: on

or

script.groovy.sandbox.enabled: true

※ See the documentation for details on these settings.

Below is a snippet from the documentation.

Value	Description
`off`	scripting is turned off completely, in the context of the setting being set.
`on`	scripting is turned on, in the context of the setting being set.
`sandbox`	scripts may be executed only for languages that are sandboxed

Also note that, among the settings above, the groovy sandbox is scheduled to be deprecated in 2.0.0.

(https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting.html#_groovy_sandboxing)

With the settings in place, let's look at how to access values.

The documentation presents three ways in total.

[Accessing values]

# Document fields

_doc['field_name'].value or values

# Stored fields

_fields['field_name'].value

# Source field

_source['field_name']

※ Note that accessing a field through _fields raises an error unless the field was mapped with the store option set to true.
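To make the store caveat concrete, here is a small sketch that, given one field's mapping, lists which of the three access expressions above are worth trying. The helper `usable_access_expressions` and the sample mappings are hypothetical. It also assumes that `_source` is enabled and treats both `true` and the older `"yes"` as meaning the field is stored.

```python
def usable_access_expressions(field_name, field_mapping):
    """List script expressions that should work for a field (illustrative only)."""
    store = field_mapping.get("store", False)
    # Assumption: older mappings write "yes", newer ones write true.
    is_stored = store is True or str(store).lower() in ("yes", "true")

    expressions = [
        "_doc['{0}'].value".format(field_name),  # document fields
        "_source['{0}']".format(field_name),     # source field (assumes _source is enabled)
    ]
    if is_stored:
        # _fields is only safe when the mapping stores the field.
        expressions.append("_fields['{0}'].value".format(field_name))
    return expressions


stored_case = usable_access_expressions("seq", {"type": "long", "store": "yes"})
unstored_case = usable_access_expressions("title", {"type": "string"})
print(stored_case)
print(unstored_case)
```

This only inspects a mapping dictionary. It does not check whether the field is analyzed, which also affects what the doc lookup returns.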

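Now that you know how to access a value, you are probably wondering how to perform operations on it.

As mentioned above, the default language is groovy.

So you can do your computation and processing with groovy script.

Below is example code from a community question about substring.

[script_fields sample code]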

GET /test_index/_search
{
  "script_fields": {
    "field1_substring": {
      "script": "_doc['field1'].value"
    }
  }
}

GET /test_index/_search
{
  "script_fields": {
    "field1_substring": {
      "script": "_fields['field1'].value.substring(0, 5)"
    }
  }
}

GET /test_index/_search
{
  "script_fields": {
    "field1_substring": {
      "script": "_source.field1.substring(3, 10)"
    }
  }
}

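※ When you use script_fields, the results come back through fields in the response. (In other words, if there are other fields you need, you have to declare them additionally.)

※ As you can see in the Query DSL, a request without a query clause behaves as match_all.

※ One more caution: before testing, check whether the field is not_analyzed or analyzed, and whether or not it is stored.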
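The first two notes above can be sketched as a request body that spells out the implicit match_all and declares the extra fields next to the script field. The helper `build_script_fields_body` is hypothetical. The `fields` key is what I recall the search API of that era using for additional fields (later versions renamed it), so treat it as an assumption and check your version's reference.

```python
import json


def build_script_fields_body(script_name, script, extra_fields=()):
    """Build a script_fields search body with an explicit match_all query."""
    body = {
        # Written out explicitly; omitting "query" behaves the same way.
        "query": {"match_all": {}},
        "script_fields": {script_name: {"script": script}},
    }
    if extra_fields:
        # Assumed key name for requesting additional fields alongside the script result.
        body["fields"] = list(extra_fields)
    return body


body = build_script_fields_body(
    "field1_substring",
    "_source.field1.substring(3, 10)",
    extra_fields=["field1"],
)
print(json.dumps(body, indent=2))
```

Only the body is built here. The script string is the third sample from the post, passed through unchanged.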


Using Elasticsearch for logs - usage tips.

Elastic/Elasticsearch 2013. 1. 16. 13:20

Original: http://www.elasticsearch.org/tutorials/2012/05/19/elasticsearch-for-logging.html
Translation: http://socurites.com/122

http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html

http://www.elasticsearch.org/guide/reference/mapping/source-field.html

http://www.elasticsearch.org/guide/reference/mapping/all-field.html

http://www.elasticsearch.org/guide/reference/query-dsl/

http://www.elasticsearch.org/guide/reference/api/bulk.html

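Something still seemed off, so I dug a little further.
http://www.elasticsearch.org/guide/reference/index-modules/store.html
Going by this document, _all and _source look like reserved keywords. (I couldn't be bothered to read the source code, so this is only a guess.)
Following the document, I handled it by setting the store option.
The result: success ^^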

{
  "settings" : {
    "number_of_shards" : 50,
    "number_of_replicas" : 1,
    "index" : {
      "refresh_interval" : "1s",
      "term_index_interval" : "1",
      "store" : {
        "compress" : {
          "stored" : true,
          "tv" : true
        }
      }
    }
  },

  "mappings" : {
    "type_name" : {
      "properties" : {
        "docid" : { "type" : "string", "store" : "yes", "index" : "not_analyzed", "include_in_all" : false },
        "seq" : { "type" : "long", "store" : "yes", "index" : "no", "include_in_all" : false }
      }
    }
  }
}

As for _all, as you can see, include_in_all : false is set on the fields, so nothing matches through _all.
For this, see all-field.html in the list of documents at the top.
For reference, the compression ratio came to a whopping 80%.
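As a quick sanity check on a mapping like the one above, the sketch below lists the fields that opt out of _all. The helper `fields_excluded_from_all` is hypothetical, and it assumes a field with no include_in_all key is included by default.

```python
def fields_excluded_from_all(properties):
    """Return the names of fields whose mapping sets include_in_all to false."""
    return sorted(
        name
        for name, spec in properties.items()
        # Assumption: a missing include_in_all key means the field is included.
        if spec.get("include_in_all", True) is False
    )


# The two field mappings from the post, written as a Python dict.
properties = {
    "docid": {"type": "string", "store": "yes", "index": "not_analyzed", "include_in_all": False},
    "seq": {"type": "long", "store": "yes", "index": "no", "include_in_all": False},
}

excluded = fields_excluded_from_all(properties)
print(excluded)
```

With both fields excluded, there is nothing left to feed _all, which matches the observation that no query matches through it.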

I have been tuning hard based on the links above, but I can't tell whether it is having any effect or not.
(For _all / all and _source / source, both spellings were accepted when applying the settings.)

"all" : {
  "enabled" : false
},

"source" : {
  "enabled" : true
}

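I set things up this way to use disk space efficiently for my use case. The size did go down a little, but I should rerun it with a larger data set.

With a small number of documents, running without the options:
- 447MB

Running with the options:
- 438MB

That saved 9MB. ^^;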
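To put the two numbers above in proportion, the arithmetic can be written out as below. The variable names are just for illustration, and the sizes are the ones reported in the post.

```python
before_mb = 447  # index size without the options
after_mb = 438   # index size with the options

saved_mb = before_mb - after_mb
saved_pct = 100.0 * saved_mb / before_mb

print("saved {0} MB ({1:.1f}%)".format(saved_mb, saved_pct))
# prints: saved 9 MB (2.0%)
```

A saving of about 2% on a small data set is modest, which fits the author's plan to repeat the test with more data.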


jjeong
