[Elasticsearch] Lucene Arirang Analyzer Plugin for Elasticsearch 5.0.1
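Elastic/Elasticsearch 2016. 11. 24. 19:02
First, I'm sharing the built plugin zip file.
I'll upload the details of the work to GitHub later.
Between projects and operations work I have had far too much on my plate lately, so I barely managed to find time for this.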
elasticsearch-analysis-arirang-5.0.1.zip
Installation)
$ bin/elasticsearch-plugin install --verbose file:///elasticsearch-analysis-arirang/target/elasticsearch-analysis-arirang-5.0.1.zip
Installation log)
-> Downloading file:///elasticsearch-analysis-arirang-5.0.1.zip
Retrieving zip from file:///elasticsearch-analysis-arirang-5.0.1.zip
[=================================================] 100%
- Plugin information:
Name: analysis-arirang
Description: Arirang plugin
Version: 5.0.1
* Classname: org.elasticsearch.plugin.analysis.arirang.AnalysisArirangPlugin
-> Installed analysis-arirang
Elasticsearch startup log)
$ bin/elasticsearch
[2016-11-24T18:49:09,922][INFO ][o.e.n.Node ] [] initializing ...
[2016-11-24T18:49:10,083][INFO ][o.e.e.NodeEnvironment ] [aDGu2B9] using [1] data paths, mounts [[/ (/dev/disk1)]], net usable_space [733.1gb], net total_space [930.3gb], spins? [unknown], types [hfs]
[2016-11-24T18:49:10,084][INFO ][o.e.e.NodeEnvironment ] [aDGu2B9] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-11-24T18:49:10,085][INFO ][o.e.n.Node ] [aDGu2B9] node name [aDGu2B9] derived from node ID; set [node.name] to override
[2016-11-24T18:49:10,087][INFO ][o.e.n.Node ] [aDGu2B9] version[5.0.1], pid[56878], build[080bb47/2016-11-11T22:08:49.812Z], OS[Mac OS X/10.12.1/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_72/25.72-b15]
[2016-11-24T18:49:11,335][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [aggs-matrix-stats]
[2016-11-24T18:49:11,335][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [ingest-common]
[2016-11-24T18:49:11,335][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [lang-expression]
[2016-11-24T18:49:11,335][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [lang-groovy]
[2016-11-24T18:49:11,335][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [lang-mustache]
[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [lang-painless]
[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [percolator]
[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [reindex]
[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [transport-netty3]
[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded module [transport-netty4]
[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] loaded plugin [analysis-arirang]
[2016-11-24T18:49:14,151][INFO ][o.e.n.Node ] [aDGu2B9] initialized
[2016-11-24T18:49:14,151][INFO ][o.e.n.Node ] [aDGu2B9] starting ...
[2016-11-24T18:49:14,377][INFO ][o.e.t.TransportService ] [aDGu2B9] publish_address {127.0.0.1:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}
[2016-11-24T18:49:17,511][INFO ][o.e.c.s.ClusterService ] [aDGu2B9] new_master {aDGu2B9}{aDGu2B9mQ8KkWCe3fnqeMw}{_y9RzyKGSvqYAFcv99HBXg}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2016-11-24T18:49:17,584][INFO ][o.e.g.GatewayService ] [aDGu2B9] recovered [0] indices into cluster_state
[2016-11-24T18:49:17,588][INFO ][o.e.h.HttpServer ] [aDGu2B9] publish_address {127.0.0.1:9200}, bound_addresses {[fe80::1]:9200}, {[::1]:9200}, {127.0.0.1:9200}
[2016-11-24T18:49:17,588][INFO ][o.e.n.Node ] [aDGu2B9] started
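To confirm that the plugin was actually picked up, the startup log can be scanned for the PluginsService lines. The following is only a minimal sketch: the function name `loaded_components` and the regular expression are my own illustration, and it assumes the log format shown above (`loaded module [...]` / `loaded plugin [...]`).

```python
import re

# Matches the PluginsService lines in the startup log shown above.
# The pattern is an assumption based on that log format only.
LOADED_RE = re.compile(r"loaded (module|plugin) \[([^\]]+)\]")


def loaded_components(log_text):
    """Collect module and plugin names reported as loaded in a startup log."""
    found = {"module": [], "plugin": []}
    for line in log_text.splitlines():
        match = LOADED_RE.search(line)
        if match:
            found[match.group(1)].append(match.group(2))
    return found


# Two lines copied from the startup log above.
SAMPLE_LOG = (
    "[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] "
    "loaded module [transport-netty4]\n"
    "[2016-11-24T18:49:11,336][INFO ][o.e.p.PluginsService ] [aDGu2B9] "
    "loaded plugin [analysis-arirang]\n"
)

components = loaded_components(SAMPLE_LOG)
print("analysis-arirang" in components["plugin"])
```

If `analysis-arirang` is missing from the `plugin` list, the install step probably did not complete for the node that was started.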
Running Korean morphological analysis)
$ curl -X POST "http://localhost:9200/_analyze" -d '{
  "analyzer" : "arirang_analyzer",
  "text" : "[한국] 엘라스틱서치 사용자 그룹의 HENRY 입니다."
}'
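The same request can be sent from a script instead of curl. This is a rough sketch using only the Python standard library; the helper names `build_analyze_request` and `analyze` are hypothetical, the `Content-Type` header is my addition (the curl call above does not send it), and the host and port are taken from the curl command.

```python
import json
import urllib.request

# Host and port taken from the curl command above.
ANALYZE_URL = "http://localhost:9200/_analyze"


def build_analyze_request(text, analyzer="arirang_analyzer", url=ANALYZE_URL):
    """Build (but do not send) a POST request for the _analyze API."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: not present in the original curl call.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text, analyzer="arirang_analyzer", url=ANALYZE_URL):
    """Send the request and return the list of token dicts from the response."""
    request = build_analyze_request(text, analyzer, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["tokens"]


# Requires a running node with the plugin installed:
# tokens = analyze("[한국] 엘라스틱서치 사용자 그룹의 HENRY 입니다.")
```

Building the request separately from sending it makes it easy to inspect the body before it reaches the node.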
Morphological analysis result)
{
  "tokens": [
    {
      "token": "[",
      "start_offset": 0,
      "end_offset": 1,
      "type": "symbol",
      "position": 0
    },
    {
      "token": "한국",
      "start_offset": 1,
      "end_offset": 3,
      "type": "korean",
      "position": 1
    },
    {
      "token": "]",
      "start_offset": 3,
      "end_offset": 4,
      "type": "symbol",
      "position": 2
    },
    {
      "token": "엘라스틱서치",
      "start_offset": 5,
      "end_offset": 11,
      "type": "korean",
      "position": 3
    },
    {
      "token": "엘라",
      "start_offset": 5,
      "end_offset": 7,
      "type": "korean",
      "position": 3
    },
    {
      "token": "스틱",
      "start_offset": 7,
      "end_offset": 9,
      "type": "korean",
      "position": 4
    },
    {
      "token": "서치",
      "start_offset": 9,
      "end_offset": 11,
      "type": "korean",
      "position": 5
    },
    {
      "token": "사용자",
      "start_offset": 12,
      "end_offset": 15,
      "type": "korean",
      "position": 6
    },
    {
      "token": "그룹",
      "start_offset": 16,
      "end_offset": 18,
      "type": "korean",
      "position": 7
    },
    {
      "token": "henry",
      "start_offset": 20,
      "end_offset": 25,
      "type": "word",
      "position": 8
    },
    {
      "token": "입니다",
      "start_offset": 26,
      "end_offset": 29,
      "type": "korean",
      "position": 9
    }
  ]
}
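A quick way to read this result is to slice the input text by each token's offsets and to group tokens by position. The sketch below hard-codes the tokens from the response above; the function names are my own. It assumes the offsets can be used directly as Python string indices, which holds for this text because every character is in the Basic Multilingual Plane.

```python
TEXT = "[한국] 엘라스틱서치 사용자 그룹의 HENRY 입니다."

# (token, start_offset, end_offset, position), copied from the response above.
TOKENS = [
    ("[", 0, 1, 0),
    ("한국", 1, 3, 1),
    ("]", 3, 4, 2),
    ("엘라스틱서치", 5, 11, 3),
    ("엘라", 5, 7, 3),
    ("스틱", 7, 9, 4),
    ("서치", 9, 11, 5),
    ("사용자", 12, 15, 6),
    ("그룹", 16, 18, 7),
    ("henry", 20, 25, 8),
    ("입니다", 26, 29, 9),
]


def offset_mismatches(text, tokens):
    """Return (token, source_slice) pairs where the offsets do not match.

    The comparison lowercases the slice because the analyzer emitted
    "henry" for the input "HENRY".
    """
    bad = []
    for token, start, end, _position in tokens:
        source = text[start:end]
        if source.lower() != token:
            bad.append((token, source))
    return bad


def group_by_position(tokens):
    """Map each position to the tokens emitted at it, in response order."""
    groups = {}
    for token, _start, _end, position in tokens:
        groups.setdefault(position, []).append(token)
    return groups


print(offset_mismatches(TEXT, TOKENS))
print(group_by_position(TOKENS)[3])
```

Position 3 holds both the compound 엘라스틱서치 and its first part 엘라, while 스틱 and 서치 follow at positions 4 and 5. The particle 의 (offset 18) and the final period do not appear in the token list.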