13 posts tagged 'Stack'

  1. 2020.06.19 [Elasticsearch] Trying out X-Pack Security API Keys
  2. 2020.02.28 [Logstash] Loading a CSV file
  3. 2019.11.07 [Logstash] Tuning settings
  4. 2019.11.06 [Logstash] A quick look at the logstash date filter
  5. 2019.11.05 [Elastic] Table of contents
  6. 2019.11.04 [Logstash] JSON filter plugin
  7. 2019.10.23 [Elasticsearch] Let's learn about caches
  8. 2019.10.17 [Elasticsearch] Building an in-app user behavior log collection pipeline
  9. 2019.10.17 [Elasticsearch] minimum_master_nodes is not working.
  10. 2019.10.02 [Elasticsearch] Configuring Elasticsearch Cluster Auto Scaling

[Elasticsearch] Trying out X-Pack Security API Keys

Elastic/Elasticsearch 2020. 6. 19. 11:07

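One nice thing about the Elastic Stack is that you can use everything included in the Basic license.

Of course, there are plenty of other reasons too.

https://www.elastic.co/subscriptions

The Basic license goes exactly as far as API keys management.

Before using it, you need to configure Elasticsearch and Kibana to use X-Pack.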

 

[Elasticsearch]

- elasticsearch.yml

xpack.monitoring.enabled: true
xpack.ml.enabled: true
xpack.security.enabled: true

xpack.security.authc.api_key.enabled: true
xpack.security.authc.api_key.hashing.algorithm: "pbkdf2"
xpack.security.authc.api_key.cache.ttl: "1d"
xpack.security.authc.api_key.cache.max_keys: 10000
xpack.security.authc.api_key.cache.hash_algo: "ssha256"

The values above are the defaults, so tune them to fit your environment.

https://www.elastic.co/guide/en/elasticsearch/reference/7.8/security-settings.html#api-key-service-settings

 

[Kibana]

- kibana.yml

xpack:
  security:
    enabled: true
    encryptionKey: "9c42bff2e04f9b937966bda03e6b5828"
    session:
      idleTimeout: 600000
    audit:
      enabled: true

 

After this configuration, set up the id/password.

 

# bin/elasticsearch-setup-passwords interactive
Initiating the setup of passwords for reserved users elastic,apm_system,kibana,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y

Enter password for [elastic]:
Reenter password for [elastic]:
Enter password for [apm_system]:
Reenter password for [apm_system]:
Enter password for [kibana]:
Reenter password for [kibana]:
Enter password for [logstash_system]:
Reenter password for [logstash_system]:
Enter password for [beats_system]:
Reenter password for [beats_system]:
Enter password for [remote_monitoring_user]:
Reenter password for [remote_monitoring_user]:
Changed password for user [apm_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]

Changed password for user [remote_monitoring_user]
Changed password for user [elastic]

 

Once this is done, connect to Kibana and create an API key.

The documents below are helpful when creating one.

www.elastic.co/guide/en/elasticsearch/reference/current/security-privileges.html

www.elastic.co/guide/en/elasticsearch/reference/7.7/security-api-put-role.html

www.elastic.co/guide/en/elasticsearch/reference/7.7/defining-roles.html

www.elastic.co/guide/en/elasticsearch/reference/7.7/security-api-create-api-key.html

You can create one from Kibana Console as follows.

POST /_security/api_key
{
  "name": "team-index-command",
  "expiration": "10m", 
  "role_descriptors": { 
    "role-team-index-command": {
      "cluster": ["all"],
      "index": [
        {
          "names": ["*"],
          "privileges": ["all"]
        }
      ]
    }
  }
}

{
  "id" : "87cuynIBjKAXtnkobGgo",
  "name" : "team-index-command",
  "expiration" : 1592529999478,
  "api_key" : "OlVGT_Q8RGq1C_ASHW7pGg"
}
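The request above grants `all` on every index, which is handy for a quick test but broad for real use. As a rough sketch, the helper below builds the same kind of request body with narrower privileges. The function name `build_api_key_request`, the index pattern `team-*`, and the chosen privileges (`monitor`, `read`, `write`) are illustrative assumptions; check the security privileges documentation for what your use case actually needs.

```python
import json


def build_api_key_request(name, expiration, index_patterns, index_privileges,
                          cluster_privileges=("monitor",)):
    """Build a JSON-serializable body for POST /_security/api_key.

    The role name is derived from the key name, mirroring the
    "role-<name>" naming used in the example above.
    """
    return {
        "name": name,
        "expiration": expiration,
        "role_descriptors": {
            f"role-{name}": {
                "cluster": list(cluster_privileges),
                "index": [
                    {
                        "names": list(index_patterns),
                        "privileges": list(index_privileges),
                    }
                ],
            }
        },
    }


if __name__ == "__main__":
    body = build_api_key_request(
        name="team-index-command",
        expiration="10m",
        index_patterns=["team-*"],           # hypothetical index pattern
        index_privileges=["read", "write"],  # assumed privileges for illustration
    )
    # Paste the printed JSON under "POST /_security/api_key" in Kibana Console.
    print(json.dumps(body, indent=2))
```

Keeping the body in one small function makes it easy to issue one key per team or job, each with only the privileges it needs.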

To use the key after creating it:

- The ApiKey credential is the base64 encoding of id:api_key.

base64_encode("87cuynIBjKAXtnkobGgo"+":"+"OlVGT_Q8RGq1C_ASHW7pGg")
==> ODdjdXluSUJqS0FYdG5rb2JHZ286T2xWR1RfUThSR3ExQ19BU0hXN3BHZw==

curl -H "Authorization: ApiKey ODdjdXluSUJqS0FYdG5rb2JHZ286T2xWR1RfUThSR3ExQ19BU0hXN3BHZw==" \
  http://localhost:9200/_cluster/health
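If you would rather build the header in code than by hand, the encoding step can be sketched like this. `api_key_header` is a hypothetical helper name; the only thing it relies on is the rule stated above (base64 of `id:api_key`, sent as `Authorization: ApiKey ...`).

```python
import base64


def api_key_header(key_id: str, api_key: str) -> dict:
    """Return the Authorization header for an Elasticsearch API key.

    The credential is base64("<id>:<api_key>"), encoded from UTF-8 bytes.
    """
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}


if __name__ == "__main__":
    # id and api_key taken from the create-API-key response shown above
    headers = api_key_header("87cuynIBjKAXtnkobGgo", "OlVGT_Q8RGq1C_ASHW7pGg")
    print(headers["Authorization"])
```

The returned dict can be passed as the headers of whatever HTTP client you use.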

Now create and use API keys that fit your use case and purpose.

 



[Logstash] Loading a CSV file

Elastic/Logstash 2020. 2. 28. 18:35

Previously I only linked to the documentation, but some of you may want sample code, so I'm writing it down here.

 

[config/logstash-csv.conf]

input {
  file {
      path => ["/Users/henryjeong/Works/poc/elastic/data/*.csv"]
      start_position => "beginning"
  }
}

filter {
    csv {
        separator => ","
        columns => ["title", "cat1", "cat2", "cat3", "area", "sigungu", "description"]
    }

    mutate { rename => ["title", "tt"] }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "csv-%{+YYYY.MM.dd}"
  }
}

[*.csv]

title,cat1,cat2,cat3,area,sigungu,description
엄마손충무김밥,음식,음식점,한식,경상남도,통영시,"KBS ""1박2일"" 욕지도 복불복에 추천된 충무김밥집이다."
평창동의 봄,음식,음식점,한식,서울,종로구,"평창동의 명소, 평창동 언덕마을에 유럽풍의 아름다운 음식점 & 카페. 1층에는 커피는 물론 계절빙수, 단팥죽, 호박죽, 허브티를 파는 카페, 2층에는 식당이 있다. 2층 식당에서는 평창동 봄만의 특별한 한정식 코스 요리와 웰빙 삼계탕 등의 음식을  선보이고 있다. 3층은 최대 40명수용의 연회룸이 있고, 갤러리 전시와 건강교실도 운영하고 있다. "

음봉가든,음식,음식점,한식,경기도,가평군,"음봉가든 (구, 진짜네집)은 자연산 민물고기만을 고집하는 50년 전통의 진정한 매운탕전문점이다. 주재료로 쓰이고 있는 메기 및 쏘가리, 빠가사리
등은 주인이 직접 낚아 올린 물고기로 매우 신선하며 매운탕 역시 얼큰하고 개운하다. 또한 집앞에서 직접 재배하고 키우는 야채와 채소는 매운탕의 국물맛을 더욱 맛나게 해준다."

대자골토속음식,음식,음식점,한식,경기도,고양시,경기 북부지방에선 국수나 수제비를 넣어 국물을 넉넉하게 만든 음식을 '털레기'라 하는데 이곳은 '미꾸라지털레기'를 주메뉴로 하는 30여년
내력의 경기 북부지방 대표향토음식점이다. 주인이 직접 야채와 채소를 재배하는데 고춧가루까지도 직접 재배한 것을 사용한다고하니 그 사명감이 대단하다 할 수 있겠다. 통미꾸라지매운탕의 옛맛을 찾는 사람이라면 꼭 한번은 들려봐야 전통 맛집이다.

장수촌,음식,음식점,한식,경기도,광명시,"부드러운 닭고기 육질에 구수한 누룽지가 함께 하는 '누룽지삼계탕'. 거기다 잘 익은 김치 한 점 더 한다면 그 어떤 맛도 부러울 것이 없다. 식사메뉴인 '누룽지삼계탕'과 '쟁반막국수', 안주메뉴인 골뱅이무침 메뉴가 전부인 '장수촌'은 경기도 광명의 대표맛집으로 토종닭 선별부터 양념 재료 하나 하나까지 일일이 주인이 직접 선별하는 그 정성과 끈기가 맛의 비결이라 할 수 있겠다."
청기와뼈다귀해장국,음식,음식점,한식,경기도,부천시,"부천 사람이라면 모르는 이 없을 정도로 유명한 집이다. 돼지고기와 국물에서 냄새가 안나 여성이나 어린이들 특히 어르신들 보양식으로
도 입소문이 난 곳인데, 부재료 보다는 뼈다귀로 양을 채우는 그 푸짐함 또한 그 소문이 자자하다. 양질의 돼지 뼈에 사골 국물과 우거지를 넣어 맛을 낸 뼈다귀 해장국. 그 맛을 제대로 만날 수 있는 대표적인 음식점이다. "

일번지,음식,음식점,한식,경기도,성남시,닭요리의 으뜸이라 해도 과언이 아닌 '남한산성 닭죽촌민속마을' 에서 남한산성 등산후 가장 많이 찾는 집 중에 한 집이다. 특히 이집은 닭백숙 이외에 '닭도가니'로도 유명한데 이집의 '도가니'는 소의 도가니를 뜻 하는 것이 아니라 장독대의 '독'에서 유래된 말로 '독'에 밥과 닭과 여러보약재를 넣어 만드는 것으로 그 맛과 영양면에서 닭요리 중에 최고라 할 수 있겠다.

장금이,음식,음식점,한식,경기도,시흥시,"경기도 시흥의 지역특산물 중의 하나가 바로 '연'이다. 연은 수련과에 속하는 다년생 수생식물로 뿌리채소로는 드물게 다량의 비타민과 무기질을 함유하고 있어 최근 건강식 식품원료로 각광받고 있으나 그 효능에 비해 다양한 조리방법이 개발되어 있지 않아 흔히 '연'하면 '연근' 반찬 이외엔 생각나는 것이 없는데, 물왕동 연요리전문점 '장금이'를 찾으면 그렇지 않음을 직접 확인 할 수 있다. 흔치 않은 색다른 한정식을 원한다면 한 번쯤은 꼭 한 번 들러 연밥정식과 연잎수육을 맛 봐야 할 곳이다. "

안성마춤갤러리,음식,음식점,한식,경기도,안성시,경기도 안성의 농산물 브랜드 '안성마춤'을 내세워 만든 고품격 갤러리풍 식당으로 각종 공연과 작품전시회 감상과 동시에 농협에서 직접 운영하는 특등급 안성한우를 맞볼 수 있는 곳으로 유명한 집이다. 특히 안성마춤한우 중에 10%만 생산된다는 슈프림급 한우는 늦어도 하루 전에는 꼭 예약을 해야 그 맛을 볼 수 있다 하여 그 희소성에 더더욱 인기가 높다.

언덕너머매운탕,음식,음식점,한식,경기도,연천군,민물고기 중에 살이 탱탱하고 쫄깃한 맛으로 매운탕 재료 중에 으뜸이라 불리우는 '쏘가리'를 메인메뉴로 자랑하는 이집의 '쏘가리매운탕'은
임진강에서 직접 잡아올린 자연 그대로의 그 담백하고 칼칼한 맛이 일품이라 할 수 있다.

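I added line breaks to make the data easier to read.

In the actual file, each record is on a single line with no line breaks.

The data above is a subset I extracted from a public data set.

This data doesn't need any datatype conversion, but there may be cases where you do need it.

Official documentation)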

https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html#plugins-filters-csv-convert

 

convert

  • Value type is hash
  • Default value is {}

Define a set of datatype conversions to be applied to columns. Possible conversions are integer, float, date, date_time, boolean

Example:

    filter {
      csv {
        convert => {
          "column1" => "integer"
          "column2" => "boolean"
        }
      }
    }

keyword and text are not supported; the supported datatypes are integer, float, date, date_time, and boolean.
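To make the effect of `convert` concrete, here is a minimal Python sketch of the idea: each listed column is cast from its CSV string to the target type. It is not Logstash's implementation. The `CONVERTERS` table, the `convert_columns` helper, and the rule that only the string "true" counts as true are simplifying assumptions, and `date`/`date_time` are left out.

```python
# Simplified stand-ins for three of the csv filter's conversion targets.
CONVERTERS = {
    "integer": int,
    "float": float,
    "boolean": lambda s: s.strip().lower() == "true",  # assumption: only "true" is truthy
}


def convert_columns(row: dict, convert: dict) -> dict:
    """Return a copy of row with the columns named in convert cast to new types."""
    out = dict(row)
    for column, target in convert.items():
        if column in out:
            out[column] = CONVERTERS[target](out[column])
    return out


if __name__ == "__main__":
    row = {"column1": "42", "column2": "true", "column3": "kept as text"}
    print(convert_columns(row, {"column1": "integer", "column2": "boolean"}))
```

Columns that are not listed stay as strings, which matches the fact that keyword and text are not conversion targets.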

 

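Things to watch out for when loading a CSV file)

- If you specify csv as the codec in the input, the column names are not assigned the way you want.

- You need to handle it in the filter for the column names to come through correctly as field names.

- If you configure neither the input csv codec nor the filter, each row simply goes into the message field as-is.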

 

mutate { rename => ["title", "tt"] }

- You can probably tell what this does at a glance.

- It renames the column from title to tt, so the field is created as tt.
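For readers who want to see what the filter stage produces, the following Python sketch imitates the two steps on a single line: split the row into the configured columns, then rename `title` to `tt`. It uses Python's `csv` module rather than Logstash, and `parse_line` is a hypothetical helper, so treat it as an illustration of the resulting fields only.

```python
import csv
import io

# Same column list as in the csv filter above.
COLUMNS = ["title", "cat1", "cat2", "cat3", "area", "sigungu", "description"]


def parse_line(line: str, columns=COLUMNS, rename=None) -> dict:
    """Parse one CSV line into a field dict, then apply renames."""
    values = next(csv.reader(io.StringIO(line)))
    event = dict(zip(columns, values))
    for old, new in (rename or {}).items():
        if old in event:
            event[new] = event.pop(old)
    return event


if __name__ == "__main__":
    line = ('엄마손충무김밥,음식,음식점,한식,경상남도,통영시,'
            '"KBS ""1박2일"" 욕지도 복불복에 추천된 충무김밥집이다."')
    event = parse_line(line, rename={"title": "tt"})
    print(event["tt"])
    print(event["description"])
```

Note how the doubled quotes inside the quoted description collapse to single quotes, and how commas inside quotes do not split the field.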

 



[Logstash] Tuning settings

Elastic/Logstash 2019. 11. 7. 15:00

This is well covered in the official documentation.

https://www.elastic.co/guide/en/logstash/current/tuning-logstash.html

https://www.elastic.co/guide/en/logstash/current/performance-tuning.html

 

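Basically, getting just the two settings below right is enough to get good performance.

pipeline.workers)

Start with this set to the number of CPU cores.

pipeline.batch.size)

The maximum number of events a single worker thread processes at one time.
You have to find the optimal size yourself.

In the end, think of it as choosing the size that works best for the Bulk Requests sent to Elasticsearch.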
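One way to reason about the two settings together: the Logstash tuning guide describes the number of in-flight events as the product of `pipeline.workers` and `pipeline.batch.size`. The sketch below only does that arithmetic. `inflight_events` is a hypothetical helper, 125 is the documented default batch size, and the other candidate sizes are arbitrary. It does not measure throughput, which you still have to benchmark yourself.

```python
import os


def inflight_events(workers: int, batch_size: int) -> int:
    """Upper bound on events held in memory by the pipeline workers at once."""
    return workers * batch_size


if __name__ == "__main__":
    workers = os.cpu_count() or 1  # starting point: one worker per core
    for batch_size in (125, 250, 500, 1000):  # arbitrary candidates to benchmark
        print(f"workers={workers} batch_size={batch_size} "
              f"in-flight events={inflight_events(workers, batch_size)}")
```

Larger batches mean more events in memory at once, so raise the JVM heap along with the batch size if you go far above the default.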

 

Other things worth checking:

- CPU
- MEM
- I/O (Disk, Network)
- JVM Heap



[Logstash] A quick look at the logstash date filter

Elastic/Logstash 2019. 11. 6. 14:34

I got a question.

How should the date filter be applied when several fields each have a different date format?

So I opened the source code and suggested trying the following.

date {
...
}

date {
...
}

In short, you declare a separate date {...} block for each field.

Official documentation)

https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html

 

Common Options)

https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html#plugins-filters-date-common-options

 

Date Filter Configuration Options)

Setting          Input type   Required
locale           string       No
match            array        No
tag_on_failure   array        No
target           string       No
timezone         string       No

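None of the options are required.

Still, the settings you really need to know are match and target.

- The first value of match is the field name, and the values after it are the formats.

(This is well explained in the official documentation.)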

 

An array with field name first, and format patterns following, [ field, formats... ]

If your time field has multiple possible formats, you can do this:

match => [ "logdate",
    "MMM dd yyyy HH:mm:ss",
    "MMM d yyyy HH:mm:ss",
    "ISO8601" ]

 

- If target is not specified, the parsed date is written to the @timestamp field by default. To change that, set target to the field name you want.

Example)

date {
    match => ["time" , "yyyy-MM-dd'T'HH:mm:ssZ", "yyyy-MM-dd'T'HH:mm:ss.SSSZ"]
    target => "@timestamp"
}

date {
    match => ["localtime" , "yyyy-MM-dd HH:mm:ssZ"]
    target => "time"
}
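The behavior of the two blocks above (one filter per field, formats tried in order, first match wins) can be imitated in a few lines of Python. This is only a sketch under assumptions: the `strptime` patterns are approximate equivalents of the Joda patterns, `apply_date` is a hypothetical helper, and the sample values are made up. The `_dateparsefailure` tag is the date filter's default `tag_on_failure` value.

```python
from datetime import datetime

# strptime approximations of the Joda patterns used in the example above.
TIME_FORMATS = ["%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%dT%H:%M:%S.%f%z"]
LOCALTIME_FORMATS = ["%Y-%m-%d %H:%M:%S%z"]


def apply_date(event: dict, source: str, formats: list, target: str) -> bool:
    """Parse event[source] with the first matching format and store it in target."""
    value = event.get(source)
    if value is None:
        return False  # field missing: nothing to do, no failure tag
    for fmt in formats:
        try:
            event[target] = datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue  # this format did not match, try the next one
    event.setdefault("tags", []).append("_dateparsefailure")
    return False


if __name__ == "__main__":
    # Made-up sample event with two differently formatted date fields.
    event = {"time": "2019-11-06T14:34:00.123+0900",
             "localtime": "2019-11-06 14:34:00+0900"}
    apply_date(event, "time", TIME_FORMATS, "@timestamp")   # first date { } block
    apply_date(event, "localtime", LOCALTIME_FORMATS, "time")  # second date { } block
    print(event["@timestamp"], event["time"])
```

Because each call names its own source field, format list, and target, adding a third differently formatted field is just one more call, which is the same reason a separate date { } block per field works in Logstash.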

DateFilter.java)

/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.logstash.filters;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.joda.time.Instant;
import org.logstash.Event;
import org.logstash.ext.JrubyEventExtLibrary.RubyEvent;
import org.logstash.filters.parser.CasualISO8601Parser;
import org.logstash.filters.parser.JodaParser;
import org.logstash.filters.parser.TimestampParser;
import org.logstash.filters.parser.TimestampParserFactory;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class DateFilter {
  private static Logger logger = LogManager.getLogger();
  private final String sourceField;
  private final String[] tagOnFailure;
  private RubyResultHandler successHandler;
  private RubyResultHandler failureHandler;
  private final List<ParserExecutor> executors = new ArrayList<>();
  private final ResultSetter setter;

  public interface RubyResultHandler {
    void handle(RubyEvent event);
  }

  public DateFilter(String sourceField, String targetField, List<String> tagOnFailure, RubyResultHandler successHandler, RubyResultHandler failureHandler) {
    this(sourceField, targetField, tagOnFailure);
    this.successHandler = successHandler;
    this.failureHandler = failureHandler;
  }

  public DateFilter(String sourceField, String targetField, List<String> tagOnFailure) {
    this.sourceField = sourceField;
    this.tagOnFailure = tagOnFailure.toArray(new String[0]);
    if (targetField.equals("@timestamp")) {
      this.setter = new TimestampSetter();
    } else {
      this.setter = new FieldSetter(targetField);
    }
  }

  public void acceptFilterConfig(String format, String locale, String timezone) {
    TimestampParser parser = TimestampParserFactory.makeParser(format, locale, timezone);
    logger.debug("Date filter with format={}, locale={}, timezone={} built as {}", format, locale, timezone, parser.getClass().getName());
    if (parser instanceof JodaParser || parser instanceof CasualISO8601Parser) {
      executors.add(new TextParserExecutor(parser, timezone));
    } else {
      executors.add(new NumericParserExecutor(parser));
    }
  }

  public List<RubyEvent> receive(List<RubyEvent> rubyEvents) {
    for (RubyEvent rubyEvent : rubyEvents) {
      Event event = rubyEvent.getEvent();

      switch (executeParsers(event)) {
        case FIELD_VALUE_IS_NULL_OR_FIELD_NOT_PRESENT:
        case IGNORED:
          continue;
        case SUCCESS:
          if (successHandler != null) {
            successHandler.handle(rubyEvent);
          }
          break;
        case FAIL: // fall through
        default:
          for (String t : tagOnFailure) {
            event.tag(t);
          }
          if (failureHandler != null) {
            failureHandler.handle(rubyEvent);
          }
      }
    }
    return rubyEvents;
  }

  public ParseExecutionResult executeParsers(Event event) {
    Object input = event.getField(sourceField);
    if (event.isCancelled()) { return ParseExecutionResult.IGNORED; }
    if (input == null) { return ParseExecutionResult.FIELD_VALUE_IS_NULL_OR_FIELD_NOT_PRESENT; }

    for (ParserExecutor executor : executors) {
      try {
        Instant instant = executor.execute(input, event);
        setter.set(event, instant);
        return ParseExecutionResult.SUCCESS;
      } catch (IllegalArgumentException | IOException e) {
        // do nothing, try next ParserExecutor
      }
    }
    return ParseExecutionResult.FAIL;
  }
}

 



[Elastic] Table of contents

Elastic 2019. 11. 5. 14:52

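This is the table of contents of the Elastic Stack Reference.

Why put it all on one page? Just browsing the table of contents makes it easy to see where a solution is documented and which features are available.

(In my case!!)

It seemed a waste to keep it to myself, so I'm posting it here.

Elastic Stack References)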

1. Elasticsearch

2. Logstash

3. Kibana

4. Beats Platform

5. Beats Developer Guide

6. Filebeat

  1. Elasticsearch

    1. Elasticsearch introduction

      1. Data in: documents and indices

      2. Information out: search and analyze

      3. Scalability and resilience

    2. Getting Started with Elasticsearch

      1. Get Elasticsearch up and running

      2. Index some documents

      3. Start searching

      4. Analyze results with aggregations

      5. Where to go from here

    3. Set up Elasticsearch

      1. Installing Elasticsearch

        1. Install Elasticsearch from archive on Linux or MacOS

        2. Install Elasticsearch with .zip on Windows

        3. Install Elasticsearch with Debian Package

        4. Install Elasticsearch with RPM

        5. Install Elasticsearch Windows MSI Installer

        6. Install Elasticsearch with Docker

        7. Install Elasticsearch on macOS with Homebrew

      2. Configuring Elasticsearch

        1. Setting JVM options

        2. Secure settings

        3. Logging configuration

        4. Auditing settings

        5. Cross-cluster replication settings

        6. Transforms settings

        7. Index lifecycle management settings

        8. License settings

        9. Machine learning settings

        10. Security settings

        11. SQL access settings

        12. Watcher settings

      3. Important Elasticsearch configuration

        1. path.data and path.logs

        2. cluster.name

        3. node.name

        4. network.host

        5. Discovery and cluster formation settings

        6. Setting the heap size

        7. JVM heap dump path

        8. GC logging

        9. Temp directory

        10. JVM fatal error logs

      4. Important System Configuration

        1. Configuring system settings

        2. Disable swapping

        3. File Descriptors

        4. Virtual memory

        5. Number of threads

        6. DNS cache settings

        7. JNA temporary directory not mounted with noexec

      5. Bootstrap Checks

        1. Heap size check

        2. File descriptor check

        3. Memory lock check

        4. Maximum number of threads check

        5. Max file size check

        6. Maximum size virtual memory check

        7. Maximum map count check

        8. Client JVM check

        9. Use serial collector check

        10. System call filter check

        11. OnError and OnOutOfMemoryError checks

        12. Early-access check

        13. G1GC check

        14. All permission check

        15. Discovery configuration check

      6. Starting Elasticsearch

      7. Stopping Elasticsearch

      8. Adding nodes to your cluster

      9. Set up X-Pack

      10. Configuring X-Pack Java Clients

      11. Bootstrap Checks for X-Pack

    4. Upgrade Elasticsearch

      1. Rolling upgrades

      2. Full cluster restart upgrade

      3. Reindex before upgrading

        1. Reindex in place

        2. Reindex from a remote cluster

    5. Aggregations

      1. Metrics Aggregations

        1. Avg Aggregation

        2. Weighted Avg Aggregation

        3. Cardinality Aggregation

        4. Extended Stats Aggregation

        5. Geo Bounds Aggregation

        6. Geo Centroid Aggregation

        7. Max Aggregation

        8. Min Aggregation

        9. Percentiles Aggregation

        10. Percentile Ranks Aggregation

        11. Scripted Metric Aggregation

        12. Stats Aggregation

        13. Sum Aggregation

        14. Top Hits Aggregation

        15. Value Count Aggregation

        16. Median Absolute Deviation Aggregation

      2. Bucket Aggregations

        1. Adjacency Matrix Aggregation

        2. Auto-interval Date Histogram Aggregation

        3. Children Aggregation

        4. Composite Aggregation

        5. Date Histogram Aggregation

        6. Date Range Aggregation

        7. Diversified Sampler Aggregation

        8. Filter Aggregation

        9. Filters Aggregation

        10. Geo Distance Aggregation

        11. GeoHash grid Aggregation

        12. GeoTile Grid Aggregation

        13. Global Aggregation

        14. Histogram Aggregation

        15. IP Range Aggregation

        16. Missing Aggregation

        17. Parent Aggregation

        18. Range Aggregation

        19. Rare Terms Aggregation

        20. Reverse nested Aggregation

        21. Sampler Aggregation

        22. Significant Terms Aggregation

        23. Significant Text Aggregation

        24. Terms Aggregation

        25. Subtleties of bucketing range fields

      3. Pipeline Aggregations

        1. Avg Bucket Aggregation

        2. Derivative Aggregation

        3. Max Bucket Aggregation

        4. Min Bucket Aggregation

        5. Sum Bucket Aggregation

        6. Stats Bucket Aggregation

        7. Extended Stats Bucket Aggregation

        8. Percentiles Bucket Aggregation

        9. Moving Average Aggregation

        10. Moving Function Aggregation

        11. Cumulative Sum Aggregation

        12. Cumulative Cardinality Aggregation

        13. Bucket Script Aggregation

        14. Bucket Selector Aggregation

        15. Bucket Sort Aggregation

        16. Serial Differencing Aggregation

      4. Matrix Aggregations

        1. Matrix Stats

      5. Caching heavy aggregations

      6. Returning only aggregations

      7. Aggregation Metadata

      8. Returning the type of the aggregation

    6. Query DSL

      1. Query and filter context

      2. Compound queries

        1. Boolean

        2. Boosting

        3. Constant score

        4. Disjunction max

        5. Function score

      3. Full text queries

        1. Intervals

        2. Match

        3. Match boolean prefix

        4. Match phrase

        5. Match phrase prefix

        6. Multi-match

        7. Common Terms Query

        8. Query String

        9. Simple query string

      4. Geo queries

        1. Geo-bounding box

        2. Geo-distance

        3. Geo-polygon

        4. Geo-shape

      5. Shape queries

        1. Shape

      6. Joining queries

        1. Nested

        2. Has child

        3. Has parent

        4. Parent ID

      7. Match all

      8. Span queries

        1. Span containing

        2. Span field masking

        3. Span first

        4. Span multi-term

        5. Span near

        6. Span not

        7. Span or

        8. Span term

        9. Span within

      9. Specialized queries

        1. Distance feature

        2. More like this

        3. Percolate

        4. Rank feature

        5. Script

        6. Script score

        7. Wrapper

        8. Pinned Query

      10. Term-level queries

        1. Exists

        2. Fuzzy

        3. IDs

        4. Prefix

        5. Range

        6. Regexp

        7. Term

        8. Terms

        9. Terms set

        10. Type Query

        11. Wildcard

      11. minimum_should_match parameter

      12. rewrite parameter

      13. Regular expression syntax

    7. Search across clusters

    8. Scripting

      1. How to use scripts

      2. Accessing document fields and special variables

      3. Scripting and security

      4. Painless scripting language

      5. Lucene expressions language

      6. Advanced scripts using script engines

    9. Mapping

      1. Removal of mapping types

      2. Field datatypes

        1. Alias

        2. Arrays

        3. Binary

        4. Boolean

        5. Date

        6. Date nanoseconds

        7. Dense vector

        8. Flattened

        9. Geo-point

        10. Geo-shape

        11. IP

        12. Join

        13. Keyword

        14. Nested

        15. Numeric

        16. Object

        17. Percolator

        18. Range

        19. Rank feature

        20. Rank features

        21. Search-as-you-type

        22. Sparse vector

        23. Text

        24. Token count

        25. Shape

      3. Meta-Fields

        1. _field_names field

        2. _ignored field

        3. _id field

        4. _index field

        5. _meta field

        6. _routing field

        7. _source field

        8. _type field

      4. Mapping parameters

        1. analyzer

        2. normalizer

        3. boost

        4. coerce

        5. copy_to

        6. doc_values

        7. dynamic

        8. enabled

        9. eager_global_ordinals

        10. fielddata

        11. format

        12. ignore_above

        13. ignore_malformed

        14. index

        15. index_options

        16. index_phrases

        17. index_prefixes

        18. fields

        19. norms

        20. null_value

        21. position_increment_gap

        22. properties

        23. search_analyzer

        24. similarity

        25. store

        26. term_vector

      5. Dynamic Mapping

        1. Dynamic field mapping

        2. Dynamic templates

    10. Analysis

      1. Anatomy of an analyzer

      2. Testing analyzers

      3. Analyzers

        1. Configuring built-in analyzers

        2. Fingerprint Analyzer

        3. Keyword Analyzer

        4. Language Analyzers

        5. Pattern Analyzer

        6. Simple Analyzer

        7. Standard Analyzer

        8. Stop Analyzer

        9. Whitespace Analyzer

        10. Custom Analyzer

      4. Normalizers

      5. Tokenizers

        1. Char Group Tokenizer

        2. Classic Tokenizer

        3. Edge NGram Tokenizer

        4. Keyword Tokenizer

        5. Letter Tokenizer

        6. Lowercase Tokenizer

        7. NGram Tokenizer

        8. Path Hierarchy Tokenizer

        9. Pattern Tokenizer

        10. Simple Pattern Tokenizer

        11. Simple Pattern Split Tokenizer

        12. Standard Tokenizer

        13. Thai Tokenizer

        14. UAX URL Email Tokenizer

        15. Whitespace Tokenizer

      6. Token Filters

        1. Apostrophe

        2. ASCII Folding Token Filter

        3. CJK bigram

        4. CJK width

        5. Classic Token Filter

        6. Common Grams Token Filter

        7. Compound Word Token Filters

        8. Conditional Token Filter

        9. Decimal Digit Token Filter

        10. Delimited Payload Token Filter

        11. Edge NGram Token Filter

        12. Elision Token Filter

        13. Fingerprint Token Filter

        14. Flatten Graph Token Filter

        15. Hunspell Token Filter

        16. Keep Types Token Filter

        17. Keep Words Token Filter

        18. Keyword Repeat Token Filter

        19. KStem Token Filter

        20. Length Token Filter

        21. Limit Token Count Token Filter

        22. Lowercase Token Filter

        23. MinHash Token Filter

        24. Multiplexer Token Filter

        25. NGram Token Filter

        26. Normalization Token Filter

        27. Pattern Capture Token Filter

        28. Pattern Replace Token Filter

        29. Phonetic Token Filter

        30. Porter Stem Token Filter

        31. Predicate Token Filter Script

        32. Remove Duplicates Token Filter

        33. Reverse Token Filter

        34. Shingle Token Filter

        35. Snowball Token Filter

        36. Stemmer Token Filter

        37. Stemmer Override Token Filter

        38. Stop Token Filter

        39. Synonym Token Filter

        40. Synonym Graph Token Filter

        41. Trim Token Filter

        42. Truncate Token Filter

        43. Unique Token Filter

        44. Uppercase Token Filter

        45. Word Delimiter Token Filter

        46. Word Delimiter Graph Token Filter

      7. Character Filters

        1. HTML Strip Char Filter

        2. Mapping Char Filter

        3. Pattern Replace Char Filter

    11. Modules

      1. Discovery and cluster formation

        1. Discovery

        2. Quorum-based decision making

        3. Voting configurations

        4. Bootstrapping a cluster

        5. Adding and removing nodes

        6. Publishing the cluster state

        7. Cluster fault detection

        8. Discovery and cluster formation settings

      2. Shard allocation and cluster-level routing

        1. Cluster level shard allocation

        2. Disk-based shard allocation

        3. Shard allocation awareness

        4. Cluster-level shard allocation filtering

        5. Miscellaneous cluster settings

      3. Local Gateway

        1. Dangling indices

      4. HTTP

      5. Indices

        1. Circuit Breaker

        2. Fielddata

        3. Node Query Cache

        4. Indexing Buffer

        5. Shard request cache

        6. Index recovery

        7. Search Settings

      6. Network Settings

      7. Node

      8. Plugins

      9. Snapshot And Restore

      10. Thread Pool

      11. Transport

      12. Remote clusters

    12. Index modules

      1. Analysis

      2. Index Shard Allocation

        1. Index-level shard allocation filtering

        2. Delaying allocation when a node leaves

        3. Index recovery prioritization

        4. Total shards per node

      3. Mapper

      4. Merge

      5. Similarity module

      6. Slow Log

      7. Store

        1. Preloading data into the file system cache

      8. Translog

      9. History retention

      10. Index Sorting

        1. Use index sorting to speed up conjunctions

    13. Ingest node

      1. Pipeline Definition

      2. Accessing Data in Pipelines

      3. Conditional Execution in Pipelines

        1. Handling Nested Fields in Conditionals

        2. Complex Conditionals

        3. Conditionals with the Pipeline Processor

        4. Conditionals with the Regular Expressions

      4. Handling Failures in Pipelines

      5. Processors

        1. Append Processor

        2. Bytes Processor

        3. Circle Processor

        4. Convert Processor

        5. Date Processor

        6. Date Index Name Processor

        7. Dissect Processor

        8. Dot Expander Processor

        9. Drop Processor

        10. Fail Processor

        11. Foreach Processor

        12. GeoIP Processor

        13. Grok Processor

        14. Gsub Processor

        15. HTML Strip Processor

        16. Join Processor

        17. JSON Processor

        18. KV Processor

        19. Lowercase Processor

        20. Pipeline Processor

        21. Remove Processor

        22. Rename Processor

        23. Script Processor

        24. Set Processor

        25. Set Security User Processor

        26. Split Processor

        27. Sort Processor

        28. Trim Processor

        29. Uppercase Processor

        30. URL Decode Processor

        31. User Agent processor

    14. Managing the index lifecycle

      1. Getting started with index lifecycle management

      2. Policy phases and actions

        1. Timing

        2. Phase Execution

        3. Actions

        4. Full Policy

      3. Set up index lifecycle management policy

        1. Applying a policy to an index template

        2. Apply a policy to a create index request

      4. Using policies to manage index rollover

        1. Skipping Rollover

      5. Update policy

        1. Updates to policies not managing indices

        2. Updates to executing policies

        3. Switching policies for an index

      6. Index lifecycle error handling

      7. Restoring snapshots of managed indices

      8. Start and stop index lifecycle management

      9. Using ILM with existing indices

        1. Managing existing periodic indices with ILM

        2. Reindexing via ILM

      10. Getting started with snapshot lifecycle management

    15. SQL access

      1. Overview

      2. Getting Started with SQL

      3. Conventions and Terminology

        1. Mapping concepts across SQL and Elasticsearch

      4. Security

      5. SQL REST API

        1. Overview

        2. Response Data Formats

        3. Paginating through a large response

        4. Filtering using Elasticsearch query DSL

        5. Columnar results

        6. Supported REST parameters

      6. SQL Translate API

      7. SQL CLI

      8. SQL JDBC

        1. API usage

      9. SQL ODBC

        1. Driver installation

        2. Configuration

      10. SQL Client Applications

        1. DBeaver

        2. DbVisualizer

        3. Microsoft Excel

        4. Microsoft Power BI Desktop

        5. Microsoft PowerShell

        6. MicroStrategy Desktop

        7. Qlik Sense Desktop

        8. SQuirreL SQL

        9. SQL Workbench/J

        10. Tableau Desktop

      11. SQL Language

        1. Lexical Structure

        2. SQL Commands

        3. DESCRIBE TABLE

        4. SELECT

        5. SHOW COLUMNS

        6. SHOW FUNCTIONS

        7. SHOW TABLES

        8. Data Types

        9. Index patterns

        10. Frozen Indices

      12. Functions and Operators

        1. Comparison Operators

        2. Logical Operators

        3. Math Operators

        4. Cast Operators

        5. LIKE and RLIKE Operators

        6. Aggregate Functions

        7. Grouping Functions

        8. Date/Time and Interval Functions and Operators

        9. Full-Text Search Functions

        10. Mathematical Functions

        11. String Functions

        12. Type Conversion Functions

        13. Geo Functions

        14. Conditional Functions And Expressions

        15. System Functions

      13. Reserved keywords

      14. SQL Limitations

    16. Monitor a cluster

      1. Overview

      2. How it works

      3. Monitoring in a production environment

      4. Collecting monitoring data

        1. Pausing data collection

      5. Collecting monitoring data with Metricbeat

      6. Collecting log data with Filebeat

      7. Configuring indices for monitoring

      8. Collectors

      9. Exporters

        1. Local exporters

        2. HTTP exporters

      10. Troubleshooting

    17. Frozen indices

      1. Best practices

      2. Searching a frozen index

      3. Monitoring frozen indices

    18. Roll up or transform your data

      1. Rolling up historical data

        1. Overview

        2. API quick reference

        3. Getting started

        4. Understanding groups

        5. Rollup aggregation limitations

        6. Rollup search limitations

      2. Transforming data

        1. Overview

        2. When to use transforms

        3. How checkpoints work

        4. API quick reference

        5. Tutorial: Transforming the eCommerce sample data

        6. Examples

        7. Troubleshooting

        8. Limitations

    19. Set up a cluster for high availability

      1. Back up a cluster

        1. Back up the data

        2. Back up the cluster configuration

        3. Back up the security configuration

        4. Restore the security configuration

        5. Restore the data

      2. Cross-cluster replication

        1. Overview

        2. Requirements for leader indices

        3. Automatically following indices

        4. Getting started with cross-cluster replication

        5. Remote recovery

        6. Upgrading clusters

    20. Secure a cluster

      1. Overview

      2. Configuring security

        1. Encrypting communications in Elasticsearch

        2. Encrypting communications in an Elasticsearch Docker Container

        3. Enabling cipher suites for stronger encryption

        4. Separating node-to-node and client traffic

        5. Configuring an Active Directory realm

        6. Configuring a file realm

        7. Configuring an LDAP realm

        8. Configuring a native realm

        9. Configuring a PKI realm

        10. Configuring a SAML realm

        11. Configuring a Kerberos realm

        12. Security files

        13. FIPS 140-2

      3. How security works

      4. User authentication

        1. Built-in users

        2. Internal users

        3. Realms

        4. Realm chains

        5. Active Directory user authentication

        6. File-based user authentication

        7. LDAP user authentication

        8. Native user authentication

        9. PKI user authentication

        10. SAML authentication

        11. Kerberos authentication

        12. Integrating with other authentication systems

        13. Enabling anonymous access

        14. Controlling the user cache

      5. Configuring SAML single-sign-on on the Elastic Stack

        1. The identity provider

        2. Configure Elasticsearch for SAML authentication

        3. Generating SP metadata

        4. Configuring role mappings

        5. User metadata

        6. Configuring Kibana

        7. Troubleshooting SAML Realm Configuration

      6. Configuring single sign-on to the Elastic Stack using OpenID Connect

        1. The OpenID Connect Provider

        2. Configure Elasticsearch for OpenID Connect authentication

        3. Configuring role mappings

        4. User metadata

        5. Configuring Kibana

        6. OpenID Connect without Kibana

      7. User authorization

        1. Built-in roles

        2. Defining roles

        3. Security privileges

        4. Document level security

        5. Field level security

        6. Granting privileges for indices and aliases

        7. Mapping users and groups to roles

        8. Setting up field and document level security

        9. Submitting requests on behalf of other users

        10. Configuring authorization delegation

        11. Customizing roles and authorization

      8. Auditing security events

        1. Audit event types

        2. Logfile audit output

        3. Auditing search queries

      9. Encrypting communications

        1. Setting up TLS on a cluster

      10. Restricting connections with IP filtering

      11. Cross cluster search, clients, and integrations

        1. Cross cluster search and security

        2. Java Client and security

        3. HTTP/REST clients and security

        4. ES-Hadoop and Security

        5. Beats and Security

        6. Monitoring and security

      12. Tutorial: Getting started with security

        1. Enable Elasticsearch security features

        2. Create passwords for built-in users

        3. Add the built-in user to Kibana

        4. Configure authentication

        5. Create users

        6. Assign roles

        7. Add user information in Logstash

        8. View system metrics in Kibana

      13. Tutorial: Encrypting communications

        1. Generate certificates

        2. Encrypt internode communications

        3. Add nodes to your cluster

      14. Troubleshooting

        1. Some settings are not returned via the nodes settings API

        2. Authorization exceptions

        3. Users command fails due to extra arguments

        4. Users are frequently locked out of Active Directory

        5. Certificate verification fails for curl on Mac

        6. SSLHandshakeException causes connections to fail

        7. Common SSL/TLS exceptions

        8. Common Kerberos exceptions

        9. Common SAML issues

        10. Internal Server Error in Kibana

        11. Setup-passwords command fails due to connection failure

        12. Failures due to relocation of the configuration files

      15. Limitations

    21. Alerting on cluster and index events

      1. Getting started with Watcher

      2. How Watcher works

      3. Encrypting sensitive data in Watcher

      4. Inputs

        1. Simple input

        2. Search input

        3. HTTP input

        4. Chain input

      5. Triggers

        1. Schedule trigger

      6. Conditions

        1. Always condition

        2. Never condition

        3. Compare condition

        4. Array compare condition

        5. Script condition

      7. Actions

        1. Running an action for each element in an array

        2. Adding conditions to actions

        3. Email action

        4. Webhook action

        5. Index action

        6. Logging Action

        7. Slack Action

        8. PagerDuty action

        9. Jira action

      8. Transforms

        1. Search transform

        2. Script transform

        3. Chain transform

      9. Java API

      10. Managing watches

      11. Example watches

        1. Watching the status of an Elasticsearch cluster

        2. Watching event data

      12. Troubleshooting

      13. Limitations

    22. Command line tools

      1. elasticsearch-certgen

      2. elasticsearch-certutil

      3. elasticsearch-croneval

      4. elasticsearch-migrate

      5. elasticsearch-node

      6. elasticsearch-saml-metadata

      7. elasticsearch-setup-passwords

      8. elasticsearch-shard

      9. elasticsearch-syskeygen

      10. elasticsearch-users

    23. How To

      1. General recommendations

      2. Recipes

        1. Mixing exact search with stemming

        2. Getting consistent scoring

        3. Incorporating static relevance signals into the score

      3. Tune for indexing speed

      4. Tune for search speed

        1. Tune your queries with the Profile API

        2. Faster phrase queries with index_phrases

        3. Faster prefix queries with index_prefixes

      5. Tune for disk usage

    24. Testing

      1. Java Testing Framework

        1. Why randomized testing?

        2. Using the Elasticsearch test classes

        3. Unit tests

        4. Integration tests

        5. Randomized testing

        6. Assertions

    25. Glossary of terms

    26. REST APIs

      1. API conventions

        1. Multiple Indices

        2. Date math support in index names

        3. Common options

        4. URL-based access control

      2. cat APIs

        1. cat aliases

        2. cat allocation

        3. cat count

        4. cat fielddata

        5. cat health

        6. cat indices

        7. cat master

        8. cat nodeattrs

        9. cat nodes

        10. cat pending tasks

        11. cat plugins

        12. cat recovery

        13. cat repositories

        14. cat task management

        15. cat thread pool

        16. cat shards

        17. cat segments

        18. cat snapshots

        19. cat templates

      3. Cluster APIs

        1. Cluster Health

        2. Cluster State

        3. Cluster Stats

        4. Pending cluster tasks

        5. Cluster Reroute

        6. Cluster Update Settings

        7. Cluster Get Settings

        8. Nodes Stats

        9. Nodes Info

        10. Nodes Feature Usage

        11. Remote Cluster Info

        12. Task management

        13. Nodes hot_threads

        14. Cluster Allocation Explain API

        15. Voting Configuration Exclusions

      4. Cross-cluster replication APIs

        1. Get CCR stats

        2. Create follower

        3. Pause follower

        4. Resume follower

        5. Unfollow

        6. Forget follower

        7. Get follower stats

        8. Get follower info

        9. Create auto-follow pattern

        10. Delete auto-follow pattern

        11. Get auto-follow pattern

      5. Document APIs

        1. Reading and Writing documents

        2. Index

        3. Get

        4. Delete

        5. Delete by query

        6. Update

        7. Update By Query API

        8. Multi get

        9. Bulk

        10. Reindex

        11. Term vectors

        12. Multi term vectors

        13. ?refresh

        14. Optimistic concurrency control

      6. Explore API

      7. Index APIs

        1. Add index alias

        2. Analyze

        3. Clear cache

        4. Clone index

        5. Close index

        6. Create index

        7. Delete index

        8. Delete index alias

        9. Delete index template

        10. Flush

        11. Force merge

        12. Freeze index

        13. Get field mapping

        14. Get index

        15. Get index alias

        16. Get index settings

        17. Get index template

        18. Get mapping

        19. Index alias exists

        20. Index exists

        21. Index recovery

        22. Index segments

        23. Index shard stores

        24. Index stats

        25. Index template exists

        26. Open index

        27. Put index template

        28. Put mapping

        29. Refresh

        30. Rollover index

        31. Shrink index

        32. Split index

        33. Synced flush

        34. Type exists

        35. Unfreeze index

        36. Update index alias

        37. Update index settings

      8. Index lifecycle management API

        1. Create policy

        2. Get policy

        3. Delete policy

        4. Move to step

        5. Remove policy

        6. Retry policy

        7. Get index lifecycle management status

        8. Explain lifecycle

        9. Start index lifecycle management

        10. Stop index lifecycle management

      9. Ingest APIs

        1. Put pipeline

        2. Get pipeline

        3. Delete pipeline

        4. Simulate pipeline

      10. Info API

      11. Licensing APIs

        1. Delete license

        2. Get license

        3. Get trial status

        4. Start trial

        5. Get basic status

        6. Start basic

        7. Update license

      12. Machine learning anomaly detection APIs

        1. Add events to calendar

        2. Add jobs to calendar

        3. Close jobs

        4. Create jobs

        5. Create calendar

        6. Create datafeeds

        7. Create filter

        8. Delete calendar

        9. Delete datafeeds

        10. Delete events from calendar

        11. Delete filter

        12. Delete forecast

        13. Delete jobs

        14. Delete jobs from calendar

        15. Delete model snapshots

        16. Delete expired data

        17. Find file structure

        18. Flush jobs

        19. Forecast jobs

        20. Get buckets

        21. Get calendars

        22. Get categories

        23. Get datafeeds

        24. Get datafeed statistics

        25. Get influencers

        26. Get jobs

        27. Get job statistics

        28. Get machine learning info

        29. Get model snapshots

        30. Get overall buckets

        31. Get scheduled events

        32. Get filters

        33. Get records

        34. Open jobs

        35. Post data to jobs

        36. Preview datafeeds

        37. Revert model snapshots

        38. Set upgrade mode

        39. Start datafeeds

        40. Stop datafeeds

        41. Update datafeeds

        42. Update filter

        43. Update jobs

        44. Update model snapshots

      13. Machine learning data frame analytics APIs

        1. Create data frame analytics jobs

        2. Delete data frame analytics jobs

        3. Evaluate data frame analytics

        4. Estimate memory usage for data frame analytics jobs

        5. Get data frame analytics jobs

        6. Get data frame analytics jobs stats

        7. Start data frame analytics jobs

        8. Stop data frame analytics jobs

      14. Migration APIs

        1. Deprecation info

      15. Reload search analyzers

      16. Rollup APIs

        1. Create rollup jobs

        2. Delete rollup jobs

        3. Get job

        4. Get rollup caps

        5. Get rollup index caps

        6. Rollup search

        7. Rollup job configuration

        8. Start rollup jobs

        9. Stop rollup jobs

      17. Search APIs

        1. Search

        2. URI Search

        3. Request Body Search

        4. Search Template

        5. Multi Search Template

        6. Search Shards API

        7. Suggesters

        8. Multi Search API

        9. Count API

        10. Validate API

        11. Explain API

        12. Profile API

        13. Field Capabilities API

        14. Ranking Evaluation API

      18. Security APIs

        1. Authenticate

        2. Change passwords

        3. Clear cache

        4. Clear roles cache

        5. Create API keys

        6. Create or update application privileges

        7. Create or update role mappings

        8. Create or update roles

        9. Create or update users

        10. Delegate PKI authentication

        11. Delete application privileges

        12. Delete role mappings

        13. Delete roles

        14. Delete users

        15. Disable users

        16. Enable users

        17. Get API key information

        18. Get application privileges

        19. Get builtin privileges

        20. Get role mappings

        21. Get roles

        22. Get token

        23. Get users

        24. Has privileges

        25. Invalidate API key

        26. Invalidate token

        27. OpenID Connect Prepare Authentication API

        28. OpenID Connect authenticate API

        29. OpenID Connect logout API

        30. SSL certificate

      19. Snapshot lifecycle management API

        1. Put snapshot lifecycle policy

        2. Get snapshot lifecycle policy

        3. Execute snapshot lifecycle policy

        4. Delete snapshot lifecycle policy

      20. Transform APIs

        1. Create transforms

        2. Update transforms

        3. Delete transforms

        4. Get transforms

        5. Get transform statistics

        6. Preview transforms

        7. Start transforms

        8. Stop transforms

      21. Watcher APIs

        1. Ack watch

        2. Activate watch

        3. Deactivate watch

        4. Delete watch

        5. Execute watch

        6. Get watch

        7. Get Watcher stats

        8. Put watch

        9. Start watch service

        10. Stop watch service

      22. Definitions

        1. Datafeed resources

        2. Data frame analytics job resources

        3. Data frame analytics evaluation resources

        4. Job resources

        5. Job statistics

        6. Model snapshot resources

        7. Role mapping resources

        8. Results resources

        9. Transform resources

  2. Logstash

    1. Logstash Introduction

    2. Getting Started with Logstash

      1. Installing Logstash

      2. Stashing Your First Event

      3. Parsing Logs with Logstash

      4. Stitching Together Multiple Input and Output Plugins

    3. How Logstash Works

      1. Execution Model

    4. Setting Up and Running Logstash

      1. Logstash Directory Layout

      2. Logstash Configuration Files

      3. logstash.yml

      4. Secrets keystore for secure settings

      5. Running Logstash from the Command Line

      6. Running Logstash as a Service on Debian or RPM

      7. Running Logstash on Docker

      8. Configuring Logstash for Docker

      9. Running Logstash on Windows

      10. Logging

      11. Shutting Down Logstash

      12. Setting Up X-Pack

    5. Upgrading Logstash

      1. Upgrading Using Package Managers

      2. Upgrading Using a Direct Download

      3. Upgrading between minor versions

      4. Upgrading Logstash to 7.0

      5. Upgrading with the Persistent Queue Enabled

    6. Configuring Logstash

      1. Structure of a Config File

      2. Accessing Event Data and Fields in the Configuration

      3. Using Environment Variables in the Configuration

      4. Logstash Configuration Examples

      5. Multiple Pipelines

      6. Pipeline-to-Pipeline Communication

      7. Reloading the Config File

      8. Managing Multiline Events

      9. Glob Pattern Support

      10. Converting Ingest Node Pipelines

      11. Logstash-to-Logstash Communication

      12. Centralized Pipeline Management

      13. X-Pack security

      14. X-Pack Settings

    7. Managing Logstash

      1. Centralized Pipeline Management

    8. Working with Logstash Modules

      1. Using Elastic Cloud

      2. ArcSight Module

      3. Netflow Module (deprecated)

      4. Azure Module

    9. Working with Filebeat Modules

      1. Use ingest pipelines for parsing

      2. Use Logstash pipelines for parsing

      3. Example: Set up Filebeat modules to work with Kafka and Logstash

    10. Data Resiliency

      1. Persistent Queues

      2. Dead Letter Queues

    11. Transforming Data

      1. Performing Core Operations

      2. Deserializing Data

      3. Extracting Fields and Wrangling Data

      4. Enriching Data with Lookups

    12. Deploying and Scaling Logstash

    13. Performance Tuning

      1. Performance Troubleshooting Guide

      2. Tuning and Profiling Logstash Performance

    14. Monitoring Logstash with APIs

      1. Node Info API

      2. Plugins Info API

      3. Node Stats API

      4. Hot Threads API

    15. Monitoring Logstash with X-Pack

      1. Metricbeat collection

      2. Internal collection

      3. Monitoring UI

      4. Pipeline Viewer UI

      5. Troubleshooting

    16. Working with plugins

      1. Generating Plugins

      2. Offline Plugin Management

      3. Private Gem Repositories

      4. Event API

    17. Input plugins

      1. azure_event_hubs

      2. beats

      3. cloudwatch

      4. couchdb_changes

      5. dead_letter_queue

      6. elasticsearch

      7. exec

      8. file

      9. ganglia

      10. gelf

      11. generator

      12. github

      13. google_cloud_storage

      14. google_pubsub

      15. graphite

      16. heartbeat

      17. http

      18. http_poller

      19. imap

      20. irc

      21. java_generator

      22. java_stdin

      23. jdbc

      24. jms

      25. jmx

      26. kafka

      27. kinesis

      28. log4j

      29. lumberjack

      30. meetup

      31. pipe

      32. puppet_facter

      33. rabbitmq

      34. redis

      35. relp

      36. rss

      37. s3

      38. salesforce

      39. snmp

      40. snmptrap

      41. sqlite

      42. sqs

      43. stdin

      44. stomp

      45. syslog

      46. tcp

      47. twitter

      48. udp

      49. unix

      50. varnishlog

      51. websocket

      52. wmi

      53. xmpp

    18. Output plugins

      1. boundary

      2. circonus

      3. cloudwatch

      4. csv

      5. datadog

      6. datadog_metrics

      7. elastic_app_search

      8. elasticsearch

      9. email

      10. exec

      11. file

      12. ganglia

      13. gelf

      14. google_bigquery

      15. google_cloud_storage

      16. google_pubsub

      17. graphite

      18. graphtastic

      19. http

      20. influxdb

      21. irc

      22. java_sink

      23. java_stdout

      24. juggernaut

      25. kafka

      26. librato

      27. loggly

      28. lumberjack

      29. metriccatcher

      30. mongodb

      31. nagios

      32. nagios_nsca

      33. opentsdb

      34. pagerduty

      35. pipe

      36. rabbitmq

      37. redis

      38. redmine

      39. riak

      40. riemann

      41. s3

      42. sns

      43. solr_http

      44. sqs

      45. statsd

      46. stdout

      47. stomp

      48. syslog

      49. tcp

      50. timber

      51. udp

      52. webhdfs

      53. websocket

      54. xmpp

      55. zabbix

    19. Filter plugins

      1. aggregate

      2. alter

      3. bytes

      4. cidr

      5. cipher

      6. clone

      7. csv

      8. date

      9. de_dot

      10. dissect

      11. dns

      12. drop

      13. elapsed

      14. elasticsearch

      15. environment

      16. extractnumbers

      17. fingerprint

      18. geoip

      19. grok

      20. http

      21. i18n

      22. java_uuid

      23. jdbc_static

      24. jdbc_streaming

      25. json

      26. json_encode

      27. kv

      28. memcached

      29. metricize

      30. metrics

      31. mutate

      32. prune

      33. range

      34. ruby

      35. sleep

      36. split

      37. syslog_pri

      38. threats_classifier

      39. throttle

      40. tld

      41. translate

      42. truncate

      43. urldecode

      44. useragent

      45. uuid

      46. xml

    20. Codec plugins

      1. avro

      2. cef

      3. cloudfront

      4. cloudtrail

      5. collectd

      6. dots

      7. edn

      8. edn_lines

      9. es_bulk

      10. fluent

      11. graphite

      12. gzip_lines

      13. jdots

      14. java_line

      15. java_plain

      16. json

      17. json_lines

      18. line

      19. msgpack

      20. multiline

      21. netflow

      22. nmap

      23. plain

      24. protobuf

      25. rubydebug

    21. Tips and Best Practices

    22. Troubleshooting Common Problems

    23. Contributing to Logstash

      1. How to write a Logstash input plugin

      2. How to write a Logstash codec plugin

      3. How to write a Logstash filter plugin

      4. How to write a Logstash output plugin

      5. Documenting your plugin

      6. Contributing a Patch to a Logstash Plugin

      7. Logstash Plugins Community Maintainer Guide

      8. Submitting your plugin to RubyGems.org and the logstash-plugins repository

    24. Contributing a Java Plugin

      1. How to write a Java input plugin

      2. How to write a Java codec plugin

      3. How to write a Java filter plugin

      4. How to write a Java output plugin

    25. Glossary of Terms

  3. Kibana

    1. Introduction

    2. Set Up Kibana

      1. Installing Kibana

      2. Install Kibana with .tar.gz

      3. Install Kibana with Debian Package

      4. Install Kibana with RPM

      5. Install Kibana on Windows

      6. Install Kibana on macOS with Homebrew

    3. Starting and stopping Kibana

    4. Configuring Kibana

      1. APM settings

      2. Code settings

      3. Development tools settings

      4. Graph settings

      5. Infrastructure UI settings

      6. i18n settings in Kibana

      7. Logs UI settings

      8. Machine learning settings

      9. Monitoring settings

      10. Reporting settings

      11. Secure settings

      12. Security settings

      13. Spaces settings

    5. Running Kibana on Docker

    6. Accessing Kibana

    7. Connect Kibana with Elasticsearch

    8. Using Kibana in a production environment

    9. Upgrading Kibana

      1. Standard upgrade

      2. Troubleshooting saved object migrations

    10. Configuring monitoring

      1. Collecting monitoring data

      2. Collecting monitoring data with Metricbeat

      3. Viewing monitoring data

    11. Configuring security

      1. Authentication

      2. Encrypting communications

      3. Audit Logging

    12. Getting Started

      1. Add sample data

      2. Explore Kibana using sample data

      3. Build your own dashboard

        1. Define your index patterns

        2. Discover your data

        3. Visualize your data

        4. Add visualizations to a dashboard

    13. Discover

      1. Setting the time filter

      2. Searching your data

        1. Kibana Query Language

        2. Lucene query syntax

        3. Saving searches

        4. Saving queries

        5. Change the indices you’re searching

        6. Refresh the search results

      3. Filtering by Field

      4. Viewing Document Data

      5. Viewing Document Context

      6. Viewing Field Data Statistics

    14. Visualize

      1. Creating a Visualization

      2. Saving Visualizations

      3. Using rolled up data in a visualization

      4. Line, Area, and Bar charts

      5. Controls Visualization

        1. Adding Input Controls

        2. Global Options

      6. Data Table

      7. Markdown Widget

      8. Metric

      9. Goal and Gauge

      10. Pie Charts

      11. Coordinate Maps

      12. Region Maps

      13. Timelion

      14. TSVB

      15. Tag Clouds

      16. Heatmap Chart

      17. Vega Graphs

        1. Getting Started with Vega

        2. Vega vs Vega-Lite

        3. Querying Elasticsearch

        4. Elastic Map Files

        5. Vega with a Map

        6. Debugging

        7. Useful Links

      18. Inspecting Visualizations

    15. Dashboard

      1. Create a dashboard

      2. Dashboard-only mode

    16. Canvas

      1. Canvas tutorial

      2. Create a workpad

      3. Showcase your data with elements

      4. Present your workpad

      5. Share your workpad

      6. Canvas function reference

        1. TinyMath functions

    17. Extend your use case

      1. Graph data connections

        1. Using Graph

        2. Configuring Graph

        3. Troubleshooting

        4. Limitations

      2. Machine learning

    18. Elastic Maps

      1. Getting started with Elastic Maps

        1. Creating a new map

        2. Adding a choropleth layer

        3. Adding layers for Elasticsearch data

        4. Saving the map

        5. Adding the map to a dashboard

      2. Heat map layer

      3. Tile layer

      4. Vector layer

        1. Vector styling

        2. Vector style properties

        3. Vector tooltips

      5. Plot big data without plotting too much data

        1. Grid aggregation

        2. Most recent entities

        3. Point to point

        4. Term join

      6. Searching your data

        1. Creating filters from your map

        2. Filtering a single layer

        3. Searching across multiple indices

      7. Connecting to Elastic Maps Service

      8. Upload GeoJSON data

      9. Indexing GeoJSON data tutorial

      10. Elastic Maps troubleshooting

    19. Code

      1. Import your first repo

      2. Repo management

      3. Install language server

      4. Basic navigation

      5. Semantic code navigation

      6. Search

      7. Config for multiple Kibana instances

    20. Infrastructure

      1. Getting started with infrastructure monitoring

      2. Using the Infrastructure app

      3. Viewing infrastructure metrics

      4. Metrics Explorer

    21. Logs

      1. Getting started with logs monitoring

      2. Using the Logs app

      3. Configuring the Logs data

    22. APM

      1. Getting Started

      2. Visualizing Application Bottlenecks

      3. Using APM

        1. Filters

        2. Services overview

        3. Traces overview

        4. Transaction overview

        5. Span timeline

        6. Errors overview

        7. Metrics overview

        8. Machine Learning integration

        9. APM Agent configuration

        10. Advanced queries

    23. Uptime

      1. Overview

      2. Monitor

    24. SIEM

      1. Using the SIEM UI

      2. Anomaly Detection with Machine Learning

    25. Dev Tools

      1. Console

      2. Profiling queries and aggregations

        1. Getting Started

        2. Profiling a more complicated query

        3. Rendering pre-captured profiler JSON

      3. Debugging grok expressions

    26. Stack Monitoring

      1. Beats Metrics

      2. Cluster Alerts

      3. Elasticsearch Metrics

      4. Kibana Metrics

      5. Logstash Metrics

      6. Troubleshooting

    27. Management

      1. License Management

      2. Index patterns

        1. Cross-cluster search

      3. Rollup jobs

      4. Index lifecycle policies

        1. Creating an index lifecycle policy

        2. Managing index lifecycle policies

        3. Adding a policy to an index

        4. Example of using an index lifecycle policy

      5. Managing Fields

        1. String Field Formatters

        2. Date Field Formatters

        3. Geographic Point Field Formatters

        4. Numeric Field Formatters

        5. Scripted Fields

      6. Index management

      7. Setting advanced options

      8. Saved objects

      9. Managing Beats

      10. Working with remote clusters

      11. Snapshot and Restore

      12. Spaces

      13. Security

        1. Granting access to Kibana

        2. Kibana role management

        3. Kibana privileges

      14. Watcher

      15. Upgrade Assistant

    28. Reporting from Kibana

      1. Automating report generation

      2. PDF layout modes

      3. Reporting configuration

        1. Reporting and security

        2. Secure the reporting endpoints

        3. Chromium sandbox

      4. Troubleshooting

      5. Reporting integration

    29. REST API

      1. Features API

        1. Get features

      2. Kibana Spaces APIs

        1. Create space

        2. Update space

        3. Get space

        4. Get all spaces

        5. Delete space

        6. Copy saved objects to space

        7. Resolve copy to space conflicts

      3. Kibana role management APIs

        1. Create or update role

        2. Get specific role

        3. Get all roles

        4. Delete role

      4. Saved objects APIs

        1. Get object

        2. Bulk get objects

        3. Find objects

        4. Create object

        5. Bulk create objects

        6. Update object

        7. Delete object

        8. Export objects

        9. Import objects

        10. Resolve import errors

      5. Dashboard import and export APIs

        1. Import dashboard

        2. Dashboard export

      6. Logstash configuration management APIs

        1. Create pipeline

        2. Retrieve pipeline

        3. Delete pipeline

        4. List pipeline

      7. URL shortening API

        1. Shorten URL

      8. Upgrade assistant APIs

        1. Upgrade readiness status

        2. Start or resume reindex

        3. Check reindex status

        4. Cancel reindex

    30. Kibana plugins

      1. Install plugins

      2. Update and remove plugins

      3. Disable plugins

      4. Configure the plugin manager

      5. Known Plugins

    31. Limitations

      1. Nested Objects

      2. Exporting data

    32. Developer guide

      1. Core Development

        1. Considerations for basePath

        2. Managing Dependencies

        3. Modules and Autoloading

        4. Communicating with Elasticsearch

        5. Unit Testing

        6. Functional Testing

      2. Plugin Development

        1. Plugin Resources

        2. UI Exports

        3. Plugin feature registration

        4. Functional Tests for Plugins

        5. Localization for plugins

      3. Developing Visualizations

        1. Embedding Visualizations

        2. Developing Visualizations

        3. Visualization Factory

        4. Visualization Editors

        5. Visualization Request Handlers

        6. Visualization Response Handlers

        7. Vis object

        8. AggConfig object

      4. Add Data Guide

      5. Security

        1. Role-based access control

      6. Pull request review guidelines

      7. Interpreting CI Failures

  4. Beats Platform

    1. Community Beats

    2. Getting started with Beats

    3. Config file format

      1. Namespacing

      2. Config file data types

      3. Environment variables

      4. Reference variables

      5. Config file ownership and permissions

      6. Command line arguments

      7. YAML tips and gotchas

    4. Upgrading

      1. Upgrade between minor versions

      2. Upgrade from 6.x to 7.x

      3. Troubleshooting Beats upgrade issues

  5. Beats Developer Guide

    1. Contributing to Beats

    2. Community Beats

    3. Creating a New Beat

      1. Getting Ready

      2. Overview

      3. Generating Your Beat

      4. Fetching Dependencies and Setting up the Beat

      5. Building and Running the Beat

      6. The Beater Interface

      7. Sharing Your Beat with the Community

      8. Naming Conventions

    4. Creating New Kibana Dashboards

      1. Importing Existing Beat Dashboards

      2. Building Your Own Beat Dashboards

      3. Generating the Beat Index Pattern

      4. Exporting New and Modified Beat Dashboards

      5. Archiving Your Beat Dashboards

      6. Sharing Your Beat Dashboards

    5. Adding a New Protocol to Packetbeat

      1. Getting Ready

      2. Protocol Modules

      3. Testing

    6. Extending Metricbeat

      1. Overview

      2. Creating a Metricset

      3. Metricset Details

      4. Creating a Metricbeat Module

      5. Creating a Beat based on Metricbeat

      6. Metricbeat Developer FAQ

    7. Creating a New Filebeat Module

    8. Migrating dashboards from Kibana 5.x to 6.x

  6. Filebeat

    1. Overview

    2. Getting Started With Filebeat

      1. Step 1: Install Filebeat

      2. Step 2: Configure Filebeat

      3. Step 3: Load the index template in Elasticsearch

      4. Step 4: Set up the Kibana dashboards

      5. Step 5: Start Filebeat

      6. Step 6: View the sample Kibana dashboards

      7. Quick start: modules for common log formats

      8. Repositories for APT and YUM

    3. Setting up and running Filebeat

      1. Directory layout

      2. Secrets keystore

      3. Command reference

      4. Running Filebeat on Docker

      5. Running Filebeat on Kubernetes

      6. Filebeat and systemd

      7. Stopping Filebeat

    4. Upgrading Filebeat

    5. How Filebeat works

    6. Configuring Filebeat

      1. Specify which modules to run

      2. Configure inputs

      3. Manage multiline messages

      4. Specify general settings

      5. Load external configuration files

      6. Configure the internal queue

      7. Configure the output

      8. Configure index lifecycle management

      9. Load balance the output hosts

      10. Specify SSL settings

      11. Filter and enhance the exported data

      12. Parse data by using ingest node

      13. Enrich events with geoIP information

      14. Configure project paths

      15. Configure the Kibana endpoint

      16. Load the Kibana dashboards

      17. Load the Elasticsearch index template

      18. Configure logging

      19. Use environment variables in the configuration

      20. Autodiscover

      21. YAML tips and gotchas

      22. Regular expression support

      23. HTTP Endpoint

      24. filebeat.reference.yml

    7. Beats central management

      1. How central management works

      2. Enroll Beats in central management

    8. Modules

      1. Modules overview

      2. Apache module

      3. Auditd module

      4. AWS module

      5. CEF module

      6. Cisco module

      7. Coredns Module

      8. Elasticsearch module

      9. Envoyproxy Module

      10. Google Cloud module

      11. haproxy module

      12. IBM MQ module

      13. Icinga module

      14. IIS module

      15. Iptables module

      16. Kafka module

      17. Kibana module

      18. Logstash module

      19. MongoDB module

      20. MSSQL module

      21. MySQL module

      22. nats module

      23. NetFlow module

      24. Nginx module

      25. Osquery module

      26. Palo Alto Networks module

      27. PostgreSQL module

      28. RabbitMQ module

      29. Redis module

      30. Santa module

      31. Suricata module

      32. System module

      33. Traefik module

      34. Zeek (Bro) Module

    9. Exported fields

      1. Apache fields

      2. Auditd fields

      3. AWS fields

      4. Beat fields

      5. Decode CEF processor fields fields

      6. CEF fields

      7. Cisco fields

      8. Cloud provider metadata fields

      9. Coredns fields

      10. Docker fields

      11. ECS fields

      12. elasticsearch fields

      13. Envoyproxy fields

      14. Google Cloud fields

      15. haproxy fields

      16. Host fields

      17. ibmmq fields

      18. Icinga fields

      19. IIS fields

      20. iptables fields

      21. Jolokia Discovery autodiscover provider fields

      22. Kafka fields

      23. kibana fields

      24. Kubernetes fields

      25. Log file content fields

      26. logstash fields

      27. mongodb fields

      28. mssql fields

      29. MySQL fields

      30. nats fields

      31. NetFlow fields

      32. NetFlow fields

      33. Nginx fields

      34. Osquery fields

      35. panw fields

      36. PostgreSQL fields

      37. Process fields

      38. RabbitMQ fields

      39. Redis fields

      40. s3 fields

      41. Google Santa fields

      42. Suricata fields

      43. System fields

      44. Traefik fields

      45. Zeek fields

    10. Monitoring Filebeat

      1. Internal collection

        1. Settings for internal monitoring collection

      2. Metricbeat collection

    11. Securing Filebeat

      1. Secure communication with Elasticsearch

      2. Secure communication with Logstash

      3. Use X-Pack security

        1. Grant users access to secured resources

        2. Configure authentication credentials

        3. Configure Filebeat to use encrypted connections

      4. Use Linux Secure Computing Mode (seccomp)

    12. Troubleshooting

      1. Get help

      2. Debug

      3. Common problems

        1. Can’t read log files from network volumes

        2. Filebeat isn’t collecting lines from a file

        3. Too many open file handlers

        4. Registry file is too large

        5. Inode reuse causes Filebeat to skip lines

        6. Log rotation results in lost or duplicate events

        7. Open file handlers cause issues with Windows file rotation

        8. Filebeat is using too much CPU

        9. Dashboard in Kibana is breaking up data fields incorrectly

        10. Fields are not indexed or usable in Kibana visualizations

        11. Filebeat isn’t shipping the last line of a file

        12. Filebeat keeps open file handlers of deleted files for a long time

        13. Filebeat uses too much bandwidth

        14. Error loading config file

        15. Found unexpected or unknown characters

        16. Logstash connection doesn’t work

        17. @metadata is missing in Logstash

        18. Not sure whether to use Logstash or Beats

        19. SSL client fails to connect to Logstash

        20. Monitoring UI shows fewer Beats than expected

    13. A. Contributing to Beats

 



[Logstash] JSON filter plugin

Elastic/Logstash 2019. 11. 4. 14:16

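This post is roughly a translation of the documentation on the official site.

It is a small thing, but the JSON filter is used heavily, and a lack of awareness about validation causes errors quite often.

I am writing this down as a memory aid.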

 

Official documentation)

https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html

 

This content corresponds to the filter section of the Logstash reference.

Its purpose is exactly what the name says: it is a filter that parses JSON.

 

The problem is that some people implement their pipelines on the assumption that incoming JSON data will always be valid.

I consider validation a basic practice, but developers' views and experience on it seem to vary these days.

 

Anyway, it is worth getting familiar with the options the JSON filter provides before using it.

It is a good idea to read the Common Options first.

 

JSON Filter Configuration Options)

Setting               Input type  Required
skip_on_invalid_json  boolean     No
source                string      Yes
tag_on_failure        array       No
target                string      No
  • skip_on_invalid_json
    Skips the filter for data that is not valid JSON instead of reporting a parse failure.
    The default is false, so malformed data is logged as a parse failure and the event is tagged.

  • source
    Specifies the field that contains the JSON to parse.

  • tag_on_failure
    When parsing fails, these values are appended to the tags field. The default is "_jsonparsefailure".

  • target
    The parsed JSON from the source field is stored in the target field. If the target field already exists, it is overwritten.

Of these settings, configuring just skip_on_invalid_json and tag_on_failure properly is enough to get past errors caused by invalid data.

Doing so helps prevent the occasional case where such errors leave Logstash unresponsive.
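
To make the option semantics concrete, here is a small Python model of how I understand them. It is a simplified sketch and not the plugin's actual code: the function name json_filter and the sample events are made up for illustration, and the real filter handles more cases.

```python
import json


def json_filter(event, source, target=None,
                skip_on_invalid_json=False,
                tag_on_failure=("_jsonparsefailure",)):
    """Simplified model of the four options above (hypothetical helper)."""
    raw = event.get(source)
    if raw is None:
        return event  # nothing to parse
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        if not skip_on_invalid_json:
            # Default behaviour: tag the event so it can be inspected or routed later.
            tags = event.setdefault("tags", [])
            tags.extend(t for t in tag_on_failure if t not in tags)
        return event
    if target is not None:
        event[target] = parsed   # overwrites an existing target field
    elif isinstance(parsed, dict):
        event.update(parsed)     # no target: merge the keys into the event root
    return event


good = json_filter({"message": '{"user": "kim", "action": "click"}'},
                   "message", target="payload")
bad = json_filter({"message": "not json"}, "message", target="payload")
skipped = json_filter({"message": "not json"}, "message",
                      skip_on_invalid_json=True)
```

Here good ends up with a payload field, bad only gains the failure tag, and skipped is passed through untouched.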



[Elasticsearch] Let's learn about the Cache.

Elastic/Elasticsearch 2019. 10. 23. 09:58

By default, I implement caching of Elasticsearch query results at the API layer.

However, Elasticsearch itself also provides two cache features out of the box, so I am noting them here because they are worth using well.

 

In one line)

Roughly speaking, the filter results used to list search hits are stored in the Query Cache, and aggregations over search results are stored in the Request Cache.

 

1. Node Query Cache

Official documentation)

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-cache.html

- This cache stores the results of queries and evicts entries using an LRU policy.

- It operates at the node level.

- It only applies to queries used in a filter context.

 

- The setting below must be configured on every data node in the cluster.

indices.queries.cache.size

 

- The setting below is configured per index.
index.queries.cache.enabled
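
Since the query cache only helps with filter context, here is a minimal sketch of a request body that separates scoring clauses from filter clauses. The field names (title, category, created_at) and the helper build_search are assumptions for illustration only.

```python
def build_search(keyword, category, since):
    """Scoring clauses go in `must`; exact and range conditions go in `filter`."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to the relevance score.
                "must": [{"match": {"title": keyword}}],
                # Filter context: yes/no matching, the part eligible for the query cache.
                "filter": [
                    {"term": {"category": category}},
                    {"range": {"created_at": {"gte": since}}},
                ],
            }
        }
    }


body = build_search("elasticsearch", "blog", "2019-10-01")
```

The resulting dict can be sent as the JSON body of a _search request with whichever client you already use.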

 

2. Shard Request Cache

Official documentation)

https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-request-cache.html

- This cache stores the results of each local shard.

- It only caches the results of requests with size=0.

- In other words, it caches aggregation and suggestion results. It does not cache hits, but it does cache hits.total.

- Date range or histogram queries that use now are not cached.

 

- This feature is configured at the index level.

PUT /my_index
{
  "settings": {
    "index.requests.cache.enable": true
  }
}

 

- The setting below is configured at the node level.

indices.requests.cache.size

 

- The setting below configures the cache TTL.

indices.requests.cache.expire
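
Putting the conditions above together, a cache-friendly request is one with size=0 and no now in it. The sketch below is one way to do that, assuming a @timestamp field and a hypothetical helper daily_count_request: the time range is computed on the client and truncated to the hour, so repeated calls within the same hour produce an identical body. As far as I understand, the request cache keys on the request body, which is why identical bodies matter.

```python
from datetime import datetime, timedelta, timezone


def daily_count_request(now, days=7):
    # Truncate to the start of the hour instead of sending "now" to Elasticsearch.
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(days=days)
    return {
        "size": 0,  # no hits requested, only aggregations
        "query": {"bool": {"filter": [{"range": {"@timestamp": {
            "gte": start.isoformat(), "lt": end.isoformat()}}}]}},
        "aggs": {"per_day": {"date_histogram": {
            "field": "@timestamp", "calendar_interval": "day"}}},
    }


a = daily_count_request(datetime(2019, 10, 23, 9, 58, 12, tzinfo=timezone.utc))
b = daily_count_request(datetime(2019, 10, 23, 9, 59, 40, tzinfo=timezone.utc))
```

Both calls build the same body even though they were made about 90 seconds apart. How coarse the truncation should be depends on how fresh the aggregation needs to be.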

 

Code Sniff)

// IndicesRequestCache.java
    /**
     * A setting to enable or disable request caching on an index level. Its dynamic by default
     * since we are checking on the cluster state IndexMetaData always.
     */
    public static final Setting<Boolean> INDEX_CACHE_REQUEST_ENABLED_SETTING =
        Setting.boolSetting("index.requests.cache.enable", true, Property.Dynamic, Property.IndexScope);
    public static final Setting<ByteSizeValue> INDICES_CACHE_QUERY_SIZE =
        Setting.memorySizeSetting("indices.requests.cache.size", "1%", Property.NodeScope);
    public static final Setting<TimeValue> INDICES_CACHE_QUERY_EXPIRE =
        Setting.positiveTimeSetting("indices.requests.cache.expire", new TimeValue(0), Property.NodeScope);
        
// TimeValue.java        
   public static TimeValue parseTimeValue(String sValue, TimeValue defaultValue, String settingName) {
        settingName = Objects.requireNonNull(settingName);
        if (sValue == null) {
            return defaultValue;
        }
        final String normalized = sValue.toLowerCase(Locale.ROOT).trim();
        if (normalized.endsWith("nanos")) {
            return new TimeValue(parse(sValue, normalized, "nanos"), TimeUnit.NANOSECONDS);
        } else if (normalized.endsWith("micros")) {
            return new TimeValue(parse(sValue, normalized, "micros"), TimeUnit.MICROSECONDS);
        } else if (normalized.endsWith("ms")) {
            return new TimeValue(parse(sValue, normalized, "ms"), TimeUnit.MILLISECONDS);
        } else if (normalized.endsWith("s")) {
            return new TimeValue(parse(sValue, normalized, "s"), TimeUnit.SECONDS);
        } else if (sValue.endsWith("m")) {
            // parsing minutes should be case-sensitive as 'M' means "months", not "minutes"; this is the only special case.
            return new TimeValue(parse(sValue, normalized, "m"), TimeUnit.MINUTES);
        } else if (normalized.endsWith("h")) {
            return new TimeValue(parse(sValue, normalized, "h"), TimeUnit.HOURS);
        } else if (normalized.endsWith("d")) {
            return new TimeValue(parse(sValue, normalized, "d"), TimeUnit.DAYS);
        } else if (normalized.matches("-0*1")) {
            return TimeValue.MINUS_ONE;
        } else if (normalized.matches("0+")) {
            return TimeValue.ZERO;
        } else {
            // Missing units:
            throw new IllegalArgumentException("failed to parse setting [" + settingName + "] with value [" + sValue +
                    "] as a time value: unit is missing or unrecognized");
        }
    }        
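
To make the suffix ordering in parseTimeValue easier to follow, here is a rough Python port that returns nanoseconds. It is my own simplified sketch, not Elasticsearch code; the name parse_time_value_nanos is made up, and -1 and 0 are returned as-is for the two special cases.

```python
import re

# Checked in the same order as the Java snippet: longer suffixes first,
# so "nanos", "micros" and "ms" are not mistaken for plain "s".
_SUFFIXES = [
    ("nanos", 1),
    ("micros", 1_000),
    ("ms", 1_000_000),
    ("s", 1_000_000_000),
    ("m", 60 * 1_000_000_000),
    ("h", 3_600 * 1_000_000_000),
    ("d", 86_400 * 1_000_000_000),
]


def parse_time_value_nanos(value):
    normalized = value.lower().strip()
    for suffix, factor in _SUFFIXES:
        # Minutes are matched against the raw value (case-sensitive), as in the Java code.
        candidate = value if suffix == "m" else normalized
        if candidate.endswith(suffix):
            number = normalized[: -len(suffix)].strip()
            return int(number) * factor
    if re.fullmatch(r"-0*1", normalized):
        return -1
    if re.fullmatch(r"0+", normalized):
        return 0
    raise ValueError(f"failed to parse [{value}]: unit is missing or unrecognized")


ten_minutes = parse_time_value_nanos("10m")
half_second = parse_time_value_nanos("500ms")
```

The practical takeaway for indices.requests.cache.expire is that a unit is required: a bare number other than 0 or -1 is rejected, and an uppercase M is not read as minutes.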

 

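Tuning each of these settings depends on

- hardware specs

- query characteristics

- document size

and similar factors.

If you are unsure, start with the Elasticsearch defaults and work toward the optimal values from there.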

 

Monitoring Cache Usage)

Official documentation)

https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-stats.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-stats.html

GET /_stats/request_cache?human
GET /_nodes/stats/indices/request_cache?human
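
The request_cache section of these responses carries hit_count and miss_count, so a hit ratio is easy to derive. A minimal sketch, with a made-up sample in place of a real response (in the index stats response I would expect to find this section under _all.total.request_cache, but check your own output):

```python
def request_cache_hit_ratio(request_cache):
    """Return hits / (hits + misses), or 0.0 when the cache has not been used yet."""
    hits = request_cache.get("hit_count", 0)
    misses = request_cache.get("miss_count", 0)
    total = hits + misses
    return hits / total if total else 0.0


# Sample values are invented for illustration.
sample = {"memory_size_in_bytes": 1048576, "evictions": 3,
          "hit_count": 900, "miss_count": 100}
ratio = request_cache_hit_ratio(sample)
```

A low ratio together with a growing evictions count is a hint that the cache size or the shape of the requests deserves a look.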

 

Also worth knowing)

https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/doc-values.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-store.html

 



[Elasticsearch] Building a pipeline to collect in-app user behavior logs

Elastic/Elasticsearch 2019. 10. 17. 15:13

There is a wide variety of software stacks you could use for this.

In general, the pipeline is usually built in one of the following ways.

 

1. App -> Stream service -> Consumer -> Elasticsearch
2. App -> Stream service -> Producer -> Queue -> Consumer -> Elasticsearch
3. App -> Logging service (daemon, http, file ...) -> Consumer -> Elasticsearch
4. App -> Logging service (daemon, http, file ...) -> Producer -> Queue -> Consumer -> Elasticsearch

 

Mapping this onto the Elastic Stack gives:

 

Producer)

- Filebeat

- Logstash

 

Queue)

- Logstash persistent queue

 

Consumer)

- Logstash

 

Beyond these, there are many other tools and services you can use, such as sqs, dynamodb, redis, kafka, fluentd, and storm.

 

This can be considered the easiest and most common setup.
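
As a toy illustration of the Producer -> Queue -> Consumer shape, here is an in-process Python sketch. It stands in for the real components (Filebeat, Logstash, a persistent queue) and is only meant to show the flow: the consumer drains the queue in batches and formats each batch as an Elasticsearch _bulk body. The index name app-behavior-log and the event fields are made up.

```python
import json
from collections import deque


def produce(queue, raw_lines):
    """Producer: push raw log lines onto the queue."""
    for line in raw_lines:
        queue.append(line)


def consume(queue, index, batch_size=2):
    """Consumer: drain the queue and yield _bulk bodies (NDJSON) of up to batch_size events."""
    while queue:
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        lines = []
        for raw in batch:
            lines.append(json.dumps({"index": {"_index": index}}))  # action line
            lines.append(json.dumps(json.loads(raw)))               # document line
        yield "\n".join(lines) + "\n"  # a _bulk body must end with a newline


queue = deque()
produce(queue, [
    '{"user": "a", "action": "view"}',
    '{"user": "b", "action": "click"}',
    '{"user": "a", "action": "buy"}',
])
bodies = list(consume(queue, "app-behavior-log"))
```

Three events with a batch size of two give two bulk bodies. The queue in the middle is what lets the producer keep accepting logs while the consumer is slow or Elasticsearch is busy.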



[Elasticsearch] minimum_master_nodes is not working.

Elastic/Elasticsearch 2019. 10. 17. 14:56

This is listed in the 7.x breaking changes.

I am recording it as a memory aid.

 

The discovery.zen.minimum_master_nodes setting is permitted, but ignored, on 7.x nodes.

 

Let's stop using this setting.

Also, running with two master nodes is now too risky to rely on.
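
The arithmetic behind the two-master concern, as a quick sketch. This assumes a plain majority model (more than half of the master-eligible nodes must agree); Elasticsearch 7 manages its voting configuration automatically and has more nuance than this, so treat the numbers as a rule of thumb.

```python
def quorum(master_eligible):
    """Smallest number of nodes that forms a strict majority."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)


# {node count: (quorum, tolerated failures)}
table = {n: (quorum(n), tolerated_failures(n)) for n in (1, 2, 3, 5)}
```

With two master-eligible nodes the quorum is two, so under this model no failure can be tolerated. Three nodes is the smallest count that survives the loss of one.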

 

Reference)

https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking-changes-7.0.html#breaking_70_discovery_changes

 


 



[Elasticsearch] Configuring Elasticsearch Cluster Auto Scaling.

Elastic/Elasticsearch 2019. 10. 2. 16:04

Writing this down so I remember it.

 

Let's assume the ES cluster is configured as follows.

 

1. 2 master nodes

2. 2 coordinating nodes

3. 2 data nodes

 

When building an ES cluster, it is very hard to use every resource (CPU, memory, disk, network) efficiently at the same time.

As a result, you end up short on CPU with memory to spare, or short on memory with CPU to spare, and occasionally, absurdly enough, short on network bandwidth.

 

Anyway, since you cannot squeeze every last bit out of every resource, I recommend accepting a reasonable compromise.

 

Here, the two node types to auto scale are coordinating nodes and data nodes.

 

First of all, ES makes adding and removing nodes very easy.

You only need to register the master node information in elasticsearch.yml.

 

- discovery.zen.ping.unicast.hosts: MASTER_NODE_LIST

 

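See the link below for the official documentation on this setting (written for 6.2; in 7.x the equivalent setting is discovery.seed_hosts).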

https://www.elastic.co/guide/en/elasticsearch/reference/6.2/modules-discovery-zen.html

 


 

Configuring Coordinating Node Auto Scaling)

0. The very first step is to register the master node information in elasticsearch.yml.

1. First, put the coordinating nodes behind a load balancer. (On AWS, an ALB/ELB works.)

    - Auto scaling coordinating nodes helps prevent "Too many requests" errors.

    - This is because search threads can run short when traffic spikes.

    - Another reason to put them behind an LB is that the API layer then only needs to talk to a single endpoint, which simplifies operations.

2. Configure the Auto Scaling settings.

    - Set this up to match your own environment.

    - Refer to the official documentation.

    - https://docs.aws.amazon.com/ko_kr/autoscaling/ec2/userguide/scaling_plan.html#scaling_typesof

 


3. Test it, then apply it to your service.
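
One possible scaling signal for coordinating nodes is the search thread pool's rejected counter, which the nodes stats API reports per node as a cumulative number. The sketch below compares two polls and decides whether to scale out. The threshold, the node names, and the function should_scale_out are all assumptions for illustration, and wiring the decision into your Auto Scaling policy is left out.

```python
def should_scale_out(previous, current, threshold=10):
    """previous/current: {node_name: cumulative search rejections} from two polls."""
    new_rejections = sum(
        # A node missing from the previous poll contributes 0 instead of its full history.
        max(count - previous.get(node, count), 0)
        for node, count in current.items()
    )
    return new_rejections >= threshold


busy = should_scale_out({"coord-1": 5, "coord-2": 0},
                        {"coord-1": 12, "coord-2": 4})
quiet = should_scale_out({"coord-1": 5, "coord-2": 0},
                         {"coord-1": 6, "coord-2": 0})
```

In the busy case 11 new rejections were seen between polls, which crosses the threshold. In the quiet case there was only one.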

 

 

Configuring Data Node Auto Scaling)

0. The very first step is to register the master node information in elasticsearch.yml.

0. In addition, the number of replica shards must be set to (number of data nodes - 1).

0. In addition, the initial number of data nodes and the number of primary shards must be chosen carefully.

0. The reason is that once an index is created, its primary shards cannot be changed without reindexing.

1. Configure the Auto Scaling settings.

    - When doing so, be sure to check for issues with shard reallocation.

    - CPU, disk I/O, and network performance can degrade.

 

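1. After Auto Scaling is configured, shards can be allocated manually.

    - These problems usually show up during traffic spikes or in the middle of operations, so prepare scripts in advance and run the relocation with them.

    - For that reason, automatic shard allocation should be turned off.

    - Refer to the official documentation.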

    - https://www.elastic.co/guide/en/elasticsearch/reference/current/shards-allocation.html

 


2. For data nodes, configure the Auto Scaling termination policy so that primary shards are not lost.

    - With full replication another replica would be promoted to primary, but it is still worth being careful.
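
The two knobs mentioned in this list can be sketched as request bodies. These helpers are hypothetical and only build the JSON: replica_settings gives a body for an index _settings update, and allocation_settings gives a body for PUT /_cluster/settings. I use a transient setting here as an assumption; choose transient or persistent to suit your operations.

```python
def replica_settings(data_nodes):
    # Full replication as described above: replicas = data nodes - 1.
    return {"index": {"number_of_replicas": max(data_nodes - 1, 0)}}


def allocation_settings(enabled):
    # "none" stops automatic shard allocation; "all" restores the default.
    value = "all" if enabled else "none"
    return {"transient": {"cluster.routing.allocation.enable": value}}


before_scale_out = allocation_settings(False)
replicas_for_three = replica_settings(3)
after_relocation = allocation_settings(True)
```

Elasticsearch also has an index.auto_expand_replicas setting that can keep the replica count in step with the number of data nodes. Whether that fits depends on how much control you want over when shards move.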

 

To wrap up)

Applying this to coordinating nodes is easy.

For data nodes, however, I would advise against trying it unless you have real operational experience and can respond quickly when problems occur.
