10 posts for '2017/11'

  1. 2017.11.29 [Java] JVM monitoring tools
  2. 2017.11.24 [Java] ExecutorService instances provided by Executors
  3. 2017.11.24 [Java] java.util.concurrent examples
  4. 2017.11.22 [Bash] for loop range and if condition
  5. 2017.11.21 [Hadoop] Native library installation on osx
  6. 2017.11.18 [Spark] Spark installation on osx
  7. 2017.11.15 [Elasticsearch] elasticsearch-arirang-analyzer-6.0.0 released
  8. 2017.11.14 [Lucene] Inverted index file
  9. 2017.11.14 [Elasticsearch] Setting path in the _id mapping
  10. 2017.11.02 [Gradle] Configuring the dynamic version cache

[Java] JVM monitoring tools

ITWeb/General Development 2017. 11. 29. 10:33


A JVM monitoring tool that works over JMX.

$ jvisualvm

After it starts, add a remote connection and enter the host name and the JMX port of the JVM you want to connect to.

It is convenient to open Tools -> Plugins and install every plugin you need in one go.
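
The numbers jvisualvm displays come from the JVM's platform MXBeans, which is what a JMX connection exposes. As a minimal sketch (the class name JmxSnapshot and its helper methods are my own, not part of any tool), the same beans can be read from inside the process. For a remote connection the target JVM is usually started with a flag such as -Dcom.sun.management.jmxremote.port=<port>; the exact port and security settings depend on your environment.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: reads a few values from the platform MXBeans,
// the same beans a JMX client sees over a remote connection.
public class JmxSnapshot {

    /** Heap bytes currently in use, as reported by the MemoryMXBean. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Number of live threads (daemon and non-daemon), from the ThreadMXBean. */
    static int liveThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("used heap (bytes): " + usedHeapBytes());
        System.out.println("live threads     : " + liveThreadCount());
    }
}
```

Running it prints one snapshot. A monitoring tool simply polls the same values over time.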


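[Java] ExecutorService instances provided by Executors

ITWeb/General Development 2017. 11. 24. 15:36

I was too lazy to google it, so I copied the comments and code straight from the source of java.util.concurrent.Executors.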

/**
 * Creates a thread pool that reuses a fixed number of threads
 * operating off a shared unbounded queue. At any point, at most
 * {@code nThreads} threads will be active processing tasks.
 * If additional tasks are submitted when all threads are active,
 * they will wait in the queue until a thread is available.
 * If any thread terminates due to a failure during execution
 * prior to shutdown, a new one will take its place if needed to
 * execute subsequent tasks. The threads in the pool will exist
 * until it is explicitly {@link ExecutorService#shutdown shutdown}.
 *
 * @param nThreads the number of threads in the pool
 * @return the newly created thread pool
 * @throws IllegalArgumentException if {@code nThreads <= 0}
 */
public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

/**
 * Creates a thread pool that maintains enough threads to support
 * the given parallelism level, and may use multiple queues to
 * reduce contention. The parallelism level corresponds to the
 * maximum number of threads actively engaged in, or available to
 * engage in, task processing. The actual number of threads may
 * grow and shrink dynamically. A work-stealing pool makes no
 * guarantees about the order in which submitted tasks are
 * executed.
 *
 * @param parallelism the targeted parallelism level
 * @return the newly created thread pool
 * @throws IllegalArgumentException if {@code parallelism <= 0}
 * @since 1.8
 */
public static ExecutorService newWorkStealingPool(int parallelism) {
    return new ForkJoinPool
        (parallelism,
         ForkJoinPool.defaultForkJoinWorkerThreadFactory,
         null, true);
}

/**
 * Creates a work-stealing thread pool using all
 * {@link Runtime#availableProcessors available processors}
 * as its target parallelism level.
 * @return the newly created thread pool
 * @see #newWorkStealingPool(int)
 * @since 1.8
 */
public static ExecutorService newWorkStealingPool() {
    return new ForkJoinPool
        (Runtime.getRuntime().availableProcessors(),
         ForkJoinPool.defaultForkJoinWorkerThreadFactory,
         null, true);
}

/**
 * Creates a thread pool that reuses a fixed number of threads
 * operating off a shared unbounded queue, using the provided
 * ThreadFactory to create new threads when needed. At any point,
 * at most {@code nThreads} threads will be active processing
 * tasks. If additional tasks are submitted when all threads are
 * active, they will wait in the queue until a thread is
 * available. If any thread terminates due to a failure during
 * execution prior to shutdown, a new one will take its place if
 * needed to execute subsequent tasks. The threads in the pool will
 * exist until it is explicitly {@link ExecutorService#shutdown
 * shutdown}.
 *
 * @param nThreads the number of threads in the pool
 * @param threadFactory the factory to use when creating new threads
 * @return the newly created thread pool
 * @throws NullPointerException if threadFactory is null
 * @throws IllegalArgumentException if {@code nThreads <= 0}
 */
public static ExecutorService newFixedThreadPool(int nThreads, ThreadFactory threadFactory) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>(),
                                  threadFactory);
}
/**
 * Creates an Executor that uses a single worker thread operating
 * off an unbounded queue. (Note however that if this single
 * thread terminates due to a failure during execution prior to
 * shutdown, a new one will take its place if needed to execute
 * subsequent tasks.) Tasks are guaranteed to execute
 * sequentially, and no more than one task will be active at any
 * given time. Unlike the otherwise equivalent
 * {@code newFixedThreadPool(1)} the returned executor is
 * guaranteed not to be reconfigurable to use additional threads.
 *
 * @return the newly created single-threaded Executor
 */
public static ExecutorService newSingleThreadExecutor() {
    return new FinalizableDelegatedExecutorService
        (new ThreadPoolExecutor(1, 1,
                                0L, TimeUnit.MILLISECONDS,
                                new LinkedBlockingQueue<Runnable>()));
}

/**
 * Creates an Executor that uses a single worker thread operating
 * off an unbounded queue, and uses the provided ThreadFactory to
 * create a new thread when needed. Unlike the otherwise
 * equivalent {@code newFixedThreadPool(1, threadFactory)} the
 * returned executor is guaranteed not to be reconfigurable to use
 * additional threads.
 *
 * @param threadFactory the factory to use when creating new
 * threads
 *
 * @return the newly created single-threaded Executor
 * @throws NullPointerException if threadFactory is null
 */
public static ExecutorService newSingleThreadExecutor(ThreadFactory threadFactory) {
    return new FinalizableDelegatedExecutorService
        (new ThreadPoolExecutor(1, 1,
                                0L, TimeUnit.MILLISECONDS,
                                new LinkedBlockingQueue<Runnable>(),
                                threadFactory));
}

/**
 * Creates a thread pool that creates new threads as needed, but
 * will reuse previously constructed threads when they are
 * available. These pools will typically improve the performance
 * of programs that execute many short-lived asynchronous tasks.
 * Calls to {@code execute} will reuse previously constructed
 * threads if available. If no existing thread is available, a new
 * thread will be created and added to the pool. Threads that have
 * not been used for sixty seconds are terminated and removed from
 * the cache. Thus, a pool that remains idle for long enough will
 * not consume any resources. Note that pools with similar
 * properties but different details (for example, timeout parameters)
 * may be created using {@link ThreadPoolExecutor} constructors.
 *
 * @return the newly created thread pool
 */
public static ExecutorService newCachedThreadPool() {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>());
}

/**
 * Creates a thread pool that creates new threads as needed, but
 * will reuse previously constructed threads when they are
 * available, and uses the provided
 * ThreadFactory to create new threads when needed.
 * @param threadFactory the factory to use when creating new threads
 * @return the newly created thread pool
 * @throws NullPointerException if threadFactory is null
 */
public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>(),
                                  threadFactory);
}
/**
 * Creates a single-threaded executor that can schedule commands
 * to run after a given delay, or to execute periodically.
 * (Note however that if this single
 * thread terminates due to a failure during execution prior to
 * shutdown, a new one will take its place if needed to execute
 * subsequent tasks.) Tasks are guaranteed to execute
 * sequentially, and no more than one task will be active at any
 * given time. Unlike the otherwise equivalent
 * {@code newScheduledThreadPool(1)} the returned executor is
 * guaranteed not to be reconfigurable to use additional threads.
 * @return the newly created scheduled executor
 */
public static ScheduledExecutorService newSingleThreadScheduledExecutor() {
    return new DelegatedScheduledExecutorService
        (new ScheduledThreadPoolExecutor(1));
}

/**
 * Creates a single-threaded executor that can schedule commands
 * to run after a given delay, or to execute periodically. (Note
 * however that if this single thread terminates due to a failure
 * during execution prior to shutdown, a new one will take its
 * place if needed to execute subsequent tasks.) Tasks are
 * guaranteed to execute sequentially, and no more than one task
 * will be active at any given time. Unlike the otherwise
 * equivalent {@code newScheduledThreadPool(1, threadFactory)}
 * the returned executor is guaranteed not to be reconfigurable to
 * use additional threads.
 * @param threadFactory the factory to use when creating new
 * threads
 * @return a newly created scheduled executor
 * @throws NullPointerException if threadFactory is null
 */
public static ScheduledExecutorService newSingleThreadScheduledExecutor(ThreadFactory threadFactory) {
    return new DelegatedScheduledExecutorService
        (new ScheduledThreadPoolExecutor(1, threadFactory));
}

/**
 * Creates a thread pool that can schedule commands to run after a
 * given delay, or to execute periodically.
 * @param corePoolSize the number of threads to keep in the pool,
 * even if they are idle
 * @return a newly created scheduled thread pool
 * @throws IllegalArgumentException if {@code corePoolSize < 0}
 */
public static ScheduledExecutorService newScheduledThreadPool(int corePoolSize) {
    return new ScheduledThreadPoolExecutor(corePoolSize);
}

/**
 * Creates a thread pool that can schedule commands to run after a
 * given delay, or to execute periodically.
 * @param corePoolSize the number of threads to keep in the pool,
 * even if they are idle
 * @param threadFactory the factory to use when the executor
 * creates a new thread
 * @return a newly created scheduled thread pool
 * @throws IllegalArgumentException if {@code corePoolSize < 0}
 * @throws NullPointerException if threadFactory is null
 */
public static ScheduledExecutorService newScheduledThreadPool(
        int corePoolSize, ThreadFactory threadFactory) {
    return new ScheduledThreadPoolExecutor(corePoolSize, threadFactory);
}

/**
 * Returns an object that delegates all defined {@link
 * ExecutorService} methods to the given executor, but not any
 * other methods that might otherwise be accessible using
 * casts. This provides a way to safely "freeze" configuration and
 * disallow tuning of a given concrete implementation.
 * @param executor the underlying implementation
 * @return an {@code ExecutorService} instance
 * @throws NullPointerException if executor null
 */
public static ExecutorService unconfigurableExecutorService(ExecutorService executor) {
    if (executor == null)
        throw new NullPointerException();
    return new DelegatedExecutorService(executor);
}

/**
 * Returns an object that delegates all defined {@link
 * ScheduledExecutorService} methods to the given executor, but
 * not any other methods that might otherwise be accessible using
 * casts. This provides a way to safely "freeze" configuration and
 * disallow tuning of a given concrete implementation.
 * @param executor the underlying implementation
 * @return a {@code ScheduledExecutorService} instance
 * @throws NullPointerException if executor null
 */
public static ScheduledExecutorService unconfigurableScheduledExecutorService(ScheduledExecutorService executor) {
    if (executor == null)
        throw new NullPointerException();
    return new DelegatedScheduledExecutorService(executor);
}


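  • newFixedThreadPool
    • Creates the given number of threads and reuses them. Until shutdown() is called explicitly, a thread that terminates because of a failure is replaced by a new one when needed.


  • newWorkStealingPool
    • Makes no guarantee about the order in which tasks run. It keeps enough threads to support the requested parallelism level and may use multiple queues to reduce contention. The number of threads grows and shrinks dynamically.


  • newSingleThreadExecutor
    • Uses a single worker thread. If that thread terminates because of a failure, a new one is created, and tasks are guaranteed to run sequentially.


  • newCachedThreadPool
    • Creates threads as needed and reuses idle ones. A thread that has not been used for 60 seconds is removed from the pool.
    • The 60 seconds is fixed by this factory method. For a different timeout, use the ThreadPoolExecutor constructor directly.


  • newSingleThreadScheduledExecutor
    • Creates a single thread that can run scheduled tasks. Apart from the scheduling capability, it behaves much like newSingleThreadExecutor.


  • newScheduledThreadPool
    • Creates a thread pool that can run scheduled tasks. The corePoolSize threads stay in the pool even when they are idle; they are not terminated or discarded.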
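
To make the summary above concrete, here is a minimal sketch of using one of these factories. The class FixedPoolDemo and the method sumOfSquares are made up for illustration; only Executors.newFixedThreadPool, submit, Future.get and shutdown come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: computes 1^2 + 2^2 + ... + n^2 on a fixed-size pool.
public class FixedPoolDemo {

    static long sumOfSquares(int nThreads, int n) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            // Submit one small task per number; extra tasks wait in the queue.
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                Callable<Long> task = () -> value * value;
                futures.add(pool.submit(task));
            }
            // Future.get() blocks until the corresponding task has finished.
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            // The pool's worker threads are non-daemon, so without shutdown()
            // they would keep the JVM alive.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4, 10)); // prints 385
    }
}
```

Swapping in Executors.newCachedThreadPool() or Executors.newWorkStealingPool() changes only the first line of the method. The result is the same because the sum does not depend on execution order.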



[Java] java.util.concurrent examples

ITWeb/General Development 2017. 11. 24. 15:03

For when googling feels like too much work.

http://tutorials.jenkov.com/java-util-concurrent/index.html

http://winterbe.com/posts/2015/04/07/java8-concurrency-tutorial-thread-executor-examples/


As for Java 8, the author of the second blog linked above seems to have organized the material well.

http://winterbe.com/java/


[Bash] for loop range and if condition

ITWeb/General Development 2017. 11. 22. 11:16

I can never remember even something this simple when I need it, so I am writing it down.


$ for i in {1..10}
do
  # fetch one part per iteration
  s3cmd get "s3://part-$i"
  # leave the loop once i is greater than 9
  if [ "$i" -gt 9 ]
  then
    break
  fi
done



[Hadoop] Native library installation on osx

Elastic/Hadoop 2017. 11. 21. 15:54

When I tried to read a snappy-compressed file directly from HDFS, I got the error message below.

I suspected the cause was that I had set up Hadoop from the prebuilt binary instead of building it from source, so I decided to download the source and build it.


[Error message]

Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


$ hadoop checknative -a
Native library checking:
hadoop:  false
zlib:    false
snappy:  false
lz4:     false
bzip2:   false
openssl: false

Since every check came back false like this, I went ahead with the build.


See the links below for reference.

Ref.

https://medium.com/@faizanahemad/hadoop-native-libraries-installation-on-mac-osx-d8338a6923db

https://gist.github.com/zedar/f631ace0759c1d512573


First install a few prerequisites with brew install.

$ brew install gcc autoconf automake libtool cmake snappy gzip bzip2 homebrew/versions/protobuf250 zlib openssl

Note that I already had protobuf 3.4.0 installed, so I had to downgrade.


A plain Hadoop build requires Maven 3.x or later.

- I built hadoop-2.7.2.


[Build and copy the native libraries]

$ ../apache-maven-3.5.0/bin/mvn package -Pdist,native -DskipTests -Dtar -e

$ vi .bash_profile
export OPENSSL_ROOT_DIR=/usr/local/Cellar/openssl/1.0.2m
export OPENSSL_INCLUDE_DIR=/usr/local/Cellar/openssl/1.0.2m/include
export PROTOC_HOME=/usr/local/opt/protobuf@2.5
export HADOOP_HOME=/Users/henry/Work/apps/hadoop-2.7.2
PATH=$PROTOC_HOME/bin:$HADOOP_HOME/bin:$HOME/bin:$PATH
export PATH

$ cp -r hadoop-dist/target/hadoop-2.7.2/lib/native/* /Users/henry/Work/apps/hadoop-2.7.2/lib/native/osx/

$ vi etc/hadoop/hadoop-env.sh
export HADOOP_HOME="/Users/henry/Work/apps/hadoop-2.7.2"
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native/osx"

$ vi etc/hadoop/core-site.xml
    <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
    </property>

With that done, run the check again.

$ hadoop checknative -a
17/11/21 15:45:16 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
17/11/21 15:45:16 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop:  true /Users/henry/Work/apps/hadoop-2.7.2/lib/native/osx/libhadoop.dylib
zlib:    true /usr/lib/libz.1.dylib
snappy:  true /usr/local/lib/libsnappy.1.dylib
lz4:     true revision:99
bzip2:   false
openssl: false build does not support openssl.


While building Hadoop from source, you run into the error below.

[INFO] Apache Hadoop Pipes ................................ FAILURE [  0.627 s]

In that case, the document below explains how to fix it.

Ref.

https://stackoverflow.com/questions/36818957/mac-hadoop-2-7-failed-to-execute-goal-org-apache-maven-pluginsmaven-antrun


$ cd hadoop/hadoop-tools/hadoop-pipes
$ vi pom.xml
...(omitted)...
<arg line="${basedir}/src/ -DJVM_ARCH_DATA_MODEL=64"/>
...(omitted)...

-> Change 64 to 6 here.



[Spark] Spark installation on osx

ITWeb/General Development 2017. 11. 18. 21:54

This is a basic installation.

I prefer downloading the binary and installing it by hand over using brew install, so I am posting the steps.

[Spark installation on osx]

Ref. https://isaacchanghau.github.io/2017/06/28/Spark-Installation-on-Mac-OS-X/


1. scala

https://www.scala-lang.org/download/

$ tar -xvzf scala-2.12.4.tgz
$ vi .bash_profile
export SCALA_HOME=/Users/henry/Work/apps/scala-2.12.4
PATH=$SCALA_HOME/bin:$PATH

2. spark

https://spark.apache.org/downloads.html

$ tar -xvzf spark-2.2.0-bin-hadoop2.7.tgz
$ vi .bash_profile
export SPARK_HOME=/Users/henry/Work/apps/spark-2.2.0-bin-hadoop2.7
PATH=$SPARK_HOME/bin:$PATH

$ cd /Users/henry/Work/apps/spark-2.2.0-bin-hadoop2.7/conf
$ cp spark-env.sh.template spark-env.sh
$ vim spark-env.sh
export SCALA_HOME=/Users/henry/Work/apps/scala-2.12.4
export SPARK_MASTER_IP=localhost
export SPARK_WORKER_MEMORY=1g

$ spark-shell


Both scala and spark have to be installed.

As shown above, set the environment variables in .bash_profile and then run the shell.

PATH=$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH

$ source .bash_profile
$ spark-shell
Spark context Web UI available at http://127.0.0.1:4040
Spark context available as 'sc' (master = local[*], app id = local-1511009262380).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_72)
Type in expressions to have them evaluated.
Type :help for more information.

scala>



[Elasticsearch] elasticsearch-arirang-analyzer-6.0.0 released

Elastic/Elasticsearch 2017. 11. 15. 23:49

I posted this on Facebook and it got deleted as spam. ㅡ.ㅡ;

https://github.com/HowookJeong/elasticsearch-analysis-arirang/tree/6.0.0

https://github.com/HowookJeong/elasticsearch-analysis-arirang/releases/download/6.0.0/elasticsearch-analysis-arirang-6.0.0.zip


As you probably know, there are two ways to install it: from a local file or directly from the release URL.

$ bin/elasticsearch-plugin install file:///elasticsearch-analysis-arirang-6.0.0.zip

$ bin/elasticsearch-plugin install https://github.com/HowookJeong/elasticsearch-analysis-arirang/releases/download/6.0.0/elasticsearch-analysis-arirang-6.0.0.zip


The versions used are as follows.

elasticsearch-6.0.0
lucene-7.0.1
arirang.lucene-analyzer-7.0.1
arirang.morph-1.1.0


If you are curious how the arirang plugin is built, see the post below.

[Elasticsearch] Arirang Analyzer + Elasticsearch Analyzer Plugin: a development review from a user's perspective



[Lucene] Inverted index file

ITWeb/General Search 2017. 11. 14. 23:15

Let's take a quick look at the files Lucene needs in order to search.

For the file structure and the list of files, see the document below.

Lucene Index File Formats

https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/lucene70/package-summary.html#package.description


The basic classes to read for the actual search path are

  • IndexSearcher
  • IndexReader
  • CollectionStatistics
  • TermStatistics

these four.


The information needed for a search is

  • Documents
  • Fields
  • Terms
  • FieldInvertState

these four.

It is easy to see why. For a query such as "searchField:elasticsearch" you need

  • information about the field named searchField,
  • information about the term elasticsearch,
  • information about the documents that contain the term elasticsearch, and
  • the offset and position at which the term was extracted within that field.
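
As a rough illustration of how those four pieces fit together, here is a toy in-memory inverted index. It is only a sketch under my own assumptions and has nothing to do with Lucene's actual file formats or classes: the class ToyInvertedIndex, whitespace tokenization and lowercasing are all made up for the example, and it stores token positions only, not character offsets.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model (not Lucene): field -> term -> docId -> token positions.
public class ToyInvertedIndex {

    private final Map<String, Map<String, TreeMap<Integer, List<Integer>>>> postings = new HashMap<>();

    /** Tokenizes the text on whitespace and records each term's positions for the document. */
    void add(int docId, String field, String text) {
        String[] tokens = text.trim().toLowerCase().split("\\s+");
        for (int position = 0; position < tokens.length; position++) {
            if (tokens[position].isEmpty()) {
                continue; // happens only when the text is blank
            }
            postings.computeIfAbsent(field, f -> new HashMap<>())
                    .computeIfAbsent(tokens[position], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position);
        }
    }

    /** Returns docId -> positions for a "field:term" lookup, or an empty map if nothing matches. */
    Map<Integer, List<Integer>> search(String field, String term) {
        Map<String, TreeMap<Integer, List<Integer>>> terms = postings.get(field);
        if (terms == null) {
            return Collections.emptyMap();
        }
        TreeMap<Integer, List<Integer>> docs = terms.get(term.toLowerCase());
        if (docs == null) {
            return Collections.emptyMap();
        }
        return docs;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "searchField", "Elasticsearch is built on Lucene");
        index.add(2, "searchField", "Lucene powers Elasticsearch and Solr");
        index.add(3, "title", "Elasticsearch");
        // Equivalent of the query "searchField:elasticsearch"
        System.out.println(index.search("searchField", "elasticsearch")); // prints {1=[0], 2=[2]}
    }
}
```

Document 3 is not returned because its term lives in a different field, which is why the lookup needs the field as well as the term.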


I wrote this up because someone asked me today whether there is any way to improve the performance of a custom function score query that they use for ranking term boosting across many fields.

Query tuning always has its limits.

It has to go hand in hand with structural improvements and tuning on the server side, and I recommend optimizing so that boosting many terms across many fields is kept to a minimum.

Also, "inverted index file" in Lucene does not mean one single file. It is better to think of the whole set of index files that Lucene maintains as making up the inverted index.


[Elasticsearch] Setting path in the _id mapping

Elastic/Elasticsearch 2017. 11. 14. 11:13

You will often want to use your data's primary key as the _id value.

My memory was hazy, so I looked it up.

The path setting still existed through 2.4 but was removed in 5.x.


2.4)

private String path = Defaults.PATH;

public Builder() {
    super(Defaults.NAME, new FieldType(Defaults.FIELD_TYPE));
    indexName = Defaults.INDEX_NAME;
}

public Builder path(String path) {
    this.path = path;
    return builder;
}


So if you want the primary key in the _id field, either call IndexRequestBuilder.setId() or put the primary key value into the _id field when you build the JSON.



[Gradle] Configuring the dynamic version cache

ITWeb/General Development 2017. 11. 2. 21:45

This is best read together with the earlier posts.


[Earlier posts]

[Gradle] Getting the dependency filename

[Gradle] Use latest version on dependency jar.


[Reference]

https://docs.gradle.org/current/dsl/org.gradle.api.artifacts.ResolutionStrategy.html


[Configuration]

configurations {
    dependencyFn
}

configurations.dependencyFn {
    resolutionStrategy.cacheDynamicVersionsFor 10, 'minutes'
}

dependencies {
    dependencyFn group: "org.apaceh.lucene.analysis.arirang", name: "arirang-dictionary",
                 version: "+", changing: true
}

