Using Elasticsearch


Available in VPC

Elasticsearch is a data store optimized for search that keeps unstructured data in JSON format. You can query it through its REST API.
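As a minimal sketch of how the REST API is used, the commands below index a JSON document and then search for it with curl. The address placeholder, the index name logs, and the document body are examples only; use the REST API address shown in your app's details (described below).

    # Index a JSON document (the index name "logs" and the document body are placeholders).
    curl -XPUT 'http://<elasticsearch.hosts address>/logs/_doc/1' \
      -H 'Content-Type: application/json' \
      -d '{"message": "hello data forest", "level": "INFO"}'

    # Search for documents whose message field contains "hello".
    curl -XGET 'http://<elasticsearch.hosts address>/logs/_search?q=message:hello&pretty'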

Check Elasticsearch app details

Once the app is created, you can view its details. If the Status in the app details is Stable, the app is running normally.
To view app details:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Forest.
  2. Click Data Forest > Apps on the left.
  3. Select the account that owns the app.
  4. Click the app to view its details.
  5. Review the app details.
    • Quick links
      • elasticsearch.hosts: Elasticsearch REST API address (external network)
      • shell-es-coord-0: Coordinating node's web shell URL
      • supervisor-es-ingest-0: Supervisor URL for managing ingest node processes, not created by default and unavailable unless specified
      • shell-es-master-0: Master node's web shell URL
      • supervisor-es-coord-0: Supervisor URL for managing coordinating node processes
      • supervisor-es-data-0: Supervisor URL for managing data node processes
      • shell-es-data-0: Data node's web shell URL
      • shell-es-ingest-0: Web shell URL for the ingest node, not created by default and unavailable unless specified
      • supervisor-es-master-0: Supervisor URL for managing master node processes
    • Connection String
      • elasticsearch.hosts.inside-of-cluster: Elasticsearch REST API address (internal network)
    • Component: The default values are the recommended resources.
      • es-master: Component that performs the master role on the es server
      • es-data: Component that performs the data storage role on the es server
      • es-ingest: Component that performs the ingest pipeline role on the es server
      • es-coord: Component that performs the coordinating role on the es server

REST API address list

Elasticsearch has 2 HTTP REST API addresses. Use the appropriate API address depending on the situation.

  • elasticsearch.hosts: Use this to access Elasticsearch from outside the Data Forest network.
  • elasticsearch.hosts.inside-of-cluster: Use this to access Elasticsearch from within Data Forest apps, such as when integrating with Kibana.
  • Access from the Kibana app to the Elasticsearch app: elasticsearch.hosts.inside-of-cluster
  • Access from a development environment built on the Dev app to the Elasticsearch app: elasticsearch.hosts.inside-of-cluster
  • Logstash running on its own server: elasticsearch.hosts
  • curl running on servers outside Data Forest: elasticsearch.hosts
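For example, a simple cluster health check with curl is identical in both cases; only the address differs depending on where the command runs. The addresses below are placeholders for the values shown in your app's Quick links and Connection String.

    # From a server outside Data Forest (for example, your own Logstash server):
    curl -XGET 'http://<elasticsearch.hosts address>/_cluster/health?pretty'

    # From inside a Data Forest app (for example, a web shell on the Dev app):
    curl -XGET 'http://<elasticsearch.hosts.inside-of-cluster address>/_cluster/health?pretty'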
Note

The elasticsearch.hosts address goes through an HTTP proxy server. If elasticsearch.hosts.inside-of-cluster is available, it is recommended over accessing Elasticsearch through the HTTP proxy.

Change the number of Elasticsearch app containers

To adjust the number of containers:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Forest.
  2. Click Data Forest > Apps on the left.
  3. Select the account that owns the app.
  4. Click the app to view its details, and click [Flex].
  5. When the Flex window appears, edit the number of containers and click [Edit].
Note
  • es-master and es-data components don't support the Flex feature. Other components can be flexed freely.

Cautions for using Elasticsearch app

Node failure

The Elasticsearch app runs only one data node per physical node. Therefore, with replica = 1 (meaning one primary and one replica copy), data is not lost even if one data node fails. At least two machines are required for es-data.

Note
  • Elasticsearch 7.3's default values are 1 shard and 1 replica. For more information, see "Index settings" or "Index creation no longer defaults to five shards."
  • Elasticsearch 6.7's default values are 5 shards and 1 replica.
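If you don't want to rely on these version-specific defaults, you can set the shard and replica counts explicitly when creating an index. The index name and the values below are examples only.

    # Create an index with an explicit shard and replica count.
    curl -XPUT 'http://<elasticsearch.hosts address>/my-index' \
      -H 'Content-Type: application/json' \
      -d '{
        "settings": {
          "number_of_shards": 3,
          "number_of_replicas": 1
        }
      }'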

Out of Memory

When creating the Elasticsearch app, you specify the resources (including memory) it requires. If the process uses more memory than specified, the OS's OOM Killer kills the Elasticsearch process. Because Elasticsearch processes are managed by supervisors, a process killed due to OOM is started again by the supervisor, but OOM is highly likely to recur.
In such cases, you must either increase the number of Elasticsearch containers and split the shards to reduce the load on each node, or increase the container resources. However, you can't change the resources of an Elasticsearch app that's already running. If a resource change is required, create the Elasticsearch app again with increased memory and either back up the data to HDFS (see repository-hdfs) and restore it, or use the Reindex from remote feature to transfer the indices to the new Elasticsearch app.
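Reindex from remote is a standard Elasticsearch API; a minimal sketch is shown below. The addresses and the index name are placeholders, and the old cluster's address must be allowed via the reindex.remote.whitelist setting in the new cluster's elasticsearch.yml.

    # Run against the NEW Elasticsearch app; it pulls documents from the old one.
    curl -XPOST 'http://<new elasticsearch.hosts address>/_reindex' \
      -H 'Content-Type: application/json' \
      -d '{
        "source": {
          "remote": { "host": "http://<old elasticsearch.hosts address>" },
          "index": "my-index"
        },
        "dest": { "index": "my-index" }
      }'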

To check if the Elasticsearch process has been restarted after being killed due to OOM:

  • Check the node's uptime in cerebro. If a node's uptime is shorter than that of the other nodes, its Elasticsearch process has likely been restarted. (You can also compare uptimes through the REST API; see the example after the dmesg output below.)
  • Check the node's supervisord.log (see Check logs and settings). If the log contains the following content, the Elasticsearch process has likely been restarted due to OOM.
    # The Elasticsearch process was killed by SIGKILL.
    2020-04-22 14:44:35,390 INFO exited: elasticsearch (terminated by SIGKILL; not expected)
    
    # The supervisor started Elasticsearch again.
    2020-04-22 14:44:36,395 INFO spawned: 'elasticsearch' with pid 508
    2020-04-22 14:44:37,396 INFO success: elasticsearch entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    

The supervisor log shows that the Elasticsearch process was killed with SIGKILL, but not why the SIGKILL was sent. Since the OOM Killer kills processes with SIGKILL, it can be assumed that the process was killed by the OOM Killer.

  • You can use the dmesg command to check if OOM occurred.

    [magnum@ac3m8x2240.bdp ~]$ dmesg
    
    ...
    [13460863.662062] elasticsearch[e invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=1000
    [13460863.662066] elasticsearch[e cpuset=aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842 mems_allowed=0-1
    [13460863.662069] CPU: 14 PID: 27043 Comm: elasticsearch[e Not tainted 3.10.0-862.14.4.el7.x86_64 #1
    [13460863.662071] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 09/12/2019
    [13460863.662072] Call Trace:
    [13460863.662080]  [<ffffffffb7f13754>] dump_stack+0x19/0x1b
    [13460863.662085]  [<ffffffffb7f0e91f>] dump_header+0x90/0x229
    [13460863.662091]  [<ffffffffb799a7e6>] ? find_lock_task_mm+0x56/0xc0
    [13460863.662096]  [<ffffffffb7a0f678>] ? try_get_mem_cgroup_from_mm+0x28/0x60
    [13460863.662098]  [<ffffffffb799ac94>] oom_kill_process+0x254/0x3d0
    [13460863.662101]  [<ffffffffb7a13486>] mem_cgroup_oom_synchronize+0x546/0x570
    [13460863.662103]  [<ffffffffb7a12900>] ? mem_cgroup_charge_common+0xc0/0xc0
    [13460863.662105]  [<ffffffffb799b524>] pagefault_out_of_memory+0x14/0x90
    [13460863.662108]  [<ffffffffb7f0cac1>] mm_fault_error+0x6a/0x157
    [13460863.662112]  [<ffffffffb7f20846>] __do_page_fault+0x496/0x4f0
    [13460863.662113]  [<ffffffffb7f208d5>] do_page_fault+0x35/0x90
    [13460863.662115]  [<ffffffffb7f1c758>] page_fault+0x28/0x30
    [13460863.662118] Task in /yarn/container_e62_1584425817444_92362_01_000002/aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842 killed as a result of limit of /yarn/container_e62_1584425817444_92362_01_000002/aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842
    [13460863.662121] memory: usage 2097152kB, limit 2097152kB, failcnt 1448215
    [13460863.662122] memory+swap: usage 2097152kB, limit 4194304kB, failcnt 0
    [13460863.662123] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
    [13460863.662124] Memory cgroup stats for /yarn/container_e62_1584425817444_92362_01_000002/aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842: cache:332KB rss:2096820KB rss_huge:0KB mapped_file:320KB swap:0KB inactive_anon:546160KB active_anon:1550660KB inactive_file:8KB active_file:0KB unevictable:0KB
    [13460863.662138] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
    [13460863.662272] [25374] 20033 25374     3151       67      12        0             0 bash
    [13460863.662274] [25813] 20033 25813    29910     2391      62        0             0 supervisord
    [13460863.662276] [25818] 20033 25818   178152     1221      20        0             0 gotty
    [13460863.662279] [26644] 20033 26644     3184      108      12        0             0 bash
    [13460863.662282] [26703] 20033 26703   954936   519887    1130        0          1000 java
    [13460863.662285] Memory cgroup out of memory: Kill process 27043 (elasticsearch[e) score 1993 or sacrifice child
    [13460863.663561] Killed process 26703 (java) total-vm:3819744kB, anon-rss:2079548kB, file-rss:0kB, shmem-rss:0kB
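In addition to cerebro, you can also compare node uptimes directly through the REST API with the _cat/nodes API. The address below is a placeholder for your app's REST API address.

    # List each node's name and uptime; a node with a noticeably shorter uptime
    # has likely been restarted (for example, after being killed by the OOM Killer).
    curl -XGET 'http://<elasticsearch.hosts address>/_cat/nodes?v&h=name,uptime'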
    

Data retention

The Elasticsearch app's data is saved under /home1/{USER} and is not deleted when you stop the app or when the app is terminated due to other problems.

However, data may be lost in the following situations:

  • The data is deleted when you destroy a stopped app.
  • When you start a stopped app again, it normally runs on the node that holds the previous session's data, so the data is recovered. However, if other jobs are occupying that node's resources, or the node has been excluded from service due to a failure, the app runs on another node once yarn.service.placement-history.timeout.ms (default: 1 hour) has elapsed. Therefore, the data of an Elasticsearch app that has not run within the time set by yarn.service.placement-history.timeout.ms may be lost.

Back up important data to HDFS by referring to repository-hdfs.
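As a sketch of such a backup, the repository-hdfs plugin lets you register an HDFS snapshot repository and then use the standard snapshot API. The repository name, HDFS URI, and path below are placeholders; follow the repository-hdfs guide for the values and security settings that apply to your cluster.

    # Register an HDFS snapshot repository (URI and path are placeholders).
    curl -XPUT 'http://<elasticsearch.hosts address>/_snapshot/my_hdfs_repository' \
      -H 'Content-Type: application/json' \
      -d '{
        "type": "hdfs",
        "settings": {
          "uri": "hdfs://<namenode address>:8020/",
          "path": "elasticsearch/repositories/my_hdfs_repository"
        }
      }'

    # Take a snapshot of all indices into the repository.
    curl -XPUT 'http://<elasticsearch.hosts address>/_snapshot/my_hdfs_repository/snapshot_1?wait_for_completion=true'

    # Restore the snapshot on the new Elasticsearch app (after registering the same repository there).
    curl -XPOST 'http://<new elasticsearch.hosts address>/_snapshot/my_hdfs_repository/snapshot_1/_restore'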

Note

If the Elasticsearch app directly affects your service, run it on a dedicated queue whenever possible. Common queues are shared by multiple users, so the required resources may not be available at the time you need them.