Using Elasticsearch


Available in VPC

Elasticsearch is a data store optimized for search that keeps unstructured data in JSON format. You can query it through its REST API.
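As a minimal sketch of how the REST API is used, the commands below index a JSON document and then search for it with curl. The address placeholder, the index name logs, and the document body are examples only; use the REST API address shown in your app's details (described below).

    # Index a JSON document (the index name "logs" and the document body are placeholders).
    curl -XPUT 'http://<elasticsearch.hosts address>/logs/_doc/1' \
      -H 'Content-Type: application/json' \
      -d '{"message": "hello data forest", "level": "INFO"}'

    # Search for documents whose message field contains "hello".
    curl -XGET 'http://<elasticsearch.hosts address>/logs/_search?q=message:hello&pretty'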

Check Elasticsearch app details

Once the app is created, you can view its details. If the Status in the app details is Stable, the app is running normally.
To view app details:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Forest.
  2. Click Data Forest > Apps on the left.
  3. Select the account that owns the app.
  4. Click the app to view its details.
  5. Review the app details.
    • Quick links
      • elasticsearch.hosts: Elasticsearch REST API address (external network)
      • shell-es-coord-0: Coordinating node's web shell URL
      • supervisor-es-ingest-0: Supervisor URL for managing ingest node processes, not created by default and unavailable unless specified
      • shell-es-master-0: Master node's web shell URL
      • supervisor-es-coord-0: Supervisor URL for managing coordinating node processes
      • supervisor-es-data-0: Supervisor URL for managing data node processes
      • shell-es-data-0: Data node's web shell URL
      • shell-es-ingest-0: Web shell URL for the ingest node, not created by default and unavailable unless specified
      • supervisor-es-master-0: Supervisor URL for managing master node processes
    • Connection String
      • elasticsearch.hosts.inside-of-cluster: Elasticsearch REST API address (internal network)
    • Component: The default values are the recommended resources.
      • es-master: Component that performs the master role on the es server
      • es-data: Component that performs the data storage role on the es server
      • es-ingest: Component that performs the ingest pipeline role on the es server
      • es-coord: Component that performs the coordinating role on the es server

REST API address list

Elasticsearch has 2 HTTP REST API addresses. Use the appropriate API address depending on the situation.

  • elasticsearch.hosts: Use this to access Elasticsearch from outside the Data Forest network.
  • elasticsearch.hosts.inside-of-cluster: Use this to access Elasticsearch from within Data Forest apps, such as when integrating with Kibana.
  • Access from the Kibana app to the Elasticsearch app: elasticsearch.hosts.inside-of-cluster
  • Access from a development environment built on the Dev app to the Elasticsearch app: elasticsearch.hosts.inside-of-cluster
  • Logstash running on its own server: elasticsearch.hosts
  • curl running on servers outside Data Forest: elasticsearch.hosts
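For example, a simple cluster health check with curl is identical in both cases; only the address differs depending on where the command runs. The addresses below are placeholders for the values shown in your app's Quick links and Connection String.

    # From a server outside Data Forest (for example, your own Logstash server):
    curl -XGET 'http://<elasticsearch.hosts address>/_cluster/health?pretty'

    # From inside a Data Forest app (for example, a web shell on the Dev app):
    curl -XGET 'http://<elasticsearch.hosts.inside-of-cluster address>/_cluster/health?pretty'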
Note

The elasticsearch.hosts address goes through an HTTP proxy server. If elasticsearch.hosts.inside-of-cluster is available, it is recommended over accessing Elasticsearch through the HTTP proxy.

Change the number of Elasticsearch app containers

To adjust the number of containers:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Forest.
  2. Click Data Forest > Apps on the left.
  3. Select the account that owns the app.
  4. Click the app to view its details, and click [Flex].
  5. When the Flex window appears, edit the number of containers and click [Edit].
Note
  • es-master and es-data components don't support the Flex feature. Other components can be flexed freely.

Cautions for using Elasticsearch app

Node failure

The Elasticsearch app runs only one data node per physical node. Therefore, with replica = 1 (meaning one primary and one replica copy), data is not lost even if one data node fails. At least two machines are required for es-data.

Note
  • Elasticsearch 7.3's default values are 1 shard and 1 replica. For more information, see "Index settings" or "Index creation no longer defaults to five shards."
  • Elasticsearch 6.7's default values are 5 shards and 1 replica.
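If you don't want to rely on these version-specific defaults, you can set the shard and replica counts explicitly when creating an index. The index name and the values below are examples only.

    # Create an index with an explicit shard and replica count.
    curl -XPUT 'http://<elasticsearch.hosts address>/my-index' \
      -H 'Content-Type: application/json' \
      -d '{
        "settings": {
          "number_of_shards": 3,
          "number_of_replicas": 1
        }
      }'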

Out of Memory

When creating the Elasticsearch app, you specify the resources (including memory) it requires. If the process uses more memory than specified, the OS's OOM Killer kills the Elasticsearch process. Because Elasticsearch processes are managed by supervisors, a process killed due to OOM is started again by the supervisor, but OOM is highly likely to recur.
In such cases, you must either increase the number of Elasticsearch containers and split the shards to reduce the load on each node, or increase the container resources. However, you can't change the resources of an Elasticsearch app that's already running. If a resource change is required, create the Elasticsearch app again with increased memory and either back up the data to HDFS (see repository-hdfs) and restore it, or use the Reindex from remote feature to transfer the indices to the new Elasticsearch app.
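Reindex from remote is a standard Elasticsearch API; a minimal sketch is shown below. The addresses and the index name are placeholders, and the old cluster's address must be allowed via the reindex.remote.whitelist setting in the new cluster's elasticsearch.yml.

    # Run against the NEW Elasticsearch app; it pulls documents from the old one.
    curl -XPOST 'http://<new elasticsearch.hosts address>/_reindex' \
      -H 'Content-Type: application/json' \
      -d '{
        "source": {
          "remote": { "host": "http://<old elasticsearch.hosts address>" },
          "index": "my-index"
        },
        "dest": { "index": "my-index" }
      }'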

To check if the Elasticsearch process has been restarted after being killed due to OOM:

  • Check the node's uptime in cerebro. If a node's uptime is shorter than that of the other nodes, its Elasticsearch process has likely been restarted. (You can also compare uptimes through the REST API; see the example after the dmesg output below.)
  • Check the node's supervisord.log (see Check logs and settings). If the log contains the following content, the Elasticsearch process has likely been restarted due to OOM.
    # The Elasticsearch process was killed by SIGKILL.
    2020-04-22 14:44:35,390 INFO exited: elasticsearch (terminated by SIGKILL; not expected)
    
    # The supervisor started Elasticsearch again.
    2020-04-22 14:44:36,395 INFO spawned: 'elasticsearch' with pid 508
    2020-04-22 14:44:37,396 INFO success: elasticsearch entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    

The supervisor log shows that the Elasticsearch process was killed with SIGKILL, but not why the SIGKILL was sent. Since the OOM Killer kills processes with SIGKILL, it can be assumed that the process was killed by the OOM Killer.

  • You can use the dmesg command to check if OOM occurred.

    [magnum@ac3m8x2240.bdp ~]$ dmesg
    
    ...
    [13460863.662062] elasticsearch[e invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=1000
    [13460863.662066] elasticsearch[e cpuset=aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842 mems_allowed=0-1
    [13460863.662069] CPU: 14 PID: 27043 Comm: elasticsearch[e Not tainted 3.10.0-862.14.4.el7.x86_64 #1
    [13460863.662071] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 09/12/2019
    [13460863.662072] Call Trace:
    [13460863.662080]  [<ffffffffb7f13754>] dump_stack+0x19/0x1b
    [13460863.662085]  [<ffffffffb7f0e91f>] dump_header+0x90/0x229
    [13460863.662091]  [<ffffffffb799a7e6>] ? find_lock_task_mm+0x56/0xc0
    [13460863.662096]  [<ffffffffb7a0f678>] ? try_get_mem_cgroup_from_mm+0x28/0x60
    [13460863.662098]  [<ffffffffb799ac94>] oom_kill_process+0x254/0x3d0
    [13460863.662101]  [<ffffffffb7a13486>] mem_cgroup_oom_synchronize+0x546/0x570
    [13460863.662103]  [<ffffffffb7a12900>] ? mem_cgroup_charge_common+0xc0/0xc0
    [13460863.662105]  [<ffffffffb799b524>] pagefault_out_of_memory+0x14/0x90
    [13460863.662108]  [<ffffffffb7f0cac1>] mm_fault_error+0x6a/0x157
    [13460863.662112]  [<ffffffffb7f20846>] __do_page_fault+0x496/0x4f0
    [13460863.662113]  [<ffffffffb7f208d5>] do_page_fault+0x35/0x90
    [13460863.662115]  [<ffffffffb7f1c758>] page_fault+0x28/0x30
    [13460863.662118] Task in /yarn/container_e62_1584425817444_92362_01_000002/aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842 killed as a result of limit of /yarn/container_e62_1584425817444_92362_01_000002/aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842
    [13460863.662121] memory: usage 2097152kB, limit 2097152kB, failcnt 1448215
    [13460863.662122] memory+swap: usage 2097152kB, limit 4194304kB, failcnt 0
    [13460863.662123] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
    [13460863.662124] Memory cgroup stats for /yarn/container_e62_1584425817444_92362_01_000002/aceca4b30ed9b4af3b3fab0ada711021f05dfecf68ebfdcfcdaa54a0ab516842: cache:332KB rss:2096820KB rss_huge:0KB mapped_file:320KB swap:0KB inactive_anon:546160KB active_anon:1550660KB inactive_file:8KB active_file:0KB unevictable:0KB
    [13460863.662138] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
    [13460863.662272] [25374] 20033 25374     3151       67      12        0             0 bash
    [13460863.662274] [25813] 20033 25813    29910     2391      62        0             0 supervisord
    [13460863.662276] [25818] 20033 25818   178152     1221      20        0             0 gotty
    [13460863.662279] [26644] 20033 26644     3184      108      12        0             0 bash
    [13460863.662282] [26703] 20033 26703   954936   519887    1130        0          1000 java
    [13460863.662285] Memory cgroup out of memory: Kill process 27043 (elasticsearch[e) score 1993 or sacrifice child
    [13460863.663561] Killed process 26703 (java) total-vm:3819744kB, anon-rss:2079548kB, file-rss:0kB, shmem-rss:0kB
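In addition to cerebro, you can also compare node uptimes directly through the REST API with the _cat/nodes API. The address below is a placeholder for your app's REST API address.

    # List each node's name and uptime; a node with a noticeably shorter uptime
    # has likely been restarted (for example, after being killed by the OOM Killer).
    curl -XGET 'http://<elasticsearch.hosts address>/_cat/nodes?v&h=name,uptime'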
    

Data retention

The Elasticsearch app's data is saved under /home1/{USER} and is not deleted when you stop the app or when the app is terminated due to other problems.

However, data may be lost in the following situations:

  • The data is deleted when you destroy a stopped app.
  • When you start a stopped app again, it normally runs on the node that holds the previous session's data, so the data is recovered. However, if other jobs are occupying that node's resources, or the node has been excluded from service due to a failure, the app runs on another node once yarn.service.placement-history.timeout.ms (default: 1 hour) has elapsed. Therefore, the data of an Elasticsearch app that has not run within the time set by yarn.service.placement-history.timeout.ms may be lost.

Back up important data to HDFS by referring to repository-hdfs.
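As a sketch of such a backup, the repository-hdfs plugin lets you register an HDFS snapshot repository and then use the standard snapshot API. The repository name, HDFS URI, and path below are placeholders; follow the repository-hdfs guide for the values and security settings that apply to your cluster.

    # Register an HDFS snapshot repository (URI and path are placeholders).
    curl -XPUT 'http://<elasticsearch.hosts address>/_snapshot/my_hdfs_repository' \
      -H 'Content-Type: application/json' \
      -d '{
        "type": "hdfs",
        "settings": {
          "uri": "hdfs://<namenode address>:8020/",
          "path": "elasticsearch/repositories/my_hdfs_repository"
        }
      }'

    # Take a snapshot of all indices into the repository.
    curl -XPUT 'http://<elasticsearch.hosts address>/_snapshot/my_hdfs_repository/snapshot_1?wait_for_completion=true'

    # Restore the snapshot on the new Elasticsearch app (after registering the same repository there).
    curl -XPOST 'http://<new elasticsearch.hosts address>/_snapshot/my_hdfs_repository/snapshot_1/_restore'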

Note

If the Elasticsearch app directly affects your service, run it on a dedicated queue whenever possible. Common queues are shared by multiple users, so the required resources may not be available at the time you need them.