Hadoop monitoring with Cloud Insight

    Available in VPC

    You can use NAVER Cloud Platform's Cloud Insight to monitor Hadoop's performance and operation indicators, so that you can quickly detect and respond to failures.

    Preparations

    1. Create a Cloud Hadoop cluster.
    2. Request subscription to Cloud Insight.
      • Please refer to the Cloud Insight Guide for more information about requesting subscription to Cloud Insight.

    Dashboard configuration

    After completing the preparations, you can create a dashboard in the Cloud Insight console and add widgets for monitoring Cloud Hadoop.

    The following describes how to create a Cloud Insight dashboard and add widgets to it.

    1. From the VPC environment of the NAVER Cloud Platform console, click the Services > Management & Governance > Cloud Insight (Monitoring) menus, in that order.
    2. Click the [Create dashboard] button.
    3. Enter the dashboard name and description, and then click the [Create] button.
      hadoop-vpc-use-ex12_create1_vpc_en.png
    4. Click the [Add widget] button.
    5. Enter the widget's name, select the widget type, and then click the [Next] button.
      • This example uses the Time Series widget for the explanation.
        hadoop-vpc-use-ex12_create3_vpc_en.png
    6. Enter the widget settings as follows, and then click the [Next] button.
      hadoop-vpc-use-ex12_create4_vpc_en.png
      • Product Type: Cloud Hadoop(VPC)
      • Target: Select All available resources, and then select the cluster to monitor
        (To select a group instead, refer to Set target group)
      • Metric: Select All metrics, select the items to monitor, and then click the [Add selected item] button
        (To select a template instead, refer to Set rule template)
      • Set data list: Select the Dimension (properties), Interval (aggregation interval), and Aggregation (aggregation function) of each selected monitoring item
    7. After checking the set widget details, click the [Create] button.
      hadoop-vpc-use-ex12_create6_vpc_en.png
      • The widget will be added to the dashboard. You can monitor the Cloud Hadoop cluster through the added widget.
        hadoop-vpc-use-ex12_create7_vpc_en.png
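
The Interval and Aggregation settings in step 6 can be pictured with a small sketch. It is not Cloud Insight code: it assumes one sample per minute, and the aggregation names (AVG, MIN, MAX, SUM) and sample values are hypothetical placeholders that may not match what the console offers. It only illustrates how raw samples could be grouped into intervals and reduced to one plotted point each.

```python
from statistics import mean

# Hypothetical aggregation names; the console's actual options may differ.
AGGREGATIONS = {"AVG": mean, "MIN": min, "MAX": max, "SUM": sum}


def aggregate(samples, interval_minutes, aggregation):
    """Group (minute, value) samples into fixed intervals and reduce each group."""
    reduce_fn = AGGREGATIONS[aggregation]
    buckets = {}
    for minute, value in samples:
        # Every sample belongs to the interval that starts at a multiple of interval_minutes.
        bucket_start = minute - (minute % interval_minutes)
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, reduce_fn(values)) for start, values in sorted(buckets.items())]


# Ten made-up 1-minute samples of a metric such as apps_running.
samples = list(enumerate([2, 3, 3, 4, 6, 5, 5, 4, 2, 1]))

print(aggregate(samples, 5, "AVG"))  # [(0, 3.6), (5, 3.4)]
print(aggregate(samples, 5, "MAX"))  # [(0, 6), (5, 5)]
```

A longer interval smooths the chart, while MAX or MIN keeps short spikes visible that AVG would flatten.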

    Group and template settings

    You can group specific monitoring targets or save specific monitoring items (metrics) as a template so you can manage monitoring settings and widgets easily.

    Set target group

    The following describes how to create a target group and group specific monitoring targets.

    1. From the VPC environment of the NAVER Cloud Platform console, click the Services > Management & Governance > Cloud Insight (Monitoring) menus, in that order.
    2. Click the Configuration > Template menus, in that order.
    3. Click the [Target Group] tab, and then click the [Create target group] button.
    4. Enter the group settings as follows, and then click the [Create] button.
      hadoop-vpc-use-ex12_targetGroup2_vpc_en.png
      • Product Type: Cloud Hadoop(VPC)
      • Group name, Group description: Enter the group name and description
      • Available monitor targets: Select all monitoring targets to include in the group, and then click the add-selected icon (icon_hadoop-vpc-use-ex12_addSelected_vpc_ko.png)

    Set rule template

    The following describes how to set a rule template and save specific monitoring items as a template.

    1. From the VPC environment of the NAVER Cloud Platform console, click the Services > Management & Governance > Cloud Insight (Monitoring) menus, in that order.

    2. Click the Configuration > Template menus, in that order.

    3. Click the [Rule Template] tab, and then click the [Create rule template] button.

    4. Enter the template settings as follows, and then click the [Next] button.
      hadoop-vpc-use-ex12_template2_vpc_en.png

      • Product Type: Cloud Hadoop(VPC)
      • Template name, Description: Enter the template name and description
      • Select the monitoring items (metrics) to include in the template from each category tab
    5. Enter the monitoring conditions for each monitoring item by referring to the following, and then click the [Save] button.
      hadoop-vpc-use-ex12_template3_vpc_en.png

      • Dimension: Properties of the monitoring item
      • Level: Severity level of the event when it is triggered
      • Condition: Condition that triggers the event
      • Method: Aggregation function applied to the monitoring item
      • Duration: How long the condition must persist before the event is triggered
      Note

      The following example triggers an Info level event if the CPU/user_rto (cpu_idx:1) value of Cloud Hadoop (VPC) remains 0 for one minute.

      hadoop-vpc-use12_25_ko
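
How Condition and Duration combine can be sketched in a few lines. This is only an illustration of the idea, not how Cloud Insight evaluates rules internally: it assumes one value per minute that has already been reduced by the chosen Method, and the function name and operator symbols are hypothetical.

```python
import operator

# Hypothetical comparison symbols for the Condition field.
CONDITIONS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def rule_fires(values, condition, threshold, duration_minutes):
    """Return True if the latest duration_minutes values all satisfy the condition.

    values holds one already-aggregated value per minute, oldest first.
    """
    if duration_minutes <= 0 or len(values) < duration_minutes:
        return False  # not enough data to cover the whole duration
    compare = CONDITIONS[condition]
    recent = values[-duration_minutes:]
    return all(compare(value, threshold) for value in recent)


# Mirrors the note above: the value equals 0 for one minute.
print(rule_fires([1.5, 0.2, 0.0], "=", 0, 1))  # True
print(rule_fires([0.0, 0.3], "=", 0, 1))       # False
```

A longer Duration makes the rule less sensitive to a single outlying sample, at the cost of a later notification.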

    Set event

    You can select monitoring targets and items, create events by setting monitoring conditions and notification actions, and view the status of created events.

    Note

    This guide explains how to use Send notification message as an event's notification action. Please refer to the Cloud Insight Guide for information about other notification actions, including Integration, Cloud Functions, and Auto Scaling policies.

    The following shows how to set an event.

    1. From the VPC environment of the NAVER Cloud Platform console, click the Services > Management & Governance > Cloud Insight (Monitoring) menus, in that order.
    2. Click the Configuration > Event Rule menus, in that order.
    3. Click the [Event Rules] button.
    4. Select Cloud Hadoop (VPC) from the Select monitored product item, and then click the [Next] button.
      hadoop-vpc-use-ex12_eventRule2_vpc_en.png
    5. Select the individual monitoring target or monitoring group, and then click the [Next] button.
    6. Select the individual monitoring item or monitoring template, and then click the [Next] button.
    7. Select the notification recipient group from the [Send notification message] tab, and then click the [Next] button.
    8. After checking the set event details, click the [Create] button.
      hadoop-vpc-use-ex12_eventRule6_vpc_en.png

    View event status

    The following describes how to view the created event's status.

    1. From the VPC environment of the NAVER Cloud Platform console, click the Services > Management & Governance > Cloud Insight (Monitoring) menus, in that order.
    2. Click the Event menu.
      hadoop-vpc-use-ex12_Event_vpc_en.png

    Create notification recipient group

    The following describes how to create a notification recipient group to send event notification messages and add recipients.

    1. From the VPC environment of the NAVER Cloud Platform console, click the Services > Management & Governance > Cloud Insight (Monitoring) menus, in that order.
    2. Click the Notification Recipient menu.
    3. Click the plus icon button in the Target group list, enter the name of the group to create, and then click the check icon button.
      hadoop-vpc-use-ex12_noti1_vpc_en.png
    4. Click All targets from the Target group list.
    5. Select targets to assign to the created group, and then click the [Assign] button.
      • To add new targets, click the [Add target] button, and add targets by referring to the Cloud Insight Guide.
        hadoop-vpc-use-ex12_noti2_vpc_en.png
    6. Enter the information of the notification recipient to add, complete the identity authentication process, and then click the [Register] button.

    Cloud Hadoop Metric

    You can monitor the indicators below for all clusters created. Cloud Insight collects data for metrics at 1-minute intervals.

    Note

    If the cluster's HDFS and YARN do not operate normally, metrics are not collected and cannot be checked on the dashboard.

    Indicator                         Type     Unit   Explanation
    active_nodes                      INTEGER  num    Number of nodes presently running MapReduce tasks or jobs
    allocated_container               INTEGER  num    Number of resource containers allocated by the ResourceManager
    allocated_mb                      INTEGER  MB     Amount of memory allocated to the cluster
    allocated_v_cores                 INTEGER  num    Number of virtual cores currently allocated
    apps_completed                    INTEGER  num    Number of applications submitted to YARN that have completed
    apps_failed                       INTEGER  num    Number of applications submitted to YARN that have failed to complete
    apps_killed                       INTEGER  num    Number of applications submitted to YARN that have been killed
    apps_pending                      INTEGER  num    Number of applications submitted to YARN that are in a pending state
    apps_running                      INTEGER  num    Number of applications submitted to YARN that are running
    apps_submitted                    INTEGER  num    Number of applications submitted to YARN
    available_mb                      INTEGER  MB     Amount of memory available to be allocated
    capacity_remaining_gb             INTEGER  GB     Amount of remaining HDFS disk capacity
    corrupt_blocks                    INTEGER  num    Number of blocks that HDFS reports as corrupted
    decommissioned_nodes              INTEGER  num    Number of nodes allocated to MapReduce applications that have been marked in a DECOMMISSIONED state
    hdfs_bytes_read                   INTEGER  Bytes  Number of bytes read from HDFS
    hdfs_bytes_written                INTEGER  Bytes  Number of bytes written to HDFS
    hdfs_utilization                  FLOAT    %      Percentage of HDFS storage currently used
    lost_nodes                        INTEGER  num    Number of nodes allocated to MapReduce that have been marked in a LOST state
    missing_blocks                    INTEGER  num    Number of blocks for which HDFS has no replicas
    num_live_data_nodes               INTEGER  num    Number of data nodes that are receiving work from Hadoop
    pending_containers                INTEGER  num    Number of containers in the queue that have not yet been allocated
    pending_deletion_blocks           INTEGER  num    Number of blocks marked for deletion
    pending_replication_blocks        INTEGER  num    Number of blocks waiting to be replicated
    pending_v_cores                   INTEGER  num    Number of virtual cores requested but not yet allocated
    rebooted_nodes                    INTEGER  num    Number of nodes available to MapReduce that have been rebooted and marked in a REBOOTED state
    reserved_containers               INTEGER  num    Number of containers reserved
    reserved_mb                       INTEGER  MB     Amount of memory reserved
    total_load                        INTEGER  num    Total number of concurrent data transfers
    total_mb                          INTEGER  MB     Total amount of memory in the cluster
    total_nodes                       INTEGER  num    Number of nodes presently available to MapReduce jobs
    under_replicated_blocks           INTEGER  num    Number of blocks that need to be replicated one or more times
    unhealthy_nodes                   INTEGER  num    Number of nodes available to MapReduce jobs marked in an UNHEALTHY state
    yarn_memory_available_percentage  FLOAT    %      Percentage of remaining memory available to YARN (= available_mb / total_mb)
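
As a rough illustration of how the indicators above relate to each other, the sketch below recomputes yarn_memory_available_percentage from available_mb and total_mb and lists block indicators that are above zero. The function names and sample values are hypothetical, and it assumes the ratio is multiplied by 100 to express it as a percentage. It works on a plain dictionary of values and does not call any Cloud Insight API.

```python
# Block-level HDFS indicators that usually deserve attention when nonzero.
BLOCK_INDICATORS = ("missing_blocks", "corrupt_blocks", "under_replicated_blocks")


def yarn_memory_available_percentage(available_mb, total_mb):
    """Percentage of YARN memory still available (available_mb / total_mb * 100)."""
    if total_mb <= 0:
        return None  # nothing reported, e.g. while metrics are not being collected
    return available_mb / total_mb * 100.0


def block_warnings(metrics):
    """Return the names of block indicators whose value is above zero."""
    return [name for name in BLOCK_INDICATORS if metrics.get(name, 0) > 0]


# Made-up values for one collection interval.
metrics = {
    "available_mb": 4096,
    "total_mb": 16384,
    "missing_blocks": 0,
    "corrupt_blocks": 2,
    "under_replicated_blocks": 5,
}

print(yarn_memory_available_percentage(metrics["available_mb"], metrics["total_mb"]))  # 25.0
print(block_warnings(metrics))  # ['corrupt_blocks', 'under_replicated_blocks']
```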
