Using the Monitoring console

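Available in a VPC environment.

Monitoring provides two dashboards that let you view monitoring information about Cloud Hadoop performance and records. The Monitoring service is included in NAVER Cloud Platform's Cloud Hadoop and can be used at no additional charge.

Monitoring provides the following dashboards:

  • HADOOP Dashboard: monitoring information about the running Cloud Hadoop cluster
  • OS Dashboard: hardware and network information for each server of the running Cloud Hadoop cluster

With these two dashboards, you can view Cloud Hadoop information and per-server hardware and network metrics for the last two months. Each dashboard consists of charts. You can print a specific chart or download it to your local PC in various file formats, which helps you work more efficiently.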

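Note

You can configure monitoring so that a metric exceeding a threshold, or meeting a specific condition, is recognized as an event and a notification is sent to you. For details on configuring events and notifications, see "Monitoring Cloud Hadoop with Cloud Insight".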
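
As a rough illustration of what such a threshold rule amounts to, the following Python sketch checks metric samples against a threshold. It is not the Cloud Insight API: the `ThresholdRule` class, its fields, and the sample values are hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass
import operator

# Hypothetical comparison table for illustration; not part of Cloud Insight.
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass(frozen=True)
class ThresholdRule:
    """A made-up rule: compare one metric against a fixed threshold."""
    metric: str
    op: str
    threshold: float

    def is_event(self, value: float) -> bool:
        # True when the sample satisfies the rule's condition.
        return _OPS[self.op](value, self.threshold)


# Assumed example: flag HDFS utilization samples above 80 %.
rule = ThresholdRule(metric="hdfs_utilization", op=">", threshold=80.0)
samples = [72.5, 79.9, 80.0, 85.3]
events = [v for v in samples if rule.is_event(v)]
print(events)
```

In the actual service, rules and notification targets are configured through Cloud Insight rather than in code like this.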

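Starting Monitoring

  1. In the NAVER Cloud Platform console, click Services > Big Data & Analytics > Cloud Hadoop.
  2. Click the [Create cluster] button and create a Cloud Hadoop cluster.
  3. Click Cloud Hadoop > Monitoring in the left menu.
  4. In the Cloud Hadoop cluster list, click the cluster you want to monitor.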

Monitoring screen

The basics of using Monitoring are as follows:

chadoop-vpc-monitoring1_ko

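  • In the left area, you can select a running Cloud Hadoop cluster and the servers of each cluster.
  • Clicking a cluster name displays the HADOOP Dashboard in the right area. Clicking a server under the cluster name displays the OS Dashboard.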

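Viewing the Monitoring dashboards

Each Monitoring dashboard consists of multiple charts. For each cluster, you can open the dashboard you want and see the information you need at a glance. The dashboards are used as follows: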

HADOOP Dashboard

chadoop-vpc-monitoring2_ko

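  • In the Cloud Hadoop cluster list on the left, click a cluster to display its HADOOP Dashboard on the right.
    • Data on the HADOOP Dashboard is collected at one-minute intervals.
    • Monitoring values are shown as averages, and the aggregation interval varies with the type of period selected.
  • The metrics available in each group are as follows: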
| Group | Metric | Unit | Description |
| --- | --- | --- | --- |
| Apps | apps_completed | num | number of applications submitted to YARN that have completed |
| Apps | apps_failed | num | number of applications submitted to YARN that have failed to complete |
| Apps | apps_killed | num | number of applications submitted to YARN that have been killed |
| Apps | apps_pending | num | number of applications submitted to YARN that are in a pending state |
| Apps | apps_running | num | number of applications submitted to YARN that are running |
| Apps | apps_submitted | num | number of applications submitted to YARN |
| Blocks | corrupt_blocks | num | number of blocks that HDFS reports as corrupted |
| Blocks | missing_blocks | num | number of blocks for which HDFS has no replicas |
| Blocks | pending_deletion_blocks | num | number of blocks marked for deletion |
| Blocks | pending_replication_blocks | num | number of blocks pending replication |
| Blocks | under_replicated_blocks | num | number of blocks that need to be replicated one or more times |
| Containers | allocated_container | num | number of resource containers allocated by the ResourceManager |
| Containers | pending_containers | num | number of containers in the queue that have not yet been allocated |
| Containers | reserved_containers | num | number of containers reserved |
| HDFS capacity(GB) | capacity_remaining_gb | GB | amount of remaining HDFS disk capacity |
| HDFS read/write(bytes) | hdfs_bytes_read | num | number of bytes read from HDFS |
| HDFS read/write(bytes) | hdfs_bytes_written | num | number of bytes written to HDFS |
| HDFS utilization(%) | hdfs_utilization | % | percentage of HDFS storage currently used |
| Memory(MB) | allocated_mb | MB | amount of memory allocated to the cluster |
| Memory(MB) | available_mb | MB | amount of memory available to be allocated |
| Memory(MB) | reserved_mb | MB | amount of memory reserved |
| Memory(MB) | total_mb | MB | total amount of memory in the cluster |
| Nodes | num_live_data_nodes | num | number of data nodes that are receiving work from Hadoop |
| Nodes | unhealthy_nodes | num | number of nodes available to MapReduce jobs marked in an UNHEALTHY state |
| Nodes | active_nodes | num | number of nodes presently running MapReduce tasks or jobs |
| Nodes | decommissioned_nodes | num | number of nodes allocated to MapReduce applications that have been marked in a DECOMMISSIONED state |
| Nodes | lost_nodes | num | number of nodes allocated to MapReduce that have been marked in a LOST state |
| Nodes | rebooted_nodes | num | number of nodes available to MapReduce that have been rebooted and marked in a REBOOTED state |
| Nodes | total_nodes | num | number of nodes presently available to MapReduce jobs |
| V_cores | allocated_v_cores | num | number of virtual cores currently allocated |
| V_cores | pending_v_cores | num | number of virtual cores waiting to be allocated |
| Data transfers | total_load | num | total number of concurrent data transfers |
| YARN memory(%) | yarn_memory_available_percentage | % | percentage of remaining memory available to YARN (= available_mb / total_mb) |
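  • You can monitor changes in cluster metrics in real time.
    • The figure below shows how the metrics change when the number of data nodes in the cluster decreases.
      chadoop-vpc-monitoring3_ko
  • You can hover the mouse cursor over a chart to zoom in or out, or specify a query period to view the metrics for that period on the dashboard.
    chadoop-vpc-monitoring4_ko
  • Click the chadoop-vpc-monitoring-icon_en icon to print a chart or download it in various file formats. Select the format in which you want to download the data.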
    chadoop-vpc-monitoring5_ko
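
The metric table defines yarn_memory_available_percentage as available_mb / total_mb. A minimal Python sketch of that relationship follows, assuming the ratio is multiplied by 100 to express it as a percentage. The function name and the sample memory values are illustrative and not taken from the console.

```python
def yarn_memory_available_percentage(available_mb: float, total_mb: float) -> float:
    """Return available YARN memory as a percentage of total memory.

    Assumes the ratio available_mb / total_mb is scaled by 100.
    """
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return available_mb / total_mb * 100.0


# Hypothetical sample: 24 GB available out of 96 GB total.
pct = yarn_memory_available_percentage(available_mb=24576, total_mb=98304)
print(f"{pct:.1f}%")
```

A low value here alongside a growing pending_containers count would suggest the cluster is short of memory, although the guide itself does not prescribe specific thresholds.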

OS Dashboard

chadoop-vpc-monitoring6_ko

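  • On the Monitoring page, select a server under the cluster rather than the cluster name to view the OS Dashboard.
    • Data on the OS Dashboard is collected at one-minute intervals.
    • Monitoring values are shown as averages, and the aggregation interval varies with the type of period selected.
  • You can view the master, edge, and data nodes that make up the Cloud Hadoop cluster, along with metrics for each node such as CPU Usage, LoadAverage, Memory, Disk I/O, Disk usage, and Network I/O.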
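
Both dashboards collect data every minute and display averages over an interval that depends on the selected period. The following Python sketch shows one way such averaging could work. The guide does not state the intervals the console actually uses, so the `interval_minutes` parameter, the function name, and the sample data are assumptions.

```python
from collections import defaultdict
from statistics import mean


def average_by_interval(samples, interval_minutes):
    """Average per-minute samples into fixed-width buckets.

    samples: iterable of (minute_index, value) pairs.
    Returns a list of (bucket_start_minute, average_value), sorted by time.
    """
    buckets = defaultdict(list)
    for minute, value in samples:
        # Integer division assigns each minute to its bucket.
        buckets[minute // interval_minutes].append(value)
    return [
        (bucket * interval_minutes, mean(values))
        for bucket, values in sorted(buckets.items())
    ]


# Hypothetical CPU usage samples, one per minute.
cpu = [(0, 10.0), (1, 20.0), (2, 30.0), (3, 40.0), (4, 50.0), (5, 60.0)]
averaged = average_by_interval(cpu, 3)
print(averaged)
```

Because values are averaged, short spikes within an interval are smoothed out when a longer period is selected.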
