Data Forest prerequisites

Available in VPC

This guide covers what you need to know before using Data Forest, including its components, application types, and pricing.

Data Forest components

Data Forest consists of components for storing, analyzing, and visualizing data. You can create and use the components that suit each purpose.

Purpose of use Component
Data storage
Data access and processing
Data management
Data visualization

Data Forest application types

Applications that you can use in Data Forest are as follows:

Application Description
DEV-1.0.0
  • Plays the role of a client for all services provided in Data Forest.
  • Runs HDFS commands or submits Spark Jobs.
  • Builds client environment for HBase and Kafka.
ELASTICSEARCH-7.3.2
  • Provides Elasticsearch clusters.
  • Provides OSS version.
GRAFANA-7.5.10
  • Provides Grafana servers.
  • Can be integrated with OpenTSDB and used as a monitoring page.
HBASE-2.2.3
  • Provides Apache HBase clusters.
  • Kerberos authentication is not applied.
HIVESERVER2-LDAP-3.1.0
  • Provides Apache HiveServer2.
  • Provides authentication using the LDAP method.
  • Can be used to build streaming platforms.
HUE-4.7.0
  • Provides Hue servers.
  • Provides an interface where you can browse files, edit code, and submit jobs.
KAFKA-2.4.0
  • Provides Apache Kafka clusters.
  • Can be used to build streaming platforms.
KIBANA-7.3.2
  • Provides Kibana servers.
  • Provides OSS version.
  • Can be used as a visualization tool for Elasticsearch.
OPENTSDB-2.4.1
  • Provides OpenTSDB servers.
  • Can store time series data.
  • Uses HBase as storage.
PHOENIX-5.0.0
  • Provides Apache Phoenix servers.
  • Lets you run queries directly with the provided Phoenix CLI.
SPARK-HISTORYSERVER-3.1.2
  • Provides a personal Spark History Server.
  • Lets you view only the jobs that you ran.
TRINO-437
  • Provides Trino servers.
  • Lets you run queries directly with the provided Trino CLI.
ZEPPELIN-0.10.1
  • Provides Apache Zeppelin servers.
  • Provides an interface that enables code editing.
ZOOKEEPER-3.4.13
  • Provides Apache Zookeeper ensembles.
  • Required for running and using the HBase and Kafka apps.

Inter-application dependency information

The following describes the inter-application dependencies in Data Forest:

  • Direction of inter-application dependencies
    In Data Forest, some apps depend on other apps, which determines the order in which the apps should be created. Create apps in the order indicated by each dependency direction.

    • OpenTSDB relies on HBase, and HBase relies on Zookeeper. Therefore, the recommended order for app creation is Zookeeper > HBase > OpenTSDB.
    • Because Kafka depends on Zookeeper, it is crucial to create apps in the order of Zookeeper > Kafka.
    • Likewise, create apps in the order of Elasticsearch > Kibana.
  • Direction of integration between apps
    Some apps can be integrated with each other, but integration does not impose any restrictions on app creation.
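
The creation order described above can be derived mechanically from the dependency directions. The following is a minimal sketch, not part of Data Forest itself: the DEPENDENCIES table and the creation_order helper are hypothetical names introduced for illustration, and the sketch assumes Python 3.9 or later for the standard graphlib module. It encodes only the three dependency chains listed in this section and treats every other app as having no prerequisites.

```python
from graphlib import TopologicalSorter

# Hypothetical table: each app maps to the apps it depends on,
# taken from the dependency directions listed in this section.
DEPENDENCIES = {
    "OPENTSDB": {"HBASE"},
    "HBASE": {"ZOOKEEPER"},
    "KAFKA": {"ZOOKEEPER"},
    "KIBANA": {"ELASTICSEARCH"},
}


def creation_order(targets):
    """Return the apps to create, prerequisites first, for the given targets."""
    needed = {}
    stack = list(targets)
    # Collect each target together with its direct and indirect prerequisites.
    while stack:
        app = stack.pop()
        if app in needed:
            continue
        needed[app] = DEPENDENCIES.get(app, set())
        stack.extend(needed[app])
    # static_order() yields every app after all of the apps it depends on.
    return list(TopologicalSorter(needed).static_order())


print(creation_order(["OPENTSDB"]))  # ['ZOOKEEPER', 'HBASE', 'OPENTSDB']
```

Apps that only integrate with each other, such as Grafana and OpenTSDB, are deliberately left out of the table, because integration does not constrain the order of creation.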

Application version information

The version information for applications provided by Data Forest is as follows:

Application Version
DEV 1.0.0
ELASTICSEARCH 7.3.2
GRAFANA 7.5.10
HBASE 2.2.3
HIVESERVER2-LDAP 3.1.0
HUE 4.7.0
KAFKA 2.4.0
KIBANA 7.3.2
OPENTSDB 2.4.1
PHOENIX 5.0.0
SPARK-HISTORYSERVER 3.1.2
TRINO 437
ZEPPELIN 0.10.1
ZOOKEEPER 3.4.13
Note

Supported app versions may change depending on inter-application dependencies or integration availability.
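
The application type identifiers used in this guide, such as HBASE-2.2.3, combine the application name and version shown in the table above. As an illustrative sketch only, the hypothetical helper below (split_app_type is not a Data Forest API) separates the two parts. It assumes the version is always the text after the last hyphen, which holds for every identifier listed in this guide, including names that contain hyphens themselves.

```python
def split_app_type(app_type):
    """Split a 'NAME-VERSION' identifier at its last hyphen.

    The name may contain hyphens (for example SPARK-HISTORYSERVER),
    so only the final hyphen is treated as the separator.
    """
    name, _, version = app_type.rpartition("-")
    if not name or not version:
        raise ValueError(f"expected NAME-VERSION, got {app_type!r}")
    return name, version


print(split_app_type("SPARK-HISTORYSERVER-3.1.2"))  # ('SPARK-HISTORYSERVER', '3.1.2')
print(split_app_type("TRINO-437"))                  # ('TRINO', '437')
```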

Pricing information

For more information on the usage fees of Data Forest, see the pricing information in Portal > Services > Big Data & Analytics > Data Forest.