Prerequisites for using Data Forest
Available in VPC
This document describes the prerequisites and pricing information you need to know to use Data Forest smoothly.
Data Forest components
Data Forest consists of components for storing, analyzing, and visualizing data. Users can create and use the components suited to each purpose.
Purpose | Component |
---|---|
Data storage | - HDFS - HBase - OpenTSDB |
Data access and processing | - Hive - Spark - Phoenix - Elasticsearch - Kafka |
Data management | - Oozie - Zookeeper |
Data visualization | - Kibana - Zeppelin - Grafana - Hue |
Data Forest application types
Applications that you can use in Data Forest are as follows:
Application | Description |
---|---|
DEV-1.0.0 | - Plays the role of a client for all services provided in Data Forest - Runs HDFS commands or submits Spark Jobs - Builds client environment for HBase and Kafka |
ELASTICSEARCH-7.3.2 | - Creates Elasticsearch cluster - Provides OSS version |
GRAFANA-7.5.10 | - Provides Grafana servers - Can be integrated with OpenTSDB and used as a monitoring page |
HBASE-2.0.0 | - Provides Apache HBase clusters - Kerberos authentication applied |
HBASE-2.2.3 | - Provides Apache HBase clusters - Kerberos authentication not applied |
HIVESERVER2-LDAP-3.1.0 | - Provides Apache HiveServer2 - Provides authentication using the LDAP method |
HUE-4.7.0 | - Apache Hue server - Provides an interface where you can browse files, edit code, and submit jobs |
KAFKA-2.4.0 | - Provides Apache Kafka clusters - Can be used to build streaming platforms |
KIBANA-7.3.2 | - Provides Kibana servers - Provides OSS version - Can be used as a visualization tool for Elasticsearch |
OPENTSDB-2.4.1 | - Provides OpenTSDB servers - Stores time series data - Uses HBase as storage |
PHOENIX-5.0.0 | - Provides Apache Phoenix servers - Queries can be run directly with the provided Phoenix CLI |
SPARK-HISTORYSERVER-3.1.2 | - Provides Spark History Server - Lets you view only the jobs you executed |
TRINO-367 | - Provides Trino servers - Queries can be run directly with the provided Trino CLI |
ZEPPELIN-0.10.1 | - Provides Apache Zeppelin servers - Provides an interface that enables code editing |
ZOOKEEPER-3.4.13 | - Provides Apache Zookeeper ensembles - Required for running and using the HBase and Kafka apps |
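As an illustration of how an app such as OPENTSDB above is typically used, the sketch below builds one data point in the JSON shape that OpenTSDB's REST `/api/put` endpoint accepts. The host name, metric, and tag names are assumptions made for this example, not values provided by Data Forest:

```python
import json
import time

def build_datapoint(metric, value, tags, timestamp=None):
    """Build one data point in the JSON shape used by OpenTSDB's /api/put."""
    return {
        "metric": metric,
        "timestamp": int(timestamp if timestamp is not None else time.time()),
        "value": value,
        "tags": tags,  # OpenTSDB requires at least one tag per data point
    }

# Hypothetical metric and tags, for illustration only.
point = build_datapoint("sys.cpu.user", 42.5, {"host": "web01"}, timestamp=1700000000)
payload = json.dumps(point)
# POSTing `payload` to http://<your-opentsdb-app-host>:4242/api/put would store it,
# with HBase as the underlying storage (see the table above).
```

Because the endpoint and host depend on your own app, the example stops at building the payload rather than sending it.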
Applications that you can use in Notebooks are as follows:
Application | Description |
---|---|
JUPYTERLAB | - Provides JupyterLab, a web interface based on Jupyter Notebook - Provides Object Storage integration and runs code for data analysis |
Inter-application dependency information
The following describes inter-application dependency in Data Forest.
Direction of inter-application dependencies
In Data Forest, some apps depend on each other, and this affects the order in which the apps should be created. Create apps according to each app's dependency direction.
- OpenTSDB relies on HBase, and HBase relies on Zookeeper. Therefore, create apps in the order Zookeeper > HBase > OpenTSDB.
- Since Kafka depends on Zookeeper, create apps in the order Zookeeper > Kafka.
- Likewise, create apps in the order Elasticsearch > Kibana.
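A valid creation order can be derived mechanically from these dependency pairs. The sketch below is a plain Python illustration (not part of Data Forest) that topologically sorts the dependencies listed above:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each app maps to the set of apps it depends on,
# mirroring the dependency directions listed above.
dependencies = {
    "OpenTSDB": {"HBase"},
    "HBase": {"Zookeeper"},
    "Kafka": {"Zookeeper"},
    "Kibana": {"Elasticsearch"},
}

# static_order() yields every app only after all of its dependencies,
# so the resulting list is a valid app creation order.
creation_order = list(TopologicalSorter(dependencies).static_order())
print(creation_order)
```

In any order it produces, Zookeeper precedes HBase and Kafka, HBase precedes OpenTSDB, and Elasticsearch precedes Kibana.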
Direction of integration between apps
Some apps can be integrated with each other, but such integration does not impose any restrictions on app creation.
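For example, the GRAFANA app can use an OPENTSDB app as a monitoring data source, as noted in the application table above. A minimal Grafana data-source provisioning sketch might look like the following; the data-source name and URL are assumptions for illustration, and in practice you would point it at your own OpenTSDB app's endpoint:

```yaml
apiVersion: 1
datasources:
  - name: DataForest-OpenTSDB      # hypothetical name
    type: opentsdb
    access: proxy
    url: http://opentsdb-app-host:4242   # assumed OpenTSDB endpoint
    jsonData:
      tsdbVersion: 3               # Grafana's setting for OpenTSDB 2.3+
```

Since integration imposes no creation-order restriction, either app can be created first and connected later.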
Application version information
The version information for applications provided by Data Forest is as follows:
Application | Version |
---|---|
DEV | 1.0.0 |
ELASTICSEARCH | 7.3.2 |
GRAFANA | 7.5.10 |
HBASE | 2.0.0, 2.2.3 |
HIVESERVER2-LDAP | 3.1.0 |
HUE | 4.7.0 |
KAFKA | 2.4.0 |
KIBANA | 7.3.2 |
OPENTSDB | 2.4.1 |
PHOENIX | 5.0.0 |
SPARK-HISTORYSERVER | 3.1.2 |
TRINO | 367 |
ZEPPELIN | 0.10.1 |
ZOOKEEPER | 3.4.13 |
The version information for applications provided by Notebooks is as follows:
Application | Version |
---|---|
JUPYTERLAB | 3.6.3 |
Supported app versions may change depending on inter-app dependencies or integration availability.
Server specifications for Notebooks
CPU | Memory | Disk (HDD) |
---|---|---|
4 vCPUs | 16GB | 50GB |
4 vCPUs | 32GB | 50GB |
8 vCPUs | 16GB | 50GB |
8 vCPUs | 32GB | 50GB |
8 vCPUs | 64GB | 50GB |
Usage fees
For specific information regarding Data Forest usage fees, see the Portal > Service > Analytics > Data Forest menu or the Portal > Pricing menu.