Prerequisites for using Data Forest

    Available in VPC

    This document describes what you need to know before using Data Forest, including where to find pricing information.

    Data Forest components

    Data Forest consists of components for storing, analyzing, and visualizing data. You can create and use the components suited to each purpose.
    [Figure: overview of Data Forest apps]

    Purpose                      Component
    Data storage                 HDFS, HBase, OpenTSDB
    Data access and processing   Hive, Spark, Phoenix, Elasticsearch, Kafka
    Data management              Oozie, Zookeeper
    Data visualization           Kibana, Zeppelin, Grafana, Hue

    Data Forest application types

    Applications that you can use in Data Forest are as follows:

    Application                 Description
    DEV-1.0.0                   - Acts as a client for all services provided in Data Forest
                                - Runs HDFS commands or submits Spark jobs
                                - Builds the client environment for HBase and Kafka
    ELASTICSEARCH-7.3.2         - Creates Elasticsearch clusters
                                - Provides the OSS version
    GRAFANA-7.5.10              - Provides Grafana servers
                                - Can be integrated with OpenTSDB and used as a monitoring page
    HBASE-2.0.0                 - Provides Apache HBase clusters
                                - Kerberos authentication applied
    HBASE-2.2.3                 - Provides Apache HBase clusters
                                - Kerberos authentication not applied
    HIVESERVER2-LDAP-3.1.0      - Provides Apache HiveServer2
                                - Authentication provided using the LDAP method
    HUE-4.7.0                   - Provides Apache Hue servers
                                - Provides an interface where you can browse files, edit code, and submit jobs
    KAFKA-2.4.0                 - Provides Apache Kafka clusters
                                - Can be used to build streaming platforms
    KIBANA-7.3.2                - Provides Kibana servers
                                - Provides the OSS version
                                - Can be used as a visualization tool for Elasticsearch
    OPENTSDB-2.4.1              - Provides OpenTSDB servers
                                - Stores time series data
                                - Uses HBase as storage
    PHOENIX-5.0.0               - Provides Apache Phoenix servers
                                - Queries can be run directly with the provided Phoenix CLI
    SPARK-HISTORYSERVER-3.1.2   - Provides Spark History Server
                                - Lets you choose to view only the jobs you ran
    TRINO-367                   - Provides Trino servers
                                - Queries can be run directly with the provided Trino CLI
    ZEPPELIN-0.10.1             - Provides Apache Zeppelin servers
                                - Provides an interface that enables code editing
    ZOOKEEPER-3.4.13            - Provides Apache Zookeeper ensembles
                                - Required for running and using the HBase and Kafka apps
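The application types listed above follow a NAME-VERSION pattern, and some names contain hyphens themselves (for example HIVESERVER2-LDAP-3.1.0). The following is a minimal sketch of splitting such an identifier into name and version; `split_app_type` is a hypothetical helper written for illustration and is not part of Data Forest.

```python
def split_app_type(app_type):
    """Split 'NAME-VERSION' on the last hyphen, because names may contain hyphens."""
    name, _, version = app_type.rpartition("-")
    if not name:
        # rpartition returns an empty head when there is no hyphen at all.
        raise ValueError(f"no version suffix in {app_type!r}")
    return name, version


# Identifiers taken from the application table above.
for app_type in ("HBASE-2.2.3", "HIVESERVER2-LDAP-3.1.0", "TRINO-367"):
    name, version = split_app_type(app_type)
    print(f"{name}: {version}")
```

Splitting on the last hyphen rather than the first keeps multi-part names such as SPARK-HISTORYSERVER intact.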

    Applications that you can use in Notebooks are as follows:

    Application   Description
    JUPYTERLAB    - Provides JupyterLab, a web interface based on Jupyter Notebook
                  - Provides Object Storage integration and runs code for data analysis

    Inter-application dependency information

    The following describes inter-application dependency in Data Forest.

    [Figure: inter-application dependencies in Data Forest]

    • Direction of inter-application dependencies
      In Data Forest, some apps depend on others, and this determines the order in which the apps should be created. Create apps in the order indicated by each app's dependency direction.

      • OpenTSDB depends on HBase, and HBase depends on Zookeeper. Therefore, the recommended order for app creation is Zookeeper > HBase > OpenTSDB.
      • Kafka depends on Zookeeper, so create the apps in the order of Zookeeper > Kafka.
      • Likewise, create Elasticsearch and Kibana in the order of Elasticsearch > Kibana.
    • Direction of integration between apps
      Some apps can be integrated with each other, but integration does not place any restrictions on app creation.
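The creation orders described above can be derived mechanically from the dependency direction. The following is a minimal sketch, assuming Python 3.9 or later and its standard `graphlib` module. The `DEPENDS_ON` mapping and the `creation_order` helper are illustrative names, not part of Data Forest, and the mapping encodes only the dependencies stated in this section.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping: app -> apps that must be created before it.
# Only the dependencies named in this document are included.
DEPENDS_ON = {
    "OPENTSDB": {"HBASE"},
    "HBASE": {"ZOOKEEPER"},
    "KAFKA": {"ZOOKEEPER"},
    "KIBANA": {"ELASTICSEARCH"},
}


def creation_order(wanted):
    """Return the wanted apps plus their prerequisites, prerequisites first."""
    graph = {}
    stack = list(wanted)
    while stack:
        app = stack.pop()
        if app in graph:
            continue
        # Apps without an entry in DEPENDS_ON have no prerequisites.
        graph[app] = DEPENDS_ON.get(app, set())
        stack.extend(graph[app])
    # TopologicalSorter takes node -> predecessors and emits predecessors first.
    return list(TopologicalSorter(graph).static_order())


print(" > ".join(creation_order(["OPENTSDB", "KAFKA"])))
```

The relative position of apps that do not depend on each other (here HBASE and KAFKA) is not fixed; only the prerequisite-before-dependent ordering is guaranteed.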

    Application version information

    The version information for applications provided by Data Forest is as follows:

    Application           Version
    DEV                   1.0.0
    ELASTICSEARCH         7.3.2
    GRAFANA               7.5.10
    HBASE                 2.0.0, 2.2.3
    HIVESERVER2-LDAP      3.1.0
    HUE                   4.7.0
    KAFKA                 2.4.0
    KIBANA                7.3.2
    OPENTSDB              2.4.1
    PHOENIX               5.0.0
    SPARK-HISTORYSERVER   3.1.2
    TRINO                 367
    ZEPPELIN              0.10.1
    ZOOKEEPER             3.4.13

    The version information for applications provided by Notebooks is as follows:

    Application   Version
    JUPYTERLAB    3.6.3

    Note

    Supported app versions may change depending on inter-app dependencies or integration availability.

    Server specifications for Notebooks

    CPU       Memory   Disk (HDD)
    4 vCPUs   16 GB    50 GB
    4 vCPUs   32 GB    50 GB
    8 vCPUs   16 GB    50 GB
    8 vCPUs   32 GB    50 GB
    8 vCPUs   64 GB    50 GB
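When choosing among the Notebooks server specifications above, it can help to pick the smallest one that meets a workload's minimum requirements. The following is a minimal sketch of that selection; the `Spec` type, the `NOTEBOOK_SPECS` list, and the `smallest_fit` helper are hypothetical names introduced for illustration, and "smallest" is assumed here to mean fewest vCPUs first, then least memory.

```python
from typing import NamedTuple


class Spec(NamedTuple):
    vcpus: int
    memory_gb: int
    disk_gb: int


# Specifications copied from the table above.
NOTEBOOK_SPECS = [
    Spec(4, 16, 50),
    Spec(4, 32, 50),
    Spec(8, 16, 50),
    Spec(8, 32, 50),
    Spec(8, 64, 50),
]


def smallest_fit(min_vcpus, min_memory_gb):
    """Return the smallest spec meeting both minimums, or None if none does."""
    candidates = [
        spec
        for spec in NOTEBOOK_SPECS
        if spec.vcpus >= min_vcpus and spec.memory_gb >= min_memory_gb
    ]
    # NamedTuples compare field by field: vCPUs first, then memory, then disk.
    return min(candidates, default=None)


print(smallest_fit(4, 24))
```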

    Usage fees

    For specific information regarding Data Forest usage fees, see the Portal > Service > Analytics > Data Forest menu or the Portal > Pricing menu.

