Secure Hadoop configuration (optional)
    Available in VPC

    Kerberos integrates with the Hadoop cluster to provide strong authentication for users and services.

    This guide describes how to configure the authentication system installed in Cloud Hadoop to secure the cluster.

    Note

    Before configuring Secure Hadoop, check whether the cluster is integrated with the Data Catalog service. If the cluster is integrated with external Hive Metastore through Data Catalog, some services may not function properly when Kerberos is applied to the cluster.

    Configuration

    Cluster administrators can manage users and groups, as well as user authentication and permissions, through Kerberos, allowing for detailed authentication configuration in Cloud Hadoop.

    Multi-Master configuration

    • To maintain service continuity, LDAP and Kerberos services are configured for redundancy by default and are installed on two Cloud Hadoop master nodes.
    • On the master nodes, the slapd, krb5kdc, and kadmin daemons run to provide authentication services.
    Master 1                       Master 2
    ---------------------------    ---------------------------
    LDAP (slapd)                   LDAP (slapd)
    Kerberos (krb5kdc / kadmin)    Kerberos (krb5kdc / kadmin)

    Authentication workflow

    Cloud Hadoop is designed to authenticate through Kerberos. The cluster is configured with Kerberos and LDAP authentication systems, and both users and services must be authenticated by the system.


    Every Hadoop service on every node has a Kerberos principal for authentication. Each service stores a keytab file on the server, which contains a randomized password for that principal. Users typically obtain a Kerberos ticket with the kinit command before interacting with a service.

    Kerberos principal

    In Kerberos, users are referred to as principals. A Hadoop deployment consists of user principals and service principals. User principals are usually synchronized with the KDC (Key Distribution Center). One user principal represents one real user. Service principals vary by server and service, meaning each service on each server has its own unique principal.
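    A principal follows the primary/instance@REALM naming convention. As a quick sketch (the names below are hypothetical, not from a real cluster), an HDFS NameNode service principal can be split into its three parts with plain shell:

```shell
# Hypothetical service principal (illustrative names only)
principal="nn/master1.example.com@EXAMPLE.COM"

# Split into primary (service), instance (host FQDN), and realm
primary=${principal%%/*}             # "nn"
instance_and_realm=${principal#*/}
instance=${instance_and_realm%@*}    # "master1.example.com"
realm=${principal##*@}               # "EXAMPLE.COM"

echo "primary=$primary instance=$instance realm=$realm"
# -> primary=nn instance=master1.example.com realm=EXAMPLE.COM
```

    A user principal usually has no instance part (e.g., admin@EXAMPLE.COM), while service principals always carry the host FQDN as the instance.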

    Keytab files

    Keytab files contain Kerberos principals and their keys. They allow users and services to authenticate to Hadoop services without interactive tools or passwords. Hadoop generates a service principal for each service on each node, and these principals are stored in a keytab file on that node.
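    For example, a service keytab can be inspected with klist -kt. The sketch below works on a sample listing (the path and principal are illustrative assumptions, not values from a real cluster):

```shell
# Sample output of: klist -kt /etc/security/keytabs/nn.service.keytab
# (illustrative path and principal; actual values vary by cluster)
sample_klist='KVNO Timestamp           Principal
   2 01/01/2024 00:00:00 nn/master1.example.com@EXAMPLE.COM'

# A service authenticates non-interactively with its keytab, e.g.:
#   kinit -kt /etc/security/keytabs/nn.service.keytab nn/master1.example.com@EXAMPLE.COM

# Extract the principal name from the sample listing (skip the header line)
echo "$sample_klist" | awk 'NR > 1 { print $NF }'
# -> nn/master1.example.com@EXAMPLE.COM
```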

    Preparations for Kerberize

    • Ensure that the ambari-agent is running on all nodes within the cluster, including the ambari-server.
    • All nodes managed by Ambari (except for the two master servers) should have the krb5-workstation package installed. In the 2. Configure Kerberos step, under Advanced kerberos-env, make sure to uncheck "Install OS-specific Kerberos client package(s)" before proceeding with Kerberize.
    • Kerberize requires a complete shutdown of the cluster, so it is recommended to perform it before the cluster goes into operation.

    Ambari Kerberize configuration

    Start Kerberos configuration

    1. Access the Ambari UI and click Cluster Admin > Kerberos in the bottom-left corner.
    2. Click the [ENABLE KERBEROS] button.
    3. After reviewing the content in the warning pop-up window, click the [PROCEED ANYWAY] button.

    Kerberos configuration wizard

    1. Get Started

    For What type of KDC do you plan on using?, select Existing MIT KDC. Then select all three checkboxes below and click the [Next] button.

    2. Configure Kerberos

    1. Configure the following items and click the [Next] button.

    KDC

    • KDC hosts: enter the host names (FQDNs) of the two master nodes where the KDC is installed, separated by a comma (,).
    • Realm name: enter the Realm set during the Cloud Hadoop installation.
    • Click the Test KDC Connection button to test the connectivity.

    Kadmin

    • Kadmin host: you only have to enter the host name (FQDN) of one master node. If you are unsure which master node's FQDN to enter, run kadmin -p admin/admin -q "listprincs" on a master node and use the FQDN from the kadmin/FQDN@REALM entry in the output.
    • Admin principal: enter admin/admin.
    • Select the Save Admin Credentials checkbox.
    Caution

    It is crucial to select the Save Admin Credentials checkbox. If it is not selected, there may be limitations when using the Cloud Hadoop service.
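    The listprincs lookup for the Kadmin host can be sketched as follows; the principal listing below is sample data with a hypothetical realm, standing in for the output of kadmin -p admin/admin -q "listprincs" on a real master node:

```shell
# Sample "listprincs" output (hypothetical realm and hosts)
sample_princs='K/M@EXAMPLE.COM
admin/admin@EXAMPLE.COM
kadmin/admin@EXAMPLE.COM
kadmin/changepw@EXAMPLE.COM
kadmin/master1.example.com@EXAMPLE.COM
krbtgt/EXAMPLE.COM@EXAMPLE.COM'

# The Kadmin host is the FQDN in the kadmin/<FQDN>@<REALM> entry.
# Keep only kadmin entries whose instance looks like an FQDN (contains a dot),
# which skips the built-in kadmin/admin and kadmin/changepw principals.
echo "$sample_princs" | grep '^kadmin/[^@]*\.[^@]*@' | sed 's|^kadmin/\(.*\)@.*$|\1|'
# -> master1.example.com
```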

    Advanced kerberos-env

    • Uncheck the Install OS-specific Kerberos client package(s) checkbox, then proceed with Kerberize.
    • Change the Encryption Types to aes256-cts aes128-cts.
    • Add +requires_preauth to Principal Attributes.
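    For reference, these two settings correspond roughly to the following MIT Kerberos server-side configuration; the realm name and file contents are an illustrative sketch, not the exact files on the master nodes:

```
# kdc.conf (illustrative fragment; realm name is hypothetical)
[realms]
    EXAMPLE.COM = {
        supported_enctypes = aes256-cts:normal aes128-cts:normal
    }

# With +requires_preauth, principals are created requiring preauthentication, e.g.:
#   kadmin -q "addprinc -randkey +requires_preauth nn/master1.example.com@EXAMPLE.COM"
```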

    Advanced krb5.conf

    • If Kerberos was set to be used when the Cloud Hadoop cluster was created, you must uncheck the Manage Kerberos client krb5.conf checkbox, then click the [Next] button.
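    When this checkbox is unchecked, Ambari leaves the existing /etc/krb5.conf on each node untouched. An illustrative fragment of such a file (hypothetical realm and master-node FQDNs) looks like this:

```
# /etc/krb5.conf (illustrative fragment)
[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        # Two KDC entries matching the redundant two-master setup
        kdc = master1.example.com
        kdc = master2.example.com
        admin_server = master1.example.com
    }
```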

    3. Install and Test Kerberos Client

    Once the Kerberos configuration task is completed, Install Kerberos Client and Test Kerberos Client start automatically.
    Installation is complete when the message Kerberos service has been installed and tested successfully appears on the screen. Then click the [Next] button.


    Note

    In case of an Admin session expiration error, enter the following and click the [SAVE] button.

    • Admin principal: admin/admin
    • Admin password: the KDC admin account password configured during cluster creation
    • Select the Save Admin Credentials checkbox

    If you enter as above and the error persists, verify if there has been a change to the KDC admin account password.

    4. Configure Identities

    This step configures the principals and keytab locations for service users and Hadoop services.
    Check the list of settings that are automatically added by the Ambari Wizard and click the [Next] button.

    5. Confirm Configuration

    After checking the configuration, click the [Next] button.

    6. Stop Services

    Once the configuration information is verified, the cluster shutdown process will automatically begin. Click the [Next] button once the shutdown is complete.

    7. Kerberize Cluster

    The process consists of 7 sequential steps. Click the [Next] button upon completion.

    8. Start and Test Services

    This step involves starting and verifying the Hadoop service. Click the [Complete] button upon completion.

    9. Check Admin - Kerberos Enabled status

    Once the message Kerberos security is enabled is displayed on the screen, the cluster has successfully completed the Kerberize task.

    Check Kerberize application

    To check if Kerberize has been applied, verify the Hadoop service principal and test by executing hadoop fs commands.
    The following example assumes that Kerberos information was set to be used during the creation of Cloud Hadoop. (e.g., Realm: NAVERCORP.COM)

    1. After completing the Ambari Kerberize configuration, run the kadmin -p admin/admin -q "listprincs" command.

      • You can see that the Hadoop service principals have been created, and that running the hadoop fs command without a Kerberos ticket results in an error.


      Note

      If Kerberize has not been applied in Ambari, executing the kadmin -p admin/admin -q "listprincs" command on the master node shows that no Hadoop service principals have been created. You can also execute the hadoop fs command with the default sshuser account and see the results without any permission check.


    2. Obtain the admin account ticket using the kinit command and execute the hadoop fs command again.

      • The result values should be displayed correctly.
      • Deleting the ticket with the kdestroy command and re-executing the hadoop fs command will confirm that an error occurs.

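    Taken together, the verification steps above amount to the following ticket lifecycle. It is shown as an illustrative transcript using the example realm, since the commands require a running Kerberized cluster:

```shell
# Illustrative transcript only; these commands need a live Kerberized cluster.
#
#   $ hadoop fs -ls /                     # fails: no valid Kerberos ticket
#   $ kinit admin/admin@NAVERCORP.COM     # obtain an admin ticket
#   Password for admin/admin@NAVERCORP.COM:
#   $ hadoop fs -ls /                     # succeeds
#   $ kdestroy                            # discard the ticket
#   $ hadoop fs -ls /                     # fails again
```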

