Create and manage account


Available in VPC

This guide describes how to create and manage a Data Forest account and app.

Create account

To create a Data Forest account:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to i_menu > Services > Big Data & Analytics > Data Forest.
  2. Click Accounts > [Create account].
  3. Enter the account name and account password.
    • Account name: Enter the account name required for app submission. (Use a combination of English letters and numbers, 2 to 16 characters long. The account name must be unique within the cluster; click [Check for duplicates] to verify.)
    • Account password: Enter the password used for account login. (It must be 8 to 20 characters long and contain at least 1 English letter (uppercase or lowercase), 1 special character, and 1 number.)
  4. Click [Create].
  5. Check the account list to see if the account has been created.
    • It takes about 2 to 3 minutes to create an account.
    • Once the account has been created successfully, the account status changes to Running.
  6. Check if the account status is Running.
Note

In Data Forest, only 1 account can be created and used per user.
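As a quick local sanity check before submitting the form, the password rules above can be expressed as a short shell function. This is an illustrative sketch, not an official NAVER Cloud tool, and valid_password is a hypothetical helper name:

```shell
# Illustrative check of the account password rules:
# 8-20 characters, with at least 1 English letter, 1 number,
# and 1 special character. Not an official NAVER Cloud tool.
valid_password() {
  pw=$1
  [ ${#pw} -ge 8 ] && [ ${#pw} -le 20 ] || return 1
  case $pw in *[A-Za-z]*) ;; *) return 1 ;; esac      # at least one letter
  case $pw in *[0-9]*)    ;; *) return 1 ;; esac      # at least one number
  case $pw in *[!A-Za-z0-9]*) ;; *) return 1 ;; esac  # at least one special character
}
valid_password '!@Qwert12' && echo valid
```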

Account management

Download keytab

To access the cluster with the account, a keytab file is required. To download the keytab file:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to i_menu > Services > Big Data & Analytics > Data Forest > Accounts.
  2. Select an account, and then click Cluster access information > Download Kerberos keytab.
  3. When the download window appears, click [Download].

Change HDFS quota

You can change the file count quota and storage capacity quota for each HDFS namespace of the cluster used by the account. To change the HDFS quota, follow these steps:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to i_menu > Services > Big Data & Analytics > Data Forest > Accounts.
  2. Select the account whose HDFS Quota you want to change and click Change account settings > Change HDFS Quota.
  3. When the Change HDFS Quota window appears, change the information, and then click [Change].
    • Namespace: Select a namespace.
    • Number of files to change: Select from 1 million to 5 million files, in increments of 1 million. (Default: 1 million)
    • File capacity to change: Select from 200 TB to 500 TB, in increments of 100 TB. (Default: 200 TB)
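The current quota usage can be checked with the standard hdfs dfs -count -q command, run from a node with a configured Hadoop client (/user/example below is a placeholder home directory). Its first four columns are the file-count quota, remaining file quota, space quota, and remaining space quota. Since the real command needs cluster access, the sketch below parses a canned sample line instead:

```shell
# hdfs dfs -count -q -h /user/example   <- real command, needs cluster access
# Below, a canned sample output line (default quotas: 1 million files,
# 200 TB = 219902325555200 bytes) is parsed to show file-count usage.
sample='1000000 999877 219902325555200 219901500000000 12 111 825555200 /user/example'
echo "$sample" | awk '{printf "files used: %d of %d\n", $1 - $2, $1}'
```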

Initialize Kerberos keytab

You can initialize the Kerberos keytab if you have lost the downloaded keytab or need to change it.

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to i_menu > Services > Big Data & Analytics > Data Forest > Accounts.
  2. Select the account to change the keytab for, and then click Change account settings > Initialize Kerberos keytab.
  3. When the Initialize Kerberos keytab window appears, check the information, and then click [Initialize].
Caution

Initializing the Kerberos keytab causes all applications and batch jobs that use the existing keytab to fail, so proceed with caution.

Initialize account password

You can reset the account password if you have lost it or need to change it.
To initialize the account password, follow these steps:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to i_menu > Services > Big Data & Analytics > Data Forest > Accounts.
  2. Select the account to initialize the account password for, and then click Change account settings > Initialize account password.
  3. When the Initialize account password window appears, enter the new password, and then click [Change].

Delete account

You can delete accounts that are not in use. To delete an account, follow these steps:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to i_menu > Services > Big Data & Analytics > Data Forest > Accounts.
  2. Select the account to delete, and then click [Delete].
  3. Enter the name of the account to delete in the Delete account window, and then click [Delete].
Caution
  • The account can't be deleted if there are existing apps on the account. Delete the apps before deleting the account.
  • Deleting an account permanently deletes all data and files stored in the account’s HDFS. This action cannot be undone.

Authenticate account

There are 2 methods for user authentication in Data Forest. This section describes how to use account name and password authentication, as well as Kerberos principal and keytab authentication.

Use account name and password

Web UI - SSO

Data Forest uses the kr.df.naverncp.com domain. Because login information is maintained through cookies, you do not need to log in again when accessing other hosts in the same domain. For example, if you have logged in to koya-nn1.kr.df.naverncp.com, you can also access rm1.kr.df.naverncp.com directly. SSO works based on HTTP cookies, so it does not apply across different domains.

To log in on the web UI:

  1. Access the web server that supports web SSO.
  2. When the login window appears, enter the account name and password you specified when creating the Data Forest account in the Username and Password fields, and then click [LOGIN].
Note

If the account name and password do not match, a popup appears requesting the ID and password. Do not enter any information in the popup. Click [Cancel], then try logging in again from the previous login UI.

The authentication process uses a session cookie. The login session is maintained for up to 10 hours; even if you keep the browser open, you will be asked to log in again after 10 hours. There is no separate logout feature. To log out, close the browser completely or delete the cookies hadoop.auth and hadoop-jwt for the kr.df.naverncp.com domain.

To log out of the web UI:

  1. If you are using the Chrome browser, click [More tools] > [Developer tools].
  2. Go to the [Application] tab, and then click [Storage] > [Cookies].
  3. Click the cookies in the kr.df.naverncp.com domain.
  4. Right-click hadoop.auth and hadoop-jwt, and then click [Delete]. The cookies are now deleted.

HTTP API - Basic

For HTTP APIs that use Basic authentication, specify username:password with the -u option.

$ curl -s -u example  "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1?op=GETHOMEDIRECTORY" | python -m json.tool
Enter host password for user 'example':
{
    "Path": "/user/example"
}

Alternatively, you can send the Authorization: Basic $ENCODED_STRING header to authenticate without the -u option. ENCODED_STRING is the Base64-encoded value of username:password.

$ curl -s -H "Authorization: Basic ZXhhbXBsZTohQFF3ZXJ0MTI=" "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1?op=GETHOMEDIRECTORY" | python -m json.tool
{
    "Path": "/user/example"
}
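The ENCODED_STRING can be generated with the standard base64 tool. As a sketch, the token in the example above corresponds to the sample credential example:!@Qwert12 (a placeholder, not real credentials):

```shell
# Base64-encode username:password for the Authorization: Basic header.
# "example:!@Qwert12" is a placeholder credential.
printf '%s' 'example:!@Qwert12' | base64
# → ZXhhbXBsZTohQFF3ZXJ0MTI=
```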

Use Kerberos principal and keytab

A Kerberos principal is a unique identity in a Kerberos system. The keytab file contains the encrypted key for the symmetric-key algorithm corresponding to the principal, and acts as a password. Kerberos authentication requires additional configuration on the host you are going to use, and the configuration method differs by operating system.

Caution

The Kerberos keytab file must be managed so that no one other than the personnel in charge can access it. If the keytab file is leaked, a third party with the file can act with the permissions of the principal specified in the keytab. Take caution not to upload keytab files to a public server or to source storage that multiple people can access.
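One simple safeguard is to restrict file permissions so that only the owner can read the keytab. A minimal sketch, using the sample keytab name that appears in the examples below:

```shell
# Restrict the keytab so only the owning user can read or write it.
touch df.example.keytab      # stand-in for the downloaded keytab file
chmod 600 df.example.keytab
ls -l df.example.keytab      # permissions column shows -rw-------
```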

Authenticate keytab and check details

You can use the kinit command to authenticate with the keytab file.

Note

For more information on how to upload keys to HDFS, see Using Dev app > Kerberos authentication.

CentOS7

$ kinit example -kt df.example.keytab
$ klist -5
Ticket cache: FILE:/tmp/krb5cc_p46655
Default principal: example@KR.DF.NAVERNCP.COM

Valid starting       Expires              Service principal
04/05/2021 18:02:22  04/06/2021 18:02:22  krbtgt/KR.DF.NAVERNCP.COM@KR.DF.NAVERNCP.COM
        renew until 04/12/2021 18:02:22

Use the klist command to view the keytab file's content.

$ klist -kte df.example.keytab
Keytab name: FILE:df.example.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   4 12/28/2020 11:13:57 example@KR.DF.NAVERNCP.COM (aes256-cts-hmac-sha1-96)
   4 12/28/2020 11:13:57 example@KR.DF.NAVERNCP.COM (aes128-cts-hmac-sha1-96)
   4 12/28/2020 11:13:57 example@KR.DF.NAVERNCP.COM (des3-cbc-sha1)

macOS

$ kinit --keytab=df.example.keytab example
$ klist -5
Credentials cache: API:B94B9BE6-0510-4621-B6B9-E48F30488DAC
        Principal: example@KR.DF.NAVERNCP.COM

  Issued                Expires               Principal
Apr  5 18:11:06 2021  Apr  6 18:11:06 2021  krbtgt/KR.DF.NAVERNCP.COM@KR.DF.NAVERNCP.COM
$ ktutil --keytab=df.example.keytab list
df.example.keytab:

Vno  Type                     Principal                   Aliases
  4  aes256-cts-hmac-sha1-96  example@KR.DF.NAVERNCP.COM
  4  aes128-cts-hmac-sha1-96  example@KR.DF.NAVERNCP.COM
  4  des3-cbc-sha1            example@KR.DF.NAVERNCP.COM

HTTP API - SPNEGO

The HTTP API uses SPNEGO, which requires Kerberos authentication before the API is called. After authenticating with Kerberos, you can easily use SPNEGO with cURL.

Caution

wget does not support SPNEGO.

Run kinit first, and then use the -u : --negotiate option. For how to authenticate with kinit, see Authenticate keytab and check details.

$ kinit example -kt df.example.keytab
$ curl -u : --negotiate "https://sso.kr.df.naverncp.com/gateway/koya-auth-kerb/webhdfs/v1/user/example?op=GETHOMEDIRECTORY" | python -m json.tool
{
    "Path": "/user/example"
}

Alternatively, you can send the Authorization: Negotiate $ENCODED_STRING header to authenticate without the -u option. ENCODED_STRING is the Base64-encoded gssapi-data value.

$ curl -s -H "Authorization: Negotiate ZXhhbXBsZTohQFF3ZXJ0MTI=" "https://sso.kr.df.naverncp.com/gateway/koya-auth-kerb/webhdfs/v1/user/example?op=GETHOMEDIRECTORY" | python -m json.tool
{
    "Path": "/user/example"
}
Note

cURL may not support SPNEGO depending on the build. Run curl --version and make sure GSS-Negotiate or SPNEGO is included in Features.

$ curl --version
curl 7.29.0 (x86_64-redhat-linux-gnu) libcurl/7.29.0 NSS/3.44 zlib/1.2.7 libidn/1.28 libssh2/1.8.0
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS GSS-Negotiate IDN IPv6 Largefile NTLM NTLM_WB SSL libz unix-sockets

Check HTTP API authentication method

If it is difficult to determine which authentication method to use from the endpoint alone, you can send a request without any authentication information and check the WWW-Authenticate header of the response.

WWW-Authenticate    Authentication method
Basic               Basic Auth
Negotiate           SPNEGO

Basic Auth

$ curl -i "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1?op=GETHOMEDIRECTORY"
HTTP/1.1 401 Unauthorized
Date: Mon, 05 Apr 2021 09:39:53 GMT
Server: Jetty(9.4.12.v20180830)
WWW-Authenticate: BASIC realm="application"
Content-Length: 0
Set-Cookie: ROUTEID=.1; path=/

SPNEGO

$ curl -i "https://sso.kr.df.naverncp.com/gateway/koya-auth-kerb/webhdfs/v1/user/example?op=GETFILESTATUS"
HTTP/1.1 401 Authentication required
Date: Mon, 05 Apr 2021 09:39:04 GMT
Server: Jetty(9.4.12.v20180830)
WWW-Authenticate: Negotiate
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 391
Set-Cookie: hadoop.auth=; Path=gateway/koya-auth-kerb; Domain=sso.kr.df.naverncp.com; Secure; HttpOnly
Set-Cookie: ROUTEID=.2; path=/
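The check above can be scripted. A minimal sketch that extracts the scheme from response headers (auth_method is a hypothetical helper name; the curl line requires network access to the gateway):

```shell
# Extract the authentication scheme from HTTP response headers.
auth_method() {
  grep -i '^WWW-Authenticate:' | awk '{print $2}' | tr -d '\r'
}
# Live usage (requires network access):
#   curl -s -D - -o /dev/null "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1?op=GETHOMEDIRECTORY" | auth_method
# Canned demonstration with the SPNEGO response shown above:
printf 'HTTP/1.1 401 Authentication required\r\nWWW-Authenticate: Negotiate\r\n' | auth_method
# → Negotiate
```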