Available in VPC
You can create an Object Storage bucket and copy HDFS data to it by connecting the two storages.
Create Object Storage
To connect HDFS data, first create an Object Storage bucket.
From the NAVER Cloud Platform console, select Object Storage and create a bucket. For more information on creating Object Storage, see Object Storage overview.
Create API authentication key
To connect with Object Storage, first create an API authentication key.
- Create an API authentication key as described in API authentication key.
- Check the Access Key ID and Secret Key; they are used when connecting HDFS data to Object Storage.
Copy files to HDFS
After creating both the bucket and the API authentication key, configure the development environment using the CLI provided by Data Forest on the VM.
Once the development environment is set up, you can copy data with the cp command by specifying the Object Storage endpoint and authentication keys in the Hadoop command, as shown in the following example.
$ hadoop fs -Dfs.s3a.endpoint=http://kr.object.private.ncloudstorage.com -Dfs.s3a.access.key={ACCESS_KEY_ID} -Dfs.s3a.secret.key={SECRET_KEY} -Dfs.s3a.connection.ssl.enabled=false -cp hdfs://koya/user/{USERNAME}/ExampleFile s3a://{BUCKET_NAME}
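The per-command -D properties above can be collected into a small wrapper script, which makes the copy easier to review and repeat. The following is a sketch only: the variable values and the bucket name your-bucket are placeholders you must replace, and the command is printed for review rather than executed.

```shell
# Sketch: build the Hadoop-to-Object-Storage copy command from variables.
# ENDPOINT is the VPC private endpoint from this guide; the keys,
# the source path, and the bucket name are placeholders.
ENDPOINT="http://kr.object.private.ncloudstorage.com"
ACCESS_KEY="${ACCESS_KEY_ID:-YOUR_ACCESS_KEY_ID}"
SECRET_KEY="${SECRET_KEY:-YOUR_SECRET_KEY}"
SRC="hdfs://koya/user/{USERNAME}/ExampleFile"
DEST="s3a://your-bucket"

# Print the command first so it can be reviewed before running.
CMD="hadoop fs -Dfs.s3a.endpoint=${ENDPOINT} \
 -Dfs.s3a.access.key=${ACCESS_KEY} -Dfs.s3a.secret.key=${SECRET_KEY} \
 -Dfs.s3a.connection.ssl.enabled=false -cp ${SRC} ${DEST}"
echo "${CMD}"
# Uncomment to actually run the copy:
# eval "${CMD}"
```

Printing the command first lets you confirm the endpoint and paths before touching any data.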
Copy files to Object Storage using AWS CLI
You can access Object Storage on NAVER Cloud Platform using the CLI that AWS provides for S3.
For information on environment setup and commands to use the CLI, see Object Storage CLI user guide.
1. Configure authentication information
Connect to the VM and configure authentication information using the AWS CLI commands as shown below.
$ aws configure
AWS Access Key ID [****************leLy]: ACCESS_KEY_ID
AWS Secret Access Key [None]: SECRET_KEY
Default region name [None]: [Enter]
Default output format [None]: [Enter]
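If you prefer a non-interactive setup (for example, in a provisioning script), the AWS CLI also reads credentials from an INI-format file, and the standard AWS_SHARED_CREDENTIALS_FILE environment variable lets you point it at a custom path. The sketch below writes such a file with placeholder keys; substitute your actual Access Key ID and Secret Key.

```shell
# Sketch: non-interactive alternative to `aws configure`.
# Writes a credentials file in the INI format the AWS CLI reads.
# Using a custom path avoids touching an existing ~/.aws/credentials.
CRED_FILE="./aws_credentials_demo"
cat > "${CRED_FILE}" <<'EOF'
[default]
aws_access_key_id = ACCESS_KEY_ID
aws_secret_access_key = SECRET_KEY
EOF

# Tell the AWS CLI to use this file instead of the default location.
export AWS_SHARED_CREDENTIALS_FILE="${CRED_FILE}"
echo "credentials written to ${CRED_FILE}"
```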
2. Check your bucket information
After authentication is configured, use the CLI to view the list of buckets you created.
$ aws --endpoint-url=https://kr.object.private.ncloudstorage.com s3 ls
2020-06-24 11:09:41 bucket-1
2020-07-14 18:00:17 bucket-3
2020-09-17 19:37:36 bucket-4
2020-09-17 20:23:39 bucket-6
- When using the CLI, the --endpoint-url option is required.
- In a VPC environment, the Object Storage endpoint-url is kr.object.private.ncloudstorage.com.
3. Copy a single file
Use the S3 cp command to upload a file to a specific bucket.
$ aws --endpoint-url=http://kr.object.private.ncloudstorage.com s3 cp SOURCE_FILE s3://DEST_BUCKET/FILE_NAME
4. Copy large volumes of files
To synchronize the contents of a bucket with a directory, or between buckets, use the S3 sync command.
$ aws --endpoint-url=http://kr.object.private.ncloudstorage.com s3 sync SOURCE_DIR s3://DEST_BUCKET/
Use caution with the --delete option: files or objects that do not exist in the source are removed from the destination.
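Before running a sync, especially with --delete, you can preview its effect: the AWS CLI's standard --dryrun flag lists the operations sync would perform without executing them. In this sketch the bucket and directory names are placeholders, and the command is printed for review rather than executed.

```shell
# Sketch: preview a sync before applying it. --dryrun makes the
# AWS CLI list the planned uploads/deletions without performing them.
ENDPOINT="http://kr.object.private.ncloudstorage.com"
SRC_DIR="./data"
DEST="s3://your-bucket/"

SYNC_CMD="aws --endpoint-url=${ENDPOINT} s3 sync --dryrun ${SRC_DIR} ${DEST}"
echo "${SYNC_CMD}"
# Review the dry-run output, then remove --dryrun to apply the changes.
```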
To upload a directory and all of its subfiles to Object Storage at once, use the --recursive option of the S3 cp command.
$ aws --endpoint-url=http://kr.object.private.ncloudstorage.com s3 cp --recursive SOURCE_DIR s3://DEST_BUCKET/
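One way to sanity-check a bulk upload is to compare file counts on both sides. The local count below runs anywhere; the remote listing requires the configured CLI and network access to Object Storage, so it is shown only as the command to run (the bucket name is a placeholder).

```shell
# Sketch: compare the local file count against the bucket listing.
# A throwaway sample directory stands in for your real source data.
SRC_DIR="$(mktemp -d)"
mkdir -p "${SRC_DIR}/sub"
touch "${SRC_DIR}/a.txt" "${SRC_DIR}/sub/b.txt"

# Local side: count regular files under the source directory.
LOCAL_COUNT=$(find "${SRC_DIR}" -type f | wc -l | tr -d '[:space:]')
echo "local files: ${LOCAL_COUNT}"

# Remote side (run after the upload; requires the configured CLI):
echo "aws --endpoint-url=http://kr.object.private.ncloudstorage.com s3 ls --recursive s3://your-bucket/ | wc -l"
```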