Detecting objects from pedestrian datasets with PyTorch

    Available in VPC

    This guide explains how to write a program that detects pedestrian objects with PyTorch and submit the DL app as a single batch job.

    Step 1. Create account and app

    Step 2. Download dataset

    This example uses the Penn-Fudan pedestrian data set from the study Object Detection Combining Recognition and Segmentation (Liming Wang, Jianbo Shi, Gang Song, I-fan Shen, ACCV 2007) for model training.

    Download the data set from the Download data set link.
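    The training code in Step 5 expects the archive to extract to a PennFudanPed directory containing PNGImages and PedMasks subfolders, which is the layout of the original Penn-Fudan release. The following minimal sketch extracts the archive and confirms that layout; the file name PennFudanPed.zip is an assumption.

    import os
    import zipfile

    # Extract the downloaded archive (file name assumed to be PennFudanPed.zip).
    with zipfile.ZipFile("PennFudanPed.zip") as archive:
        archive.extractall(".")

    # Each image in PNGImages has a matching per-pedestrian mask in PedMasks.
    images = sorted(os.listdir("PennFudanPed/PNGImages"))
    masks = sorted(os.listdir("PennFudanPed/PedMasks"))
    print(f"{len(images)} images, {len(masks)} masks")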

    Step 3. Upload dataset to HDFS

    The following describes how to upload data sets to the user's HDFS.

    1. Log in to the HUE app.
      • Log in with the Data Forest account name and password.
    2. Click the [Upload] button at the upper right corner of the page.
    3. Click the [Select Files] button.
    4. Upload files.
    5. Check that the files have been uploaded to HDFS.
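    If an HDFS client configured for your Data Forest namespace is available (an assumption; this guide only covers the HUE web UI), the same upload can also be scripted. The following minimal sketch shells out to the standard hdfs CLI; replace {username} with your Data Forest account name.

    import subprocess

    local_file = "PennFudanPed.zip"          # example local file to upload
    hdfs_dir = "/user/{username}/data_in"    # target HDFS directory

    # Create the target directory if needed, upload the file, and list the result.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_file, hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-ls", hdfs_dir], check=True)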

    Step 4. Create workspace

    The following describes how to create a workspace.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest menus, in that order.
    2. Click AI Forest > Workspace > [Create workspace] > [Advanced workspace].
    3. Select a Data Forest account, set the workspace name, and then select "Singlebatch" for the workspace type.
    4. Select PyTorch for the Docker image, followed by v1.7 for the image version.
    Note

    PyTorch is an open source machine learning library for Python programs. For more information, refer to the PyTorch website.
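    Once a job runs inside the workspace container, the image version and GPU visibility can be confirmed with a quick check; this is a minimal sketch, not one of the console steps.

    import torch

    # Should report 1.7.x for the v1.7 Docker image selected above.
    print(torch.__version__)
    # True and a nonzero count if the GPU selected for the workspace is visible to PyTorch.
    print(torch.cuda.is_available(), torch.cuda.device_count())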

    5. Select the GPU model name, number of GPU cores, and memory capacity.
      We'll proceed with the default values in this example.
    6. Enter the information in the data settings area, and then click the [Add] button (see the path sketch after this list).
      • Input
        • InputPath: HDFS path of the input data to be copied into the container; enter '/user/{username}/data_in'
        • Input Container Local Path: path inside the container where the input data is placed
      • Output
        • OutputPath: HDFS path where the output data is stored; enter '/user/{username}/data_out'
        • Output Container Local Path: path inside the container where the output data is written
        • Overwrite: whether to overwrite existing files when storing the output data in HDFS
    7. Click the [Next] button. The workspace is created.
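    With the settings above, AI Forest copies the HDFS path given as InputPath into the container at the Input Container Local Path before the job starts, and copies the contents of the Output Container Local Path back to the HDFS OutputPath when the job finishes. The following minimal sketch shows how job code sees those paths; the container-local paths are hypothetical examples, not values prescribed by this guide.

    import os

    # Hypothetical container-local paths; use whatever you entered for
    # "Input Container Local Path" and "Output Container Local Path".
    input_dir = "./data_in"
    output_dir = "./data_out"

    # Input data copied from /user/{username}/data_in is already present here
    # when the job starts.
    print(os.listdir(input_dir))

    # Anything written under output_dir is copied back to /user/{username}/data_out
    # when the job ends (overwriting existing files if Overwrite is enabled).
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "run_marker.txt"), "w") as f:
        f.write("job finished\n")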

    Step 5. Download example code

    The following file contains the example code needed to run this example.

    Version         File
    PyTorch v1.7    od-torch1.zip

    The example imports a model pre-trained on the COCO 2017 data set; run.sh starts from those pre-trained parameters and fine-tunes the model with the downloaded pedestrian data set.

    • The COCO 2017 data set can be used for segmentation, captioning, and key point extraction, in addition to object detection. It contains a total of 330,000 images, of which 220,000 are labeled, covering 80 object categories.
    • Fine-tuning refers to retraining pre-trained weights with a data set suited to the target task.
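    The training code below builds the network through a helper called get_model_instance_segmentation, which is not included in the excerpt. The following sketch shows what such a helper typically looks like when following the standard torchvision fine-tuning approach; the actual helper shipped in od-torch1.zip may differ.

    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    def get_model_instance_segmentation(num_classes):
        # Load a Mask R-CNN model pre-trained on the COCO data set.
        model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

        # Replace the box predictor head so it outputs num_classes classes.
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

        # Replace the mask predictor head for the same number of classes.
        in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
        model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask,
                                                           256, num_classes)
        return model

    In this pedestrian example, num_classes would be 2: background and pedestrian.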

    The following is part of the code that trains the model.

        ...
        dataset = PennFudanDataset('PennFudanPed', get_transform(train=True))
        dataset_test = PennFudanDataset('PennFudanPed', get_transform(train=False))
        
        # split the dataset in train and test set
        indices = torch.randperm(len(dataset)).tolist()
        dataset = torch.utils.data.Subset(dataset, indices[:-50])
        dataset_test = torch.utils.data.Subset(dataset_test, indices[-50:])
    
        # define training and validation data loaders
        data_loader = torch.utils.data.DataLoader(
            dataset, batch_size=2, shuffle=True, num_workers=1,
            collate_fn=utils.collate_fn)
    
        data_loader_test = torch.utils.data.DataLoader(
            dataset_test, batch_size=1, shuffle=False, num_workers=1,
            collate_fn=utils.collate_fn)
    
        # get the model using our helper function
        model = get_model_instance_segmentation(num_classes)
    
        # move model to the right device
        model.to(device)
        print(device)
        # construct an optimizer
        params = [p for p in model.parameters() if p.requires_grad]
        optimizer = torch.optim.SGD(params, lr=0.005,
                                    momentum=0.9, weight_decay=0.0005)
    
        lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                       step_size=3,
                                                       gamma=0.1)
    
    
        num_epochs = FLAGS.max_steps
    
        for epoch in range(num_epochs):
            train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=5)
            lr_scheduler.step()
            # evaluate on the test dataset
            evaluate(model, data_loader_test, device=device)
    
        print("done training")
        torch.save(model.state_dict(), FLAGS.log_dir+'/model.pth')
        print("saved model")
        ...
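    After training, the saved weights can be reloaded for prediction. The following is a minimal inference sketch; the model path and image path are examples, and it assumes the training used two classes (background and pedestrian), as is typical for the Penn-Fudan example.

    import torch
    import torchvision
    from PIL import Image
    from torchvision.transforms import functional as F

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Rebuild the same architecture the training code used and load the weights
    # it saved with torch.save(); the file path here is an example.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(
        pretrained=False, num_classes=2)
    model.load_state_dict(torch.load("model.pth", map_location=device))
    model.to(device)
    model.eval()

    # Run the detector on one image; the output dict contains boxes, labels,
    # scores, and masks for each detected pedestrian.
    image = Image.open("PennFudanPed/PNGImages/FudanPed00001.png").convert("RGB")
    with torch.no_grad():
        prediction = model([F.to_tensor(image).to(device)])[0]
    print(prediction["boxes"])
    print(prediction["scores"])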
    

    Step 6. Upload to workspace browser

    Decompress the downloaded example file and upload it to the workspace browser.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest menus, in that order.
    2. Click AI Forest > Workspace browser.
    3. Select the account and workspace, and then click the [Upload] button.
    4. When the upload window appears, drag the files decompressed from "od-torch1.zip" into the upload window.
    5. Click the [Start upload] button.
    6. Click the [OK] button when the upload is completed.

    Step 7. Submit DL app as single batch job

    The following describes how to submit a DL app as a single batch job.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > AI Forest > Workspace browser menus, in that order.
    2. Select an account, and then a workspace.
    3. Select the checkbox of "run.sh," and then click the [Run] button.
    4. Enter the required information, such as the app name.
    5. Click the [OK] button. The DL app will run.

    Step 8. Check log and result of DL app

    The following describes how to view the DL app's execution logs and results after running it.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > App menus, in that order.
    2. Select an account, and then click the app whose details you want to view.
    3. Access the URL under Quick links > AppMaster UI in the app's details.
    4. When the login window appears, enter the account name and password you entered when creating the Data Forest account.
    5. In the Applications menu, find the application run under the app name you entered when running the DL app, and click its ID.
    6. Click Logs for the application ID to check the logs of the executed app.
    7. Check the DL app's result in {output HDFS path you entered}/{value passed as the --log_dir argument}.
