Detecting objects from pedestrian datasets with PyTorch

    Available in VPC

    This guide explains how to write a program that detects pedestrian objects with PyTorch and submit the DL app as a single batch job.

    Step 1. Create account and app

    Step 2. Download dataset

    This example uses the Penn-Fudan pedestrian data set from the study Object Detection Combining Recognition and Segmentation (Liming Wang, Jianbo Shi, Gang Song, I-fan Shen, ACCV 2007) for model training.

    Download the data set from the Download data set link.
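    The training code in Step 5 expects the archive to extract to a PennFudanPed directory containing PNGImages and PedMasks subfolders, which is the layout of the original Penn-Fudan release. The following minimal sketch extracts the archive and confirms that layout; the file name PennFudanPed.zip is an assumption.

    import os
    import zipfile

    # Extract the downloaded archive (file name assumed to be PennFudanPed.zip).
    with zipfile.ZipFile("PennFudanPed.zip") as archive:
        archive.extractall(".")

    # Each image in PNGImages has a matching per-pedestrian mask in PedMasks.
    images = sorted(os.listdir("PennFudanPed/PNGImages"))
    masks = sorted(os.listdir("PennFudanPed/PedMasks"))
    print(f"{len(images)} images, {len(masks)} masks")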

    Step 3. Upload dataset to HDFS

    The following describes how to upload data sets to the user's HDFS.

    1. Log in to the HUE app.
      • Log in with the Data Forest account name and password.
    2. Click the [Upload] button at the upper right corner of the page.
    3. Click the [Select Files] button.
    4. Upload files.
    5. Check that the files have been uploaded to HDFS.
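    If an HDFS client configured for your Data Forest namespace is available (an assumption; this guide only covers the HUE web UI), the same upload can also be scripted. The following minimal sketch shells out to the standard hdfs CLI; replace {username} with your Data Forest account name.

    import subprocess

    local_file = "PennFudanPed.zip"          # example local file to upload
    hdfs_dir = "/user/{username}/data_in"    # target HDFS directory

    # Create the target directory if needed, upload the file, and list the result.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_file, hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-ls", hdfs_dir], check=True)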

    Step 4. Create workspace

    The following describes how to create a workspace.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest menus, in that order.
    2. Click AI Forest > Workspace > [Create workspace] > [Advanced workspace].
    3. Select a Data Forest account, set the workspace name, and then select "Singlebatch" for the workspace type.
    4. Select PyTorch for the Docker image, followed by v1.7 for the image version.
    Note

    PyTorch is an open source machine learning library for Python programs. For more information, refer to the PyTorch website.
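    Once a job runs inside the workspace container, the image version and GPU visibility can be confirmed with a quick check; this is a minimal sketch, not one of the console steps.

    import torch

    # Should report 1.7.x for the v1.7 Docker image selected above.
    print(torch.__version__)
    # True and a nonzero count if the GPU selected for the workspace is visible to PyTorch.
    print(torch.cuda.is_available(), torch.cuda.device_count())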

    5. Select the GPU model name, number of GPU cores, and memory capacity.
      We'll proceed with the default values in this example.
    6. Enter the information in the data settings area, and then click the [Add] button (see the path sketch after this list).
      • Input
        • InputPath: HDFS path of the input data to be copied into the container; enter '/user/{username}/data_in'
        • Input Container Local Path: path inside the container where the input data is placed
      • Output
        • OutputPath: HDFS path where the output data is stored; enter '/user/{username}/data_out'
        • Output Container Local Path: path inside the container where the output data is written
        • Overwrite: whether to overwrite existing files when storing the output data in HDFS
    7. Click the [Next] button. The workspace is created.
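    With the settings above, AI Forest copies the HDFS path given as InputPath into the container at the Input Container Local Path before the job starts, and copies the contents of the Output Container Local Path back to the HDFS OutputPath when the job finishes. The following minimal sketch shows how job code sees those paths; the container-local paths are hypothetical examples, not values prescribed by this guide.

    import os

    # Hypothetical container-local paths; use whatever you entered for
    # "Input Container Local Path" and "Output Container Local Path".
    input_dir = "./data_in"
    output_dir = "./data_out"

    # Input data copied from /user/{username}/data_in is already present here
    # when the job starts.
    print(os.listdir(input_dir))

    # Anything written under output_dir is copied back to /user/{username}/data_out
    # when the job ends (overwriting existing files if Overwrite is enabled).
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "run_marker.txt"), "w") as f:
        f.write("job finished\n")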

    Step 5. Download example code

    The following file contains the example code needed to run this example.

    Version         File
    PyTorch v1.7    od-torch1.zip

    The example imports a model pre-trained on the COCO 2017 data set; run.sh starts from those pre-trained parameters and fine-tunes the model with the downloaded pedestrian data set.

    • The COCO 2017 data set can be used for segmentation, captioning, and key point extraction, in addition to object detection. It contains a total of 330,000 images, of which 220,000 are labeled, covering 80 object categories.
    • Fine-tuning refers to retraining pre-trained weights with a data set suited to the target task.
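    The training code below builds the network through a helper called get_model_instance_segmentation, which is not included in the excerpt. The following sketch shows what such a helper typically looks like when following the standard torchvision fine-tuning approach; the actual helper shipped in od-torch1.zip may differ.

    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    def get_model_instance_segmentation(num_classes):
        # Load a Mask R-CNN model pre-trained on the COCO data set.
        model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

        # Replace the box predictor head so it outputs num_classes classes.
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

        # Replace the mask predictor head for the same number of classes.
        in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
        model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask,
                                                           256, num_classes)
        return model

    In this pedestrian example, num_classes would be 2: background and pedestrian.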

    The following is part of the code that trains the model.

        ...
        dataset = PennFudanDataset('PennFudanPed', get_transform(train=True))
        dataset_test = PennFudanDataset('PennFudanPed', get_transform(train=False))
        
        # split the dataset in train and test set
        indices = torch.randperm(len(dataset)).tolist()
        dataset = torch.utils.data.Subset(dataset, indices[:-50])
        dataset_test = torch.utils.data.Subset(dataset_test, indices[-50:])
    
        # define training and validation data loaders
        data_loader = torch.utils.data.DataLoader(
            dataset, batch_size=2, shuffle=True, num_workers=1,
            collate_fn=utils.collate_fn)
    
        data_loader_test = torch.utils.data.DataLoader(
            dataset_test, batch_size=1, shuffle=False, num_workers=1,
            collate_fn=utils.collate_fn)
    
        # get the model using our helper function
        model = get_model_instance_segmentation(num_classes)
    
        # move model to the right device
        model.to(device)
        print(device)
        # construct an optimizer
        params = [p for p in model.parameters() if p.requires_grad]
        optimizer = torch.optim.SGD(params, lr=0.005,
                                    momentum=0.9, weight_decay=0.0005)
    
        lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                       step_size=3,
                                                       gamma=0.1)
    
    
        num_epochs = FLAGS.max_steps
    
        for epoch in range(num_epochs):
            train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=5)
            lr_scheduler.step()
            # evaluate on the test dataset
            evaluate(model, data_loader_test, device=device)
    
        print("done training")
        torch.save(model.state_dict(), FLAGS.log_dir+'/model.pth')
        print("saved model")
        ...
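    After training, the saved weights can be reloaded for prediction. The following is a minimal inference sketch; the model path and image path are examples, and it assumes the training used two classes (background and pedestrian), as is typical for the Penn-Fudan example.

    import torch
    import torchvision
    from PIL import Image
    from torchvision.transforms import functional as F

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Rebuild the same architecture the training code used and load the weights
    # it saved with torch.save(); the file path here is an example.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(
        pretrained=False, num_classes=2)
    model.load_state_dict(torch.load("model.pth", map_location=device))
    model.to(device)
    model.eval()

    # Run the detector on one image; the output dict contains boxes, labels,
    # scores, and masks for each detected pedestrian.
    image = Image.open("PennFudanPed/PNGImages/FudanPed00001.png").convert("RGB")
    with torch.no_grad():
        prediction = model([F.to_tensor(image).to(device)])[0]
    print(prediction["boxes"])
    print(prediction["scores"])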
    

    Step 6. Upload to workspace browser

    Decompress the downloaded example file and upload it to the workspace browser.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest menus, in that order.
    2. Click AI Forest > Workspace browser.
    3. Select the account and workspace, and then click the [Upload] button.
    4. When the upload window appears, drag the files decompressed from "od-torch1.zip" into the upload window.
    5. Click the [Start upload] button.
    6. Click the [OK] button when the upload is completed.

    Step 7. Submit DL app as single batch job

    The following describes how to submit a DL app as a single batch job.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > AI Forest > Workspace browser menus, in that order.
    2. Select an account, and then a workspace.
    3. Select the checkbox of "run.sh," and then click the [Run] button.
    4. Enter the required information, such as the app name.
    5. Click the [OK] button. The DL app will run.

    Step 8. Check log and result of DL app

    The following describes how to view the DL app's execution logs and results after running it.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > App menus, in that order.
    2. Select an account, and then click the app whose details you want to view.
    3. Access the URL under Quick links > AppMaster UI in the app's details.
    4. When the login window appears, enter the account name and password you entered when creating the Data Forest account.
    5. In the Applications menu, find the application run under the app name you entered when running the DL app, and click its ID.
    6. Click Logs for the application ID to check the logs of the executed app.
    7. Check the DL app's result in {output HDFS path you entered}/{value passed as the --log_dir argument}.
