Note: This walkthrough does not go into detailed parameter tuning.

For multi-GPU training, replace python tools/trainxxxx with:

python -m paddle.distributed.launch --selected_gpus 0,1,2,3 tools/trainxxxxxx

The 0,1,2,3 setting assumes four GPUs are available. If you are training with two GPUs, use 0,1 instead.

You can check how many GPUs are installed with nvidia-smi.

Using Baidu's PaddleDetection object detection toolkit as an example, the process below shows how to prepare a VOC-format dataset, adjust the configuration, and run training, evaluation, export, and inference.

Project layout

This is the repository structure on GitHub:

GitHub project structure

The parts used here are mainly organized like this:

Basic structure

Dataset setup

The example dataset is car_train, stored in VOC format.

VOC dataset example

JPEGImages contains all image files, and Annotations contains the annotation XML files.

You can create annotations with labelimg. Once the dataset has been organized in this format, model training can be prepared.

Before training

The model used here is YOLOv3. Its configuration files can be found under PaddleDetection/configs/ after cloning PaddleDetection.

However, there is no ready-made yolov3_mobilenet_v3_small_270e_voc file in that directory.

YOLOv3 config files

A simple way to handle this is to copy yolov3_mobilenet_v3_large_270e_voc.yml and rename it to yolov3_mobilenet_v3_small_270e_voc.yml:

cp yolov3_mobilenet_v3_large_270e_voc.yml yolov3_mobilenet_v3_small_270e_voc.yml

Then edit the copied small_270e configuration.

Before editing:

Before modification

After editing:

After modification

After that, modify voc.yml in the datasets directory located alongside the YOLOv3 configs.

YOLOv3 datasets directory

voc.yml configuration

A few fields here matter especially:

  • dataset_dir must point to your dataset path.
  • num_classes must match the number of labels exactly.
  • anno_path under TrainDataset is the file used during training.
  • anno_path under EvalDataset is the file used during evaluation.
  • label_list points to the category label file.

Configuration example:

metric: VOC map_type: 11point num_classes: 8 TrainDataset: !VOCDataSet dataset_dir: dataset/car_train anno_path: train.txt label_list: labels.txt data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult'] EvalDataset: !VOCDataSet dataset_dir: dataset/car_train anno_path: eval.txt label_list: labels.txt data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult'] TestDataset: !ImageFolder anno_path: dataset/car_train/labels.txt

Starting training

Once the following pieces are ready:

  • PaddleDetection has been downloaded
  • the dataset is prepared, including image files, annotation XML files, train.txt, eval.txt, and labels.txt
  • the training config has been adjusted

training can begin.

Train the model

python tools/train.py -c configs/yolov3/yolov3_mobilenet_v3_small_270e_voc.yml --use_vdl=True --eval

Resume interrupted training

python tools/train.py -c configs/yolov3/yolov3_mobilenet_v3_small_270e_voc.yml -r output/yolov3_mobilenet_v3_small_270e_voc/100

The 100 here should be replaced with the checkpoint where training stopped. For example, if the previous run was interrupted at epoch 19, then use 19 instead.

That checkpoint is located under the weight output directory defined earlier in yolov3_mobilenet_v3_small_270e_voc.yml.

Checkpoint example

Evaluate the model

python tools/eval.py -c configs/yolov3/yolov3_mobilenet_v3_small_270e_voc.yml -o weights=output/yolov3_mobilenet_v3_small_270e_voc/best_model

Export the model

python tools/export_model.py -c configs/yolov3/yolov3_mobilenet_v3_small_270e_voc.yml --output_dir=./inference_model -o weights=output/yolov3_mobilenet_v3_small_270e_voc/best_model

Run inference

python deploy/python/infer.py --model_dir=./inference_model/yolov3_mobilenet_v3_small_270e_voc --image_file=./street.jpg --device=GPU --threshold=0.2

File formats and helper script

train.txt format

The first column is the image path, and the second column is the corresponding XML annotation path.

JPEGImages/4457.jpg Annotations/4457.xml JPEGImages/212.jpg Annotations/212.xml JPEGImages/642.jpg Annotations/642.xml

eval.txt format

Same format as train.txt.

labels.txt format

bump cone bridge granary CrossWalk tractor corn pig

randlist.py

```import os import random import xml.dom.minidom lst = ['bump', 'cone', 'bridge', 'granary', 'CrossWalk', 'tractor', 'corn', 'pig'] def ReadFileDatas(): FileNamelist = [] file = open('train.txt','r+') for line in file: line = line.strip('\n') FileNamelist.append(line) #print('len ( FileNamelist ) = ' ,len(FileNamelist)) file.close() return FileNamelist

def WriteDatasToFile(listInfo): file_handle_train = open('train.txt',mode='w') file_handle_eval = open("eval.txt",mode='w') i = 0 for idx in range(len(listInfo)): str = listInfo[idx] ndex = str.rfind('_') str_Result = str + '\n' if(i%6 != 0): file_handle_train.write(str_Result) else: file_handle_eval.write(str_Result) i += 1 file_handle_train.close() file_handle_eval.close() path = './Annotations/' res = os.listdir(path) def WriteDataToFile(DataList): file_handle_train = open('train.txt',mode='w') file_handle_eval = open("eval.txt",mode='w') i = 0 for idx in range(len(DataList)): str = DataList[idx] if(i%6 != 0): file_handle_train.write(str+'\n') else: file_handle_eval.write(str+'\n') i += 1 file_handle_train.close() file_handle_eval.close() dataList = [] for i in res: #print("./Images/"+ str(i[0:-4:1]) + ".jpg "+ path + str(i)) dataList.append("./JPEGImages/"+ str(i[0:-4:1]) + ".jpg "+ path + str(i)) WriteDataToFile(DataList=dataList) listFileInfo = ReadFileDatas() random.shuffle(listFileInfo)# 打乱 WriteDatasToFile(listFileInfo)```

This script builds train.txt and eval.txt from the files in Annotations, then shuffles the entries and splits them so that every sixth item goes into evaluation while the rest go into training.