This page explains how to train your own custom data with YOLOX.
We take an example of fine-tuning YOLOX-S model on VOC dataset to give a more clear guide.
Clone this repo and follow the README to install YOLOX.
Step 1 Prepare your own dataset with images and labels first. For labeling images, you can use tools like Labelme or CVAT.
Step 2 Then, you should write the corresponding Dataset Class which can load images and labels through __getitem__
method. We currently support COCO format and VOC format.
You can also write the Dataset by your own. Let's take the VOC Dataset file for example:
@Dataset.resize_getitem
def __getitem__(self, index):
img, target, img_info, img_id = self.pull_item(index)
if self.preproc is not None:
img, target = self.preproc(img, target, self.input_dim)
return img, target, img_info, img_id
One more thing worth noting is that you should also implement pull_item and load_anno method for the Mosiac
and MixUp
augmentations.
Step 3 Prepare the evaluator. We currently have COCO evaluator and VOC evaluator. If you have your own format data or evaluation metric, you can write your own evaluator.
Step 4 Put your dataset under $YOLOX_DIR/datasets
, for VOC:
ln -s /path/to/your/VOCdevkit ./datasets/VOCdevkit
- The path "VOCdevkit" will be used in your exp file described in next section. Specifically, in
get_data_loader
andget_eval_loader
function.
We put everything involved in a model to one single Exp file, including model setting, training setting, and testing setting.
A complete Exp file is at yolox_base.py. It may be too long to write for every exp, but you can inherit the base Exp file and only overwrite the changed part.
Let's take the VOC Exp file as an example.
We select YOLOX-S
model here, so we should change the network depth and width. VOC has only 20 classes, so we should also change the num_classes
.
These configs are changed in the init()
method:
class Exp(MyExp):
def __init__(self):
super(Exp, self).__init__()
self.num_classes = 20
self.depth = 0.33
self.width = 0.50
self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
Besides, you should also overwrite the dataset
and evaluator
, prepared before training the model on your own data.
Please see get_data_loader, get_eval_loader, and get_evaluator for more details.
Except special cases, we always recommend to use our COCO pretrained weights for initializing the model.
Once you get the Exp file and the COCO pretrained weights we provided, you can train your own model by the following below command:
python tools/train.py -f /path/to/your/Exp/file -d 8 -b 64 --fp16 -o -c /path/to/the/pretrained/weights
or take the YOLOX-S
VOC training for example:
python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 8 -b 64 --fp16 -o -c /path/to/yolox_s.pth.tar
(Don't worry for the different shape of detection head between the pretrained weights and your own model, we will handle it)
As YOLOX is an anchor-free detector with only several hyper-parameters, most of the time good results can be obtained with no changes to the models or training settings. We thus always recommend you first train with all default training settings.
If at first you don't get good results, there are steps you could consider to improve the model.
Model Selection We provide YOLOX-Nano
, YOLOX-Tiny
, and YOLOX-S
for mobile deployments, while YOLOX-M
/L
/X
for cloud or high performance GPU deployments.
If your deployment meets any compatibility issues. we recommend YOLOX-DarkNet53
.
Training Configs If your training overfits early, then you can reduce max_epochs or decrease the base_lr and min_lr_ratio in your Exp file:
# -------------- training config --------------------- #
self.warmup_epochs = 5
self.max_epoch = 300
self.warmup_lr = 0
self.basic_lr_per_img = 0.01 / 64.0
self.scheduler = "yoloxwarmcos"
self.no_aug_epochs = 15
self.min_lr_ratio = 0.05
self.ema = True
self.weight_decay = 5e-4
self.momentum = 0.9
Aug Configs You may also change the degree of the augmentations.
Generally, for small models, you should weak the aug, while for large models or small size of dataset, you may enchance the aug in your Exp file:
# --------------- transform config ----------------- #
self.degrees = 10.0
self.translate = 0.1
self.scale = (0.1, 2)
self.mscale = (0.8, 1.6)
self.shear = 2.0
self.perspective = 0.0
self.enable_mixup = True
Design your own detector You may refer to our Arxiv paper for details and suggestions for designing your own detector.