- The models in this repository were trained on the African Wildlife Dataset from Kaggle.
- The `data` folder must contain the `train` and `test` folders as follows:
  ```
  data
  ├── train
  │   └── class1
  │       ├── 001.jpg
  │       └── 001.txt (xml)
  ├── test
  │   └── class1
  │       ├── 001.jpg
  │       └── 001.txt (xml)
  ├── obj.data
  └── obj.names
  ```
- Also, for training, the `data` folder must contain an `obj.data` file with the following settings:
  ```
  classes = 4
  train = data/train
  valid = data/test
  names = data/obj.names
  backup = backup/
  file_format = txt
  convert_to_yolo = False
  ```
- There must also be an `obj.names` file with the label names:
  ```
  0 buffalo
  1 elephant
  2 rhino
  3 zebra
  ```
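For reference, here is a minimal sketch of how these two configuration files could be read in Python; the helper names are hypothetical and not part of the repository:

```python
# Hypothetical helpers for reading obj.data / obj.names (not part of this repo).
def load_obj_data(path="data/obj.data"):
    """Parse 'key = value' pairs from obj.data into a dict."""
    config = {}
    with open(path) as f:
        for line in f:
            if "=" in line:
                key, value = line.split("=", 1)
                config[key.strip()] = value.strip()
    return config

def load_obj_names(path="data/obj.names"):
    """Read label names, one per line; returns an index -> name mapping."""
    with open(path) as f:
        return {i: line.split()[-1] for i, line in enumerate(f) if line.strip()}

# Example usage:
# cfg = load_obj_data()                 # {'classes': '4', 'train': 'data/train', ...}
# names = load_obj_names(cfg["names"])  # {0: 'buffalo', 1: 'elephant', ...}
```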
- In the annotation files, you can write the bounding-box coordinates in the simple `(x1, y1, x2, y2)` format; in that case, use the appropriate flag when training. The YOLO format is also available and recommended. Format the files as follows:

  txt:
  ```
  <class> <xmin> <ymin> <xmax> <ymax>
  ```
  example:
  ```
  1 207 214 367 487
  ```
  txt (already converted to YOLO):
  ```
  <class> <xcenter> <ycenter> <width> <height>
  ```
  example:
  ```
  1 0.2 0.3 0.15 0.23
  ```
  xml example:
  ```xml
  <annotation>
    <object>
      <name>zebra</name>
      <bndbox>
        <xmin>71</xmin>
        <ymin>60</ymin>
        <xmax>175</xmax>
        <ymax>164</ymax>
      </bndbox>
    </object>
  </annotation>
  ```
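For reference, the two txt formats are related by a simple conversion; below is a minimal sketch (the function name and the example 640x480 image size are assumptions for illustration, not code from the repository):

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert corner coordinates (x1, y1, x2, y2) to normalized YOLO (xc, yc, w, h)."""
    xc = (xmin + xmax) / 2.0 / img_w
    yc = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h

# Example with the box from the txt example above, assuming a 640x480 image:
# voc_to_yolo(207, 214, 367, 487, 640, 480) -> (0.4484375, 0.7302083..., 0.25, 0.56875)
```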
- Before training, you should extract anchor-box clusters using k-means clustering:
  ```
  python3 clusterAnalysis.py
  ```
  Note: set the resulting anchors (example output below) as the `anchors` variable in `train.py` and `detect.py`.
  ```
  average IOU: 0.984
  anchor boxes:
  [[0.29782801 0.56151986]
   [0.50883439 0.68684481]
   [0.38934379 0.61309636]
   [0.7571919  0.75148152]
   [0.19968393 0.27542369]]
  ```
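For context, the anchor extraction can be sketched as k-means over normalized box widths and heights with 1 - IoU as the distance; this is a generic illustration, not the actual code in `clusterAnalysis.py`:

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between boxes and anchors, comparing only widths and heights."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = boxes[:, 0:1] * boxes[:, 1:2] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=5, iters=100, seed=0):
    """boxes: (N, 2) array of normalized (width, height) pairs; returns (k, 2) anchors."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)  # nearest anchor by IoU
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = boxes[assign == j].mean(axis=0)  # move anchor to cluster mean
    return anchors

# average IoU of a clustering, comparable to the value reported above:
# avg_iou = iou_wh(boxes, anchors).max(axis=1).mean()
```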
- Moving on to training:
  ```
  python3 train.py --epochs 100 --learning_rate 1e-5
  ```
  All training parameters:
  - `--epochs`: total epochs
  - `--learning_rate`: learning rate
  - `--batch_size`: batch size
  - `--weight_decay`: weight decay
  - `--yolo_weights`: path to YOLO PyTorch weights
  - `--darknet_weights`: path to Darknet19 binary weights (Darknet19 is the backbone CNN of YOLOv2)
  - `--multiscale_off`: disable multi-scale training
  - `--verbose`: show all losses and resolution changes
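For illustration, the options above could be declared with `argparse` roughly as follows; the default values are placeholders and the actual parser in `train.py` may differ:

```python
import argparse

# Sketch of a parser matching the documented train.py options (defaults are placeholders).
parser = argparse.ArgumentParser(description="YOLOv2 training")
parser.add_argument("--epochs", type=int, default=100, help="total epochs")
parser.add_argument("--learning_rate", type=float, default=1e-5, help="learning rate")
parser.add_argument("--batch_size", type=int, default=16, help="batch size")
parser.add_argument("--weight_decay", type=float, default=5e-4, help="weight decay")
parser.add_argument("--yolo_weights", type=str, help="path to YOLO PyTorch weights")
parser.add_argument("--darknet_weights", type=str, help="path to Darknet19 binary weights")
parser.add_argument("--multiscale_off", action="store_true", help="disable multi-scale training")
parser.add_argument("--verbose", action="store_true", help="show all losses and resolution changes")
args = parser.parse_args()
```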
- After training, mAP will be calculated on the train dataloader and the test dataloader.
  Note: you can change the thresholds in `train.py`.
- Moving on to detection. On video:
  ```
  python3 detect.py --video --data_test content/video.mp4 --output content/detect.mp4 --weights backup/yolov2.pt
  ```
  On an image:
  ```
  python3 detect.py --data_test content/image.jpg --output content/detect.jpg --weights backup/yolov2.pt
  ```
  Additional parameters:
  - `--show`: show frames during inference

  Note: you can change the thresholds in `detect.py`.
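For illustration, detections can be drawn on a frame with OpenCV as sketched below; the detection tuple format and the helper name are assumptions, not the API of `detect.py`:

```python
import cv2

def draw_detections(frame, detections, class_names):
    """Draw (xmin, ymin, xmax, ymax, class_id, score) boxes on a BGR frame."""
    for xmin, ymin, xmax, ymax, cls, score in detections:
        cv2.rectangle(frame, (int(xmin), int(ymin)), (int(xmax), int(ymax)), (0, 255, 0), 2)
        label = f"{class_names[int(cls)]} {score:.2f}"
        cv2.putText(frame, label, (int(xmin), int(ymin) - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return frame

# Example usage:
# frame = cv2.imread("content/image.jpg")
# frame = draw_detections(frame, [(207, 214, 367, 487, 1, 0.9)],
#                         ["buffalo", "elephant", "rhino", "zebra"])
# cv2.imwrite("content/detect.jpg", frame)
```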
| Model | Dataset | Input size (pixels) | mAP (@0.5) |
|---|---|---|---|
| YOLOv1 (Ours⭐) | African Wildlife | 448 | 61 |
| YOLOv2 (Ours⭐) | African Wildlife | 416 | 72 |
Requirements:
- PyTorch 1.13.1
- Albumentations 1.3.0
- OpenCV 4.7.0
- xmltodict 0.13.0
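The packages can be installed with pip, for example (package names and pinned builds are assumptions; the repository may specify them differently):

```
pip install torch==1.13.1 albumentations==1.3.0 "opencv-python==4.7.0.*" xmltodict==0.13.0
```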
- Training may be unstable during the first epochs.
- Be careful with multi-scale training.
- Darknet19 weights pretrained on ImageNet are recommended for all training runs.