FOTS: Fast Oriented Text Spotting with a Unified Network

I am still working on this repo. Updates and detailed instructions are coming soon!

Table of Contents

TensorFlow Versions
Other Requirements
Trained Models
Datasets
Train
- Pre-train with SynthText
- Finetune with ICDAR 2015, ICDAR 2017 MLT or ICDAR 2013
Test
References

TensorFlow Versions

As of now, the pre-training code has been tested on TensorFlow 1.12, 1.14 and 1.15. I may implement a 2.x version in the future.
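A quick way to confirm that an installed TensorFlow matches one of the tested releases is to compare its major.minor version against the list above. The helper below is only a sketch; `TESTED_TF_VERSIONS` and `is_tested_tf_version` are hypothetical names and not part of this repo.

```python
# Hypothetical helper, not part of this repo.
# The tested releases are taken from the sentence above.
TESTED_TF_VERSIONS = ("1.12", "1.14", "1.15")


def is_tested_tf_version(version: str) -> bool:
    """Return True if `version` (e.g. "1.14.0") is a tested major.minor release."""
    major_minor = ".".join(version.split(".")[:2])
    return major_minor in TESTED_TF_VERSIONS
```

With TensorFlow installed, this could be called as `is_tested_tf_version(tf.__version__)` after `import tensorflow as tf`.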

Other Requirements

GCC >= 6

Trained Models

tmp pre-trained model
Trained model coming soon

Datasets

pre-training
Synth800k (the dataset is only available for non-commercial research and educational purposes)
finetuning
ICDAR 2015, ICDAR 2017 MLT, ICDAR 2013

Train

Pre-train with SynthText

Download the pre-trained ResNet-50 from the TensorFlow-Slim image classification model library page and place it in the ckpt/resnet_v1_50/ directory.

cd ckpt/resnet_v1_50
wget http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz
tar -zxvf resnet_v1_50_2016_08_28.tar.gz
rm resnet_v1_50_2016_08_28.tar.gz
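For environments without `wget` or `tar`, the same download, extract and clean-up steps can be sketched with the Python standard library. The function names `extract_archive` and `fetch_resnet50` are hypothetical and not part of this repo; the URL and target directory are the ones used in the commands above.

```python
import os
import tarfile
import urllib.request

# URL taken from the wget command above.
RESNET_URL = "http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz"


def extract_archive(archive_path: str, dest_dir: str) -> list:
    """Extract a .tar.gz archive into dest_dir and return its member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


def fetch_resnet50(dest_dir: str = "ckpt/resnet_v1_50") -> list:
    """Download the ResNet-50 archive, unpack it, then delete the archive."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(RESNET_URL))
    urllib.request.urlretrieve(RESNET_URL, archive_path)
    names = extract_archive(archive_path, dest_dir)
    os.remove(archive_path)  # mirrors the `rm` step above
    return names
```

Calling `fetch_resnet50()` from the repo root should leave the checkpoint under `ckpt/resnet_v1_50/`, which is where the pre-training command expects it.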

Download the Synth800k dataset and place it in the data/SynthText/ directory to pre-train the whole network.
Transform (pre-process) the SynthText data into the ICDAR data format.

python data_provider/SynthText2ICDAR.py
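To illustrate what such a conversion involves, here is a minimal sketch that formats one word box as an ICDAR-style ground-truth line. It assumes the ICDAR 2015 layout `x1,y1,x2,y2,x3,y3,x4,y4,transcription` and that each SynthText word box is given as four corner points. The function `word_box_to_icdar_line` is hypothetical, and the actual `SynthText2ICDAR.py` script may work differently.

```python
def word_box_to_icdar_line(xs, ys, text):
    """Format four corner points plus a transcription as one ICDAR-style line.

    xs, ys: sequences of four x and four y coordinates (same corner order).
    Assumed output layout: x1,y1,x2,y2,x3,y3,x4,y4,transcription
    """
    if len(xs) != 4 or len(ys) != 4:
        raise ValueError("expected four corner points")
    coords = []
    for x, y in zip(xs, ys):
        # Round to integer pixel coordinates.
        coords.extend([str(int(round(x))), str(int(round(y)))])
    return ",".join(coords + [text])
```

For example, `word_box_to_icdar_line([10.2, 50.7, 50.7, 10.2], [5.0, 5.0, 20.4, 20.4], "FOTS")` gives `10,5,51,5,51,20,10,20,FOTS`.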

Train with SynthText for 10 epochs (with 1 GPU).

python train.py \
  --max_steps=715625 \
  --gpu_list='0' \
  --checkpoint_path=ckpt/synthText_10eps/ \
  --pretrained_model_path=ckpt/resnet_v1_50/resnet_v1_50.ckpt \
  --training_img_data_dir=data/SynthText/ \
  --training_gt_data_dir=data/SynthText/ \
  --icdar=False
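The value `--max_steps=715625` is consistent with 10 epochs if one assumes that SynthText contains 858,750 images and that the batch size is 12. Neither number is stated in this README, so treat both as assumptions and check `config.py` for the actual batch size. The hypothetical helper below shows the arithmetic, which is useful when training for a different number of epochs or with a different batch size.

```python
def steps_for_epochs(num_images: int, epochs: int, batch_size: int) -> int:
    """Number of training steps needed to see `num_images` images `epochs` times."""
    total_samples = num_images * epochs
    # Ceiling division so that a final partial batch still counts as a step.
    return -(-total_samples // batch_size)


# Assumed values (not stated in this README): 858,750 images, batch size 12.
print(steps_for_epochs(num_images=858750, epochs=10, batch_size=12))  # 715625
```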

Visualize pre-training progress with TensorBoard.

tensorboard --logdir=ckpt/synthText_10eps/

Finetune with ICDAR 2015, ICDAR 2017 MLT or ICDAR 2013

(If you are using the pre-trained model, place all of its files in ckpt/synthText_10eps/.)

Combine ICDAR data before training.
1. Place ICDAR data under the tmp/ folder.
2. Run the following script to combine the data.
```
python combine_ICDAR_data.py --year [year of ICDAR to train(13 or 15 or 17)]
```

ICDAR 2017 MLT/pre-finetune for ICDAR 2013 or ICDAR 2015 (text detection task only)

Train the pre-trained model with 9,000 images from the ICDAR 2017 MLT training and validation datasets (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR17MLT/ \
  --pretrained_model_path=ckpt/synthText_10eps/ \
  --train_stage=0 \
  --training_img_data_dir=data/ICDAR17MLT/imgs/ \
  --training_gt_data_dir=data/ICDAR17MLT/gts/

ICDAR 2015

Train the model with 1,000 images from the ICDAR 2015 training dataset and 229 images from the ICDAR 2013 training dataset (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR15/ \
  --pretrained_model_path=ckpt/ICDAR17MLT/ \
  --training_img_data_dir=data/ICDAR15+13/imgs/ \
  --training_gt_data_dir=data/ICDAR15+13/gts/

ICDAR 2013 (horizontal text only)

Train the model with 229 images from the ICDAR 2013 training dataset (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR13/ \
  --pretrained_model_path=ckpt/ICDAR17MLT/ \
  --training_img_data_dir=data/ICDAR13/imgs/ \
  --training_gt_data_dir=data/ICDAR13/gts/

Test

Place some images in the test_imgs/ directory and specify a trained checkpoint path to see the test results.

python test.py --test_data_path test_imgs/ --checkpoint_path [checkpoint path]
