
Commit b59b387

Implementation of DebiasMatch and Debiased Self-Training for the
semi-supervised learning task

1 parent 2f88159

File tree: 7 files changed (+1128, −14 lines)


docs/tllib/self_training.rst

Lines changed: 8 additions & 0 deletions

@@ -113,3 +113,11 @@ FlexMatch
.. autoclass:: tllib.self_training.flexmatch.DynamicThresholdingModule
   :members:

+.. _DST:
+
+Debiased Self-Training
+-----------------------------
+
+.. autoclass:: tllib.self_training.dst.ImageClassifier
+
+.. autoclass:: tllib.self_training.dst.WorstCaseEstimationLoss
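The `WorstCaseEstimationLoss` added here is the heart of DST's debiasing. Below is a minimal sketch of the idea, assuming an auxiliary head fed through a gradient reversal layer; the function name, signature, and `eta_prime` weight are illustrative assumptions, not the class's exact API.

```python
# Sketch of DST's worst-case estimation (illustrative, not the tllib code).
# An auxiliary "worst-case" head is trained to fit the labeled data while
# disagreeing with the main head's pseudo-labels on unlabeled data; a
# gradient reversal layer upstream (assumed here) then makes the feature
# extractor minimize this worst-case error.
import torch
import torch.nn.functional as F

def worst_case_estimation_loss(y_l_adv, labels_l, y_u, y_u_adv, eta_prime=2.0):
    # Labeled data: the worst-case head should still predict the ground truth.
    loss_l = eta_prime * F.cross_entropy(y_l_adv, labels_l)
    # Unlabeled data: lower the probability the worst-case head assigns to
    # each pseudo-label, i.e. maximize its disagreement with the main head.
    pseudo_labels = y_u.detach().argmax(dim=1)
    p_adv = F.softmax(y_u_adv, dim=1)
    loss_u = F.nll_loss(torch.log(1.0 - p_adv + 1e-6), pseudo_labels)
    return loss_l + loss_u
```

The gap between the main head and this adversarially trained head serves as an estimate of how unreliable the current pseudo-labels are.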

examples/semi_supervised_learning/image_classification/README.md

Lines changed: 95 additions & 14 deletions

@@ -34,49 +34,102 @@ Supported methods include:
- [Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks (Pseudo Label, ICML 2013)](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.664.3543&rep=rep1&type=pdf)
- [Temporal Ensembling for Semi-Supervised Learning (Pi Model, ICLR 2017)](https://arxiv.org/abs/1610.02242)
- [Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results (Mean Teacher, NIPS 2017)](https://arxiv.org/abs/1703.01780)
+- [Self-Training With Noisy Student Improves ImageNet Classification (Noisy Student, CVPR 2020)](https://openaccess.thecvf.com/content_CVPR_2020/papers/Xie_Self-Training_With_Noisy_Student_Improves_ImageNet_Classification_CVPR_2020_paper.pdf)
- [Unsupervised Data Augmentation for Consistency Training (UDA, NIPS 2020)](https://arxiv.org/pdf/1904.12848v4.pdf)
- [FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence (FixMatch, NIPS 2020)](https://arxiv.org/abs/2001.07685)
-- [Self-Tuning for Data-Efficient Deep Learning (self-tuning, ICML 2021)](http://ise.thss.tsinghua.edu.cn/~mlong/doc/Self-Tuning-for-Data-Efficient-Deep-Learning-icml21.pdf)
+- [Self-Tuning for Data-Efficient Deep Learning (Self-Tuning, ICML 2021)](http://ise.thss.tsinghua.edu.cn/~mlong/doc/Self-Tuning-for-Data-Efficient-Deep-Learning-icml21.pdf)
+- [FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling (FlexMatch, NIPS 2021)](https://arxiv.org/abs/2110.08263)
+- [Debiased Learning From Naturally Imbalanced Pseudo-Labels (DebiasMatch, CVPR 2022)](https://openaccess.thecvf.com/content/CVPR2022/papers/Wang_Debiased_Learning_From_Naturally_Imbalanced_Pseudo-Labels_CVPR_2022_paper.pdf)
+- [Debiased Self-Training for Semi-Supervised Learning (DST)](https://arxiv.org/abs/2202.07136)
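Both methods added by this commit target pseudo-label bias. One reading of the DebiasMatch core, sketched below under the assumption of logit-adjustment-style debiasing (the paper also proposes an adaptive marginal loss, omitted here), keeps a running mean of the model's predictions on unlabeled data and counteracts it before thresholding.

```python
# Hypothetical sketch of debiased pseudo-labeling in the DebiasMatch spirit;
# class and parameter names are illustrative, not the repository's API.
import torch
import torch.nn.functional as F

class DebiasedPseudoLabeler:
    def __init__(self, num_classes, momentum=0.999, lam=1.0, threshold=0.95):
        self.p_avg = torch.ones(num_classes) / num_classes  # EMA of mean prediction
        self.momentum, self.lam, self.threshold = momentum, lam, threshold

    @torch.no_grad()
    def __call__(self, logits_weak):
        probs = F.softmax(logits_weak, dim=1)
        self.p_avg = self.momentum * self.p_avg + (1 - self.momentum) * probs.mean(dim=0)
        # Subtract the log of the running mean: classes the model already
        # favors now need higher confidence to pass the threshold.
        debiased = F.softmax(logits_weak - self.lam * torch.log(self.p_avg + 1e-8), dim=1)
        confidence, pseudo_labels = debiased.max(dim=1)
        return pseudo_labels, confidence >= self.threshold
```

Counteracting the running mean flattens the naturally imbalanced pseudo-label distribution that the paper's title refers to.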

## Usage

### Semi-supervised learning with supervised pre-trained model
-The shell files give the script to train with supervised pre-trained model with specified hyper-parameters.
-For example, if you want to train UDA on CIFAR100, use the following script
+
+The shell files give the scripts to train with a supervised pre-trained model and specified hyper-parameters. For example,
+if you want to train UDA on CIFAR100, use the following script:

```shell script
# Semi-supervised learning on CIFAR100 (ResNet50, 400 labels).
# Assume you have put the datasets under the path `data/cifar100`,
# or you are glad to download the datasets automatically from the Internet to this path.
CUDA_VISIBLE_DEVICES=0 python uda.py data/cifar100 -d CIFAR100 --train-resizing 'cifar' --val-resizing 'cifar' \
-  --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 --finetune --lr 0.01 \
-  -a resnet50 --seed 0 --log logs/uda/cifar100_4_labels_per_class
+  --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 -a resnet50 \
+  --lr 0.003 --finetune --threshold 0.7 --seed 0 --log logs/uda/cifar100_4_labels_per_class
```
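The `--threshold 0.7` flag sets the confidence threshold for accepting pseudo-labels on unlabeled data. Here is a generic FixMatch-style sketch of what such a mask does; the repository's `uda.py` computes a consistency loss between augmented views, so its exact form may differ.

```python
# Generic confidence-masked consistency loss (illustrative, not uda.py).
import torch.nn.functional as F

def masked_consistency_loss(logits_weak, logits_strong, threshold=0.7):
    probs = F.softmax(logits_weak.detach(), dim=1)  # stop-gradient on targets
    confidence, pseudo_labels = probs.max(dim=1)
    mask = (confidence >= threshold).float()  # drop low-confidence samples
    loss = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    return (loss * mask).mean()
```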
-Following common practice in semi-supervised learning, we select a class-balanced subset as the labeled dataset and
-treat other samples as unlabeled data. In the above command, `num-samples-per-class` specifies how many
-labeled samples for each class. Note that the labeled subset is **deterministic with the same random seed**. Hence, if
-you want to compare different algorithms with the same labeled subset, you can simply pass in the same random seed.
+
+Following common practice in semi-supervised learning, we select a class-balanced subset as the labeled dataset and
+treat the other samples as unlabeled data. In the above command, `num-samples-per-class` specifies how many labeled
+samples are kept for each class. Note that the labeled subset is **deterministic with the same random seed**. Hence, if
+you want to compare different algorithms with the same labeled subset, you can simply pass in the same random seed.
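The deterministic, class-balanced split described above can be reproduced by seeding the sampler. A hypothetical illustration (`split_labeled_unlabeled` is not a function in this repository):

```python
# Hypothetical sketch of a seeded, class-balanced labeled/unlabeled split.
import random
from collections import defaultdict

def split_labeled_unlabeled(targets, num_samples_per_class, seed=0):
    """targets: list of class indices, one per sample."""
    rng = random.Random(seed)  # same seed -> same labeled subset
    by_class = defaultdict(list)
    for idx, label in enumerate(targets):
        by_class[label].append(idx)
    labeled = []
    for label in sorted(by_class):  # fixed class order for determinism
        labeled += rng.sample(by_class[label], num_samples_per_class)
    unlabeled = sorted(set(range(len(targets))) - set(labeled))
    return sorted(labeled), unlabeled
```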

### Semi-supervised learning with unsupervised pre-trained model
+
Take MoCo as an example.

1. Download MoCo pretrained checkpoints from https://github.com/facebookresearch/moco
-2. Convert the format of the MoCo checkpoints to the standard format of pytorch
+2. Convert the MoCo checkpoints to the standard PyTorch format (see the sketch after these steps)
+
```shell
mkdir checkpoints
python convert_moco_to_pretrained.py checkpoints/moco_v2_800ep_pretrain.pth.tar checkpoints/moco_v2_800ep_backbone.pth checkpoints/moco_v2_800ep_fc.pth
```
+
3. Start training
+
```shell
CUDA_VISIBLE_DEVICES=0 python erm.py data/cifar100 -d CIFAR100 --train-resizing 'cifar' --val-resizing 'cifar' \
-  --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 --finetune \
-  -a resnet50 --seed 0 --log logs/erm_moco_pretrain/cifar100_4_labels_per_class --lr-scheduler cos -i 2000 \
-  --pretrained-backbone checkpoints/moco_v2_800ep_backbone.pth
+  --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 -a resnet50 \
+  --pretrained-backbone checkpoints/moco_v2_800ep_backbone.pth \
+  --lr 0.001 --finetune --lr-scheduler cos --seed 0 --log logs/erm_moco_pretrain/cifar100_4_labels_per_class
```
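At minimum, the conversion in step 2 has to split the MoCo checkpoint into backbone and classifier weights under standard PyTorch key names. A rough sketch, assuming the usual MoCo v2 layout with query-encoder weights under `module.encoder_q.`; the repository's `convert_moco_to_pretrained.py` may differ in details:

```python
# Sketch of MoCo v2 checkpoint conversion (illustrative; details may differ).
import torch

def convert_moco(ckpt_path, backbone_path, fc_path):
    state_dict = torch.load(ckpt_path, map_location="cpu")["state_dict"]
    backbone, fc = {}, {}
    for key, value in state_dict.items():
        if not key.startswith("module.encoder_q."):
            continue  # skip the momentum encoder and the queue
        new_key = key[len("module.encoder_q."):]
        if new_key.startswith("fc."):
            fc[new_key] = value  # projection / classification head
        else:
            backbone[new_key] = value  # ResNet-50 backbone weights
    torch.save(backbone, backbone_path)
    torch.save(fc, fc_path)
```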

## Experiment and Results

-We are running experiments and will release the results soon.
+**Notations**
+
+- `Avg` is the average accuracy over the eleven datasets, as reported by `TLlib`.
+- `ERM` refers to the model trained with only the labeled data.
+- `Oracle` refers to the model trained using all data as labeled data.
+
+Below are the results of the implemented methods. Other than _Oracle_, we randomly sample 4 labels per category.
+
+### ImageNet Supervised Pre-training (ResNet-50)
+
+| Methods      | Food101 | CIFAR10 | CIFAR100 | CUB200 | Aircraft | Cars | SUN397 | DTD  | Pets | Flowers | Caltech | Avg  |
+|--------------|---------|---------|----------|--------|----------|------|--------|------|------|---------|---------|------|
+| ERM          | 33.6    | 59.4    | 47.9     | 48.6   | 29.0     | 37.1 | 40.9   | 50.5 | 82.2 | 87.6    | 82.2    | 54.5 |
+| Pseudo Label | 36.9    | 62.8    | 52.5     | 54.9   | 30.4     | 40.4 | 41.7   | 54.1 | 89.6 | 93.5    | 85.1    | 58.4 |
+| Pi Model     | 34.2    | 66.9    | 48.5     | 47.9   | 26.7     | 37.4 | 40.9   | 51.9 | 83.5 | 92.0    | 82.2    | 55.6 |
+| Mean Teacher | 40.4    | 78.1    | 58.5     | 52.8   | 32.0     | 45.6 | 40.2   | 53.8 | 86.8 | 92.8    | 83.7    | 60.4 |
+| UDA          | 41.9    | 73.0    | 59.8     | 55.4   | 33.5     | 42.7 | 42.1   | 49.7 | 88.0 | 93.4    | 85.3    | 60.4 |
+| FixMatch     | 36.2    | 74.5    | 58.0     | 52.6   | 27.1     | 44.8 | 40.8   | 50.2 | 87.8 | 93.6    | 83.2    | 59.0 |
+| Self Tuning  | 41.4    | 70.9    | 57.2     | 60.5   | 37.0     | 59.8 | 43.5   | 51.7 | 88.4 | 93.5    | 89.1    | 63.0 |
+| FlexMatch    | 48.1    | 94.2    | 69.2     | 65.1   | 38.0     | 55.3 | 50.2   | 55.6 | 91.5 | 94.6    | 89.4    | 68.3 |
+| DebiasMatch  | 57.1    | 92.4    | 69.0     | 66.2   | 41.5     | 65.4 | 48.3   | 54.2 | 90.2 | 95.4    | 89.3    | 69.9 |
+| DST          | 58.1    | 93.5    | 67.8     | 68.6   | 44.9     | 68.6 | 47.0   | 56.3 | 91.5 | 95.1    | 90.3    | 71.1 |
+| Oracle       | 85.5    | 97.5    | 86.3     | 81.1   | 85.1     | 91.1 | 64.1   | 68.8 | 93.2 | 98.1    | 92.6    | 85.8 |
+
+### ImageNet Unsupervised Pre-training (ResNet-50, MoCo v2)
+
+| Methods      | Food101 | CIFAR10 | CIFAR100 | CUB200 | Aircraft | Cars | SUN397 | DTD  | Pets | Flowers | Caltech | Avg  |
+|--------------|---------|---------|----------|--------|----------|------|--------|------|------|---------|---------|------|
+| ERM          | 33.5    | 63.0    | 50.8     | 39.4   | 28.1     | 40.3 | 40.7   | 53.7 | 65.4 | 87.5    | 82.8    | 53.2 |
+| Pseudo Label | 33.6    | 71.9    | 53.8     | 42.7   | 30.9     | 51.2 | 41.2   | 55.2 | 69.3 | 94.2    | 86.2    | 57.3 |
+| Pi Model     | 32.7    | 77.9    | 50.9     | 33.6   | 27.2     | 34.4 | 41.1   | 54.9 | 66.7 | 91.4    | 84.1    | 54.1 |
+| Mean Teacher | 36.8    | 79.0    | 56.7     | 43.0   | 33.0     | 53.9 | 39.5   | 54.5 | 67.8 | 92.7    | 83.3    | 58.2 |
+| UDA          | 39.5    | 91.3    | 60.0     | 41.9   | 36.2     | 39.7 | 41.7   | 51.5 | 71.0 | 93.7    | 86.5    | 59.4 |
+| FixMatch     | 44.3    | 86.1    | 58.0     | 42.7   | 38.0     | 55.4 | 42.4   | 53.1 | 67.9 | 95.2    | 83.4    | 60.6 |
+| Self Tuning  | 34.0    | 63.6    | 51.7     | 43.3   | 32.2     | 50.2 | 40.7   | 52.7 | 68.2 | 91.8    | 87.7    | 56.0 |
+| FlexMatch    | 50.2    | 96.6    | 69.2     | 49.4   | 41.3     | 62.5 | 47.2   | 54.5 | 72.4 | 94.8    | 89.4    | 66.1 |
+| DebiasMatch  | 54.2    | 95.5    | 68.1     | 49.1   | 40.9     | 73.0 | 47.6   | 54.4 | 76.6 | 95.5    | 88.7    | 67.6 |
+| DST          | 57.1    | 95.0    | 68.2     | 53.6   | 47.7     | 72.0 | 46.8   | 56.0 | 76.3 | 95.6    | 90.1    | 68.9 |
+| Oracle       | 87.0    | 98.2    | 87.9     | 80.6   | 88.7     | 92.7 | 63.9   | 73.8 | 90.6 | 97.8    | 93.1    | 86.8 |
+
+## TODO
+
+1. Support multi-GPU training
+2. Add training-from-scratch code and results

## Citation

@@ -104,6 +157,13 @@ If you use these methods in your research, please consider citing.
    year={2017}
}
+
+@inproceedings{noisy_student,
+    title={Self-Training with Noisy Student Improves ImageNet Classification},
+    author={Xie, Qizhe and Luong, Minh-Thang and Hovy, Eduard and Le, Quoc V},
+    booktitle={CVPR},
+    year={2020}
+}
+
@inproceedings{UDA,
    title={Unsupervised Data Augmentation for Consistency Training},
    author={Xie, Qizhe and Dai, Zihang and Hovy, Eduard and Luong, Thang and Le, Quoc},

@@ -124,4 +184,25 @@ If you use these methods in your research, please consider citing.
    booktitle={ICML},
    year={2021}
}
+
+@inproceedings{FlexMatch,
+    title={FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling},
+    author={Zhang, Bowen and Wang, Yidong and Hou, Wenxin and Wu, Hao and Wang, Jindong and Okumura, Manabu and Shinozaki, Takahiro},
+    booktitle={NeurIPS},
+    year={2021}
+}
+
+@inproceedings{DebiasMatch,
+    title={Debiased Learning from Naturally Imbalanced Pseudo-Labels},
+    author={Wang, Xudong and Wu, Zhirong and Lian, Long and Yu, Stella X},
+    booktitle={CVPR},
+    year={2022}
+}
+
+@article{DST,
+    title={Debiased Self-Training for Semi-Supervised Learning},
+    author={Chen, Baixu and Jiang, Junguang and Wang, Ximei and Wang, Jianmin and Long, Mingsheng},
+    journal={arXiv preprint arXiv:2202.07136},
+    year={2022}
+}
```
