@@ -34,49 +34,102 @@ Supported methods include:
34
34
- [ Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks (Pseudo Label, ICML 2013)] ( http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.664.3543&rep=rep1&type=pdf )
35
35
- [ Temporal Ensembling for Semi-Supervised Learning (Pi Model, ICLR 2017)] ( https://arxiv.org/abs/1610.02242 )
36
36
- [ Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results (Mean Teacher, NIPS 2017)] ( https://arxiv.org/abs/1703.01780 )
37
+ - [ Self-Training With Noisy Student Improves ImageNet Classification (Noisy Student, CVPR 2020)] ( https://openaccess.thecvf.com/content_CVPR_2020/papers/Xie_Self-Training_With_Noisy_Student_Improves_ImageNet_Classification_CVPR_2020_paper.pdf )
37
38
- [ Unsupervised Data Augmentation for Consistency Training (UDA, NIPS 2020)] ( https://arxiv.org/pdf/1904.12848v4.pdf )
38
39
- [ FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence (FixMatch, NIPS 2020)] ( https://arxiv.org/abs/2001.07685 )
39
- - [ Self-Tuning for Data-Efficient Deep Learning (self-tuning, ICML 2021)] ( http://ise.thss.tsinghua.edu.cn/~mlong/doc/Self-Tuning-for-Data-Efficient-Deep-Learning-icml21.pdf )
40
+ - [ Self-Tuning for Data-Efficient Deep Learning (Self-Tuning, ICML 2021)] ( http://ise.thss.tsinghua.edu.cn/~mlong/doc/Self-Tuning-for-Data-Efficient-Deep-Learning-icml21.pdf )
41
+ - [ FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling (FlexMatch, NIPS 2021)] ( https://arxiv.org/abs/2110.08263 )
42
+ - [ Debiased Learning From Naturally Imbalanced Pseudo-Labels (DebiasMatch, CVPR 2022)] ( https://openaccess.thecvf.com/content/CVPR2022/papers/Wang_Debiased_Learning_From_Naturally_Imbalanced_Pseudo-Labels_CVPR_2022_paper.pdf )
43
+ - [ Debiased Self-Training for Semi-Supervised Learning (DST)] ( https://arxiv.org/abs/2202.07136 )
40
44
41
45
## Usage
42
46
43
47
### Semi-supervised learning with supervised pre-trained model
44
- The shell files give the script to train with supervised pre-trained model with specified hyper-parameters.
45
- For example, if you want to train UDA on CIFAR100, use the following script
48
+
49
+ The shell files give the script to train with supervised pre-trained model with specified hyper-parameters. For example,
50
+ if you want to train UDA on CIFAR100, use the following script
46
51
47
52
``` shell script
48
53
# Semi-supervised learning on CIFAR100 (ResNet50, 400labels).
49
54
# Assume you have put the datasets under the path `data/cifar100`,
50
55
# or you are glad to download the datasets automatically from the Internet to this path
51
56
CUDA_VISIBLE_DEVICES=0 python uda.py data/cifar100 -d CIFAR100 --train-resizing ' cifar' --val-resizing ' cifar' \
52
- --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 --finetune --lr 0.01 \
53
- -a resnet50 --seed 0 --log logs/uda/cifar100_4_labels_per_class
57
+ --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 -a resnet50 \
58
+ --lr 0.003 --finetune --threshold 0.7 -- seed 0 --log logs/uda/cifar100_4_labels_per_class
54
59
```
55
- Following common practice in semi-supervised learning, we select a class-balanced subset as the labeled dataset and
56
- treat other samples as unlabeled data. In the above command, ` num-samples-per-class ` specifies how many
57
- labeled samples for each class. Note that the labeled subset is ** deterministic with the same random seed** . Hence, if
58
- you want to compare different algorithms with the same labeled subset, you can simply pass in the same random seed.
60
+
61
+ Following common practice in semi-supervised learning, we select a class-balanced subset as the labeled dataset and
62
+ treat other samples as unlabeled data. In the above command, ` num-samples-per-class ` specifies how many labeled samples
63
+ for each class. Note that the labeled subset is ** deterministic with the same random seed** . Hence, if you want to
64
+ compare different algorithms with the same labeled subset, you can simply pass in the same random seed.
59
65
60
66
### Semi-supervised learning with unsupervised pre-trained model
67
+
61
68
Take MoCo as an example.
62
69
63
70
1 . Download MoCo pretrained checkpoints from https://github.com/facebookresearch/moco
64
- 2 . Convert the format of the MoCo checkpoints to the standard format of pytorch
71
+ 2 . Convert the format of the MoCo checkpoints to the standard format of pytorch
72
+
65
73
``` shell
66
74
mkdir checkpoints
67
75
python convert_moco_to_pretrained.py checkpoints/moco_v2_800ep_pretrain.pth.tar checkpoints/moco_v2_800ep_backbone.pth checkpoints/moco_v2_800ep_fc.pth
68
76
```
77
+
69
78
3 . Start training
79
+
70
80
``` shell
71
81
CUDA_VISIBLE_DEVICES=0 python erm.py data/cifar100 -d CIFAR100 --train-resizing ' cifar' --val-resizing ' cifar' \
72
- --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 --finetune \
73
- -a resnet50 --seed 0 --log logs/erm_moco_pretrain/cifar100_4_labels_per_class --lr-scheduler cos -i 2000 \
74
- --pretrained-backbone checkpoints/moco_v2_800ep_backbone.pth
82
+ --norm-mean 0.5071 0.4867 0.4408 --norm-std 0.2675 0.2565 0.2761 --num-samples-per-class 4 -a resnet50 \
83
+ --pretrained-backbone checkpoints/moco_v2_800ep_backbone.pth \
84
+ --lr 0.001 --finetune --lr-scheduler cos --seed 0 --log logs/erm_moco_pretrain/cifar100_4_labels_per_class
75
85
```
76
86
77
87
## Experiment and Results
78
88
79
- We are running experiments and will release the results soon.
89
+ ** Notations**
90
+
91
+ - `` Avg `` is the accuracy reported by ` TLlib ` .
92
+ - `` ERM `` refers to the model trained with only labeled data.
93
+ - `` Oracle `` refers to the model trained using all data as labeled data.
94
+
95
+ Below are the results of implemented methods. Other than _ Oracle_ , we randomly sample 4 labels per category.
96
+
97
+ ### ImageNet Supervised Pre-training (ResNet-50)
98
+
99
+ | Methods | Food101 | CIFAR10 | CIFAR100 | CUB200 | Aircraft | Cars | SUN397 | DTD | Pets | Flowers | Caltech | Avg |
100
+ | --------------| ---------| ---------| ----------| --------| ----------| ------| --------| ------| ------| ---------| ---------| ------|
101
+ | ERM | 33.6 | 59.4 | 47.9 | 48.6 | 29.0 | 37.1 | 40.9 | 50.5 | 82.2 | 87.6 | 82.2 | 54.5 |
102
+ | Pseudo Label | 36.9 | 62.8 | 52.5 | 54.9 | 30.4 | 40.4 | 41.7 | 54.1 | 89.6 | 93.5 | 85.1 | 58.4 |
103
+ | Pi Model | 34.2 | 66.9 | 48.5 | 47.9 | 26.7 | 37.4 | 40.9 | 51.9 | 83.5 | 92.0 | 82.2 | 55.6 |
104
+ | Mean Teacher | 40.4 | 78.1 | 58.5 | 52.8 | 32.0 | 45.6 | 40.2 | 53.8 | 86.8 | 92.8 | 83.7 | 60.4 |
105
+ | UDA | 41.9 | 73.0 | 59.8 | 55.4 | 33.5 | 42.7 | 42.1 | 49.7 | 88.0 | 93.4 | 85.3 | 60.4 |
106
+ | FixMatch | 36.2 | 74.5 | 58.0 | 52.6 | 27.1 | 44.8 | 40.8 | 50.2 | 87.8 | 93.6 | 83.2 | 59.0 |
107
+ | Self Tuning | 41.4 | 70.9 | 57.2 | 60.5 | 37.0 | 59.8 | 43.5 | 51.7 | 88.4 | 93.5 | 89.1 | 63.0 |
108
+ | FlexMatch | 48.1 | 94.2 | 69.2 | 65.1 | 38.0 | 55.3 | 50.2 | 55.6 | 91.5 | 94.6 | 89.4 | 68.3 |
109
+ | DebiasMatch | 57.1 | 92.4 | 69.0 | 66.2 | 41.5 | 65.4 | 48.3 | 54.2 | 90.2 | 95.4 | 89.3 | 69.9 |
110
+ | DST | 58.1 | 93.5 | 67.8 | 68.6 | 44.9 | 68.6 | 47.0 | 56.3 | 91.5 | 95.1 | 90.3 | 71.1 |
111
+ | Oracle | 85.5 | 97.5 | 86.3 | 81.1 | 85.1 | 91.1 | 64.1 | 68.8 | 93.2 | 98.1 | 92.6 | 85.8 |
112
+
113
+ ### ImageNet Unsupervised Pre-training (ResNet-50, MoCo v2)
114
+
115
+ | Methods | Food101 | CIFAR10 | CIFAR100 | CUB200 | Aircraft | Cars | SUN397 | DTD | Pets | Flowers | Caltech | Avg |
116
+ | --------------| ---------| ---------| ----------| --------| ----------| ------| --------| ------| ------| ---------| ---------| ------|
117
+ | ERM | 33.5 | 63.0 | 50.8 | 39.4 | 28.1 | 40.3 | 40.7 | 53.7 | 65.4 | 87.5 | 82.8 | 53.2 |
118
+ | Pseudo Label | 33.6 | 71.9 | 53.8 | 42.7 | 30.9 | 51.2 | 41.2 | 55.2 | 69.3 | 94.2 | 86.2 | 57.3 |
119
+ | Pi Model | 32.7 | 77.9 | 50.9 | 33.6 | 27.2 | 34.4 | 41.1 | 54.9 | 66.7 | 91.4 | 84.1 | 54.1 |
120
+ | Mean Teacher | 36.8 | 79.0 | 56.7 | 43.0 | 33.0 | 53.9 | 39.5 | 54.5 | 67.8 | 92.7 | 83.3 | 58.2 |
121
+ | UDA | 39.5 | 91.3 | 60.0 | 41.9 | 36.2 | 39.7 | 41.7 | 51.5 | 71.0 | 93.7 | 86.5 | 59.4 |
122
+ | FixMatch | 44.3 | 86.1 | 58.0 | 42.7 | 38.0 | 55.4 | 42.4 | 53.1 | 67.9 | 95.2 | 83.4 | 60.6 |
123
+ | Self Tuning | 34.0 | 63.6 | 51.7 | 43.3 | 32.2 | 50.2 | 40.7 | 52.7 | 68.2 | 91.8 | 87.7 | 56.0 |
124
+ | FlexMatch | 50.2 | 96.6 | 69.2 | 49.4 | 41.3 | 62.5 | 47.2 | 54.5 | 72.4 | 94.8 | 89.4 | 66.1 |
125
+ | DebiasMatch | 54.2 | 95.5 | 68.1 | 49.1 | 40.9 | 73.0 | 47.6 | 54.4 | 76.6 | 95.5 | 88.7 | 67.6 |
126
+ | DST | 57.1 | 95.0 | 68.2 | 53.6 | 47.7 | 72.0 | 46.8 | 56.0 | 76.3 | 95.6 | 90.1 | 68.9 |
127
+ | Oracle | 87.0 | 98.2 | 87.9 | 80.6 | 88.7 | 92.7 | 63.9 | 73.8 | 90.6 | 97.8 | 93.1 | 86.8 |
128
+
129
+ ## TODO
130
+
131
+ 1 . support multi-gpu training
132
+ 2 . add training from scratch code and results
80
133
81
134
## Citation
82
135
@@ -104,6 +157,13 @@ If you use these methods in your research, please consider citing.
104
157
year={2017}
105
158
}
106
159
160
+ @inproceedings{noisy_student,
161
+ title={Self-training with noisy student improves imagenet classification},
162
+ author={Xie, Qizhe and Luong, Minh-Thang and Hovy, Eduard and Le, Quoc V},
163
+ booktitle={CVPR},
164
+ year={2020}
165
+ }
166
+
107
167
@inproceedings{UDA,
108
168
title={Unsupervised data augmentation for consistency training},
109
169
author={Xie, Qizhe and Dai, Zihang and Hovy, Eduard and Luong, Thang and Le, Quoc},
@@ -124,4 +184,25 @@ If you use these methods in your research, please consider citing.
124
184
booktitle={ICML},
125
185
year={2021}
126
186
}
187
+
188
+ @inproceedings{FlexMatch,
189
+ title={Flexmatch: Boosting semi-supervised learning with curriculum pseudo labeling},
190
+ author={Zhang, Bowen and Wang, Yidong and Hou, Wenxin and Wu, Hao and Wang, Jindong and Okumura, Manabu and Shinozaki, Takahiro},
191
+ booktitle={NeurIPS},
192
+ year={2021}
193
+ }
194
+
195
+ @inproceedings{DebiasMatch,
196
+ title={Debiased Learning from Naturally Imbalanced Pseudo-Labels},
197
+ author={Wang, Xudong and Wu, Zhirong and Lian, Long and Yu, Stella X},
198
+ booktitle={CVPR},
199
+ year={2022}
200
+ }
201
+
202
+ @article{DST,
203
+ title={Debiased Self-Training for Semi-Supervised Learning},
204
+ author={Chen, Baixu and Jiang, Junguang and Wang, Ximei and Wang, Jianmin and Long, Mingsheng},
205
+ journal={arXiv preprint arXiv:2202.07136},
206
+ year={2022}
207
+ }
127
208
```
0 commit comments