This document records papers on un-/weakly-/semi-supervised learning and their core ideas, with a focus on applications to object detection.
paper | pub | main idea |
---|---|---|
Leverage Your Local and Global Representations: A New Self-Supervised Learning Strategy. (LoGo) | CVPR 2022 | Replaces cosine similarity with an MLP as the similarity measure between local-local crops, which captures richer local features. Can be plugged into SimSiam, MoCo, etc. to improve their results |
Exploring simple siamese representation learning. (SimSiam) | CVPR 2021 | For a pair of augmented views of the same image, alternately applies stop-gradient to one of the two branches (a loss sketch follows the table) |
Momentum contrast for unsupervised visual representation learning. (MoCo) | CVPR 2020 | Maintains a queue storing representations from past mini-batches, decoupling the dictionary size from the batch size and yielding a large dictionary; the key encoder is updated smoothly with momentum. Positive pairs come from the current mini-batch, negative pairs from the queue (a sketch follows the table) |
Improved baselines with momentum contrastive learning. (MoCo v2) | 2020 | Adds an MLP projection head and stronger data augmentation (borrowed from SimCLR) on top of MoCo |
UniVIP: A Unified Framework for Self-Supervised Visual Pre-training | CVPR 2022 | |
Revisiting the Transferability of Supervised Pretraining: an MLP Perspective | CVPR 2022 | Adding an MLP after the encoder during pre-training alleviates overfitting of the encoder and preserves more intra-class variation, which improves downstream transfer |
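A minimal sketch of the SimSiam objective referenced in the table above, mirroring the paper's pseudocode; it assumes `p1, p2` are predictor outputs and `z1, z2` projector outputs for the two augmented views, with `detach()` playing the role of stop-gradient:

```python
import torch.nn.functional as F

def simsiam_loss(p1, z1, p2, z2):
    """Symmetrized negative cosine similarity with stop-gradient
    on whichever branch currently serves as the target."""
    def d(p, z):
        z = z.detach()  # stop-gradient: no gradient flows into the target branch
        return -F.cosine_similarity(p, z, dim=1).mean()
    # each view alternately serves as the (gradient-free) target
    return 0.5 * d(p1, z2) + 0.5 * d(p2, z1)
```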
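Likewise, a sketch of MoCo's two key mechanisms from the row above: the momentum (EMA) update of the key encoder, and the queue that decouples dictionary size from batch size. `encoder_q`/`encoder_k` and the `(dim, K)` queue layout are illustrative assumptions:

```python
import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    # smoothly move the key encoder toward the query encoder
    for pq, pk in zip(encoder_q.parameters(), encoder_k.parameters()):
        pk.data.mul_(m).add_(pq.data, alpha=1 - m)

@torch.no_grad()
def dequeue_and_enqueue(queue, ptr, keys):
    # overwrite the oldest keys with the current mini-batch;
    # assumes queue size K is divisible by the batch size
    n = keys.size(0)
    queue[:, ptr:ptr + n] = keys.t()
    return (ptr + n) % queue.size(1)
```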
In the un-/self-supervised representation learning field, methods generally involve some form of Siamese network. An undesired trivial solution for Siamese networks is all outputs “collapsing” to a constant. There have been several general strategies for preventing Siamese networks from collapsing (a simple collapse diagnostic is sketched after the list):
- Contrastive learning: add negative pairs (SimCLR, Deep InfoMax and its multi-scale version, CMC, MoCo, MoCo v2)
- Clustering: incorporate online clustering (SwAV, DeepCluster, SeLa)
- Non-contrastive learning: a stop-gradient operation on one branch (SimSiam); a momentum encoder (BYOL)
- Transformer-based: DINO (self-distillation with no labels)
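To make the “collapsing” above concrete: the SimSiam paper monitors the per-dimension standard deviation of the l2-normalized outputs as a diagnostic; a minimal helper (the function name is an assumption):

```python
import torch.nn.functional as F

def output_std(z):
    """Mean per-dimension std of l2-normalized embeddings z of shape (N, d).
    Near 0: outputs collapsed to a constant; near 1/sqrt(d): well-spread outputs."""
    return F.normalize(z, dim=1).std(dim=0).mean()
```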
paper | pub | main idea |
---|---|---|
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers. (UP-DETR) | CVPR 2021 | Pre-trains the transformers in DETR on unlabeled images via the random query patch detection pretext task (see below) |
End-to-end object detection with transformers. (DETR) * | ECCV 2020 | Casts detection as direct set prediction: a transformer encoder-decoder trained with a bipartite (Hungarian) matching loss, removing hand-crafted anchors and NMS |
Unsupervised embedding learning via invariant and spreading instance feature. (instance-based discrimination tasks) | CVPR 2019 | |
Unsupervised feature learning via non-parametric instance discrimination. (instance-based discrimination tasks) | CVPR 2018 | |
Deep clustering for unsupervised learning of visual features. (clustering-based tasks) | ECCV 2018 | |
Instance-based discrimination tasks and clustering-based tasks are two typical pretext tasks in recent studies. UP-DETR proposes a novel pretext task, random query patch detection, to pre-train the transformers of the DETR architecture for object detection; a sketch of its target construction follows.
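A minimal sketch (not the authors' code) of that target construction: patches are cropped at random from an unlabeled image, and their normalized boxes become the regression targets; in UP-DETR, the pooled backbone features of each patch are then added to the object queries so the decoder learns to localize each patch. The function name and crop-scale range are illustrative assumptions:

```python
import torch

def sample_query_patches(img, num_patches=10, scale=(0.1, 0.3)):
    """img: (C, H, W) tensor. Returns random crops and their (cx, cy, w, h)
    boxes normalized to [0, 1], which serve as pre-training targets."""
    _, H, W = img.shape
    patches, boxes = [], []
    for _ in range(num_patches):
        w = max(1, int(W * torch.empty(1).uniform_(*scale).item()))
        h = max(1, int(H * torch.empty(1).uniform_(*scale).item()))
        x0 = torch.randint(0, W - w + 1, (1,)).item()
        y0 = torch.randint(0, H - h + 1, (1,)).item()
        patches.append(img[:, y0:y0 + h, x0:x0 + w])
        boxes.append(torch.tensor([(x0 + w / 2) / W, (y0 + h / 2) / H, w / W, h / H]))
    return patches, torch.stack(boxes)
```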
An image x is randomly cropped and augmented into two views x1 and x2 (the transformation); the two views are regarded as similar and form a positive pair, while every other image in the dataset is regarded as dissimilar to x1 and x2 and supplies the negative pairs.
(The definition of positive/negative pairs is quite flexible; a minimal sketch of the resulting loss follows.)
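Assuming the mini-batch stands in for "all other images in the dataset", the objective can be written as a simplified, one-directional variant of the InfoNCE / NT-Xent loss used by SimCLR; the temperature value here is an assumption:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """z1, z2: (N, d) embeddings of the two augmented views of each image.
    Positive pair = same index in z1/z2; negatives = all other images in the batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                           # (N, N) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```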