- Transformer: Attention Is All You Need [Notes] NIPS 2017
- DETR: End-to-End Object Detection with Transformers [Notes] ECCV 2020 oral [FAIR]
- STSU: Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images [Notes] ICCV 2021 [BEV feat stitching, Luc Van Gool]
- DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries CoRL 2021 [BEVNet, transformers]
- Translating Images into Maps [BEVNet, transformers]
- PYVA: Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation [Notes] CVPR 2021 [Supplementary] [BEVNet]
- NEAT: Neural Attention Fields for End-to-End Autonomous Driving [Notes] ICCV 2021 [Supplementary] [BEVNet]