This repo collects materials (papers, code, slides) about SAM2 (Segment Anything in Images and Videos), a vision foundation model released by Meta AI. We are continuously improving the project. Pull requests adding works (papers, repos) that are missing are welcome.
- SAM2 [🔗 Code | 🖥️ Demo | 📖 Explanation]
- SAM [🔗 Code | 🖥️ Demo | 📖 Explanation]
- Surveys & Reviews
- Traditional Segmentation Tasks
- Medical Domain
- Camouflaged Object Detection (COD)
- Audio-Visual Segmentation (AVS)
- Remote Sensing
- Mesh or Point Cloud / 3D Processing
- Image or Video Generation & Editing
- Simultaneous Localization and Mapping (SLAM / VO)
- Light Field Segmentation
- Robotics
- Adaptation, Compression & Edge Applications
- Training
- Performance Evaluations
- Robustness
- Unique Applications/Usage

Release | Title | Code |
---|---|---|
2025.02 | Audio visual segmentation through text embeddings | NA |

Release | Title | Code |
---|---|---|
2025.03 | Universal Scene Graph Generation | 🌐Project page |

Release | Title | Code |
---|---|---|
2024.11 | Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting | 🔗 Code |

Release | Title | Code |
---|---|---|
2025.04 | Robust SAM: On the Adversarial Robustness of Vision Foundation Models | NA |