
Commit 42ef0f9

initial commit
1 parent 6d8b4ff commit 42ef0f9

60 files changed, with 8275 additions and 2 deletions.

.gitmodules

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+[submodule "submodules/Mask2Former"]
+	path = submodules/Mask2Former
+	url = https://github.com/facebookresearch/Mask2Former.git
+[submodule "submodules/gsplat"]
+	path = submodules/gsplat
+	url = https://github.com/nerfstudio-project/gsplat.git
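
The two submodule entries above are what `git clone --recursive` resolves. As a quick way to inspect them, here is a minimal sketch that reads the same text with Python's standard `configparser`. The `GITMODULES` string is copied from the diff with indentation dropped, and `list_submodules` is a hypothetical helper name, not part of this repository.

```python
import configparser

# Contents of .gitmodules as added in this commit (indentation removed).
GITMODULES = """\
[submodule "submodules/Mask2Former"]
path = submodules/Mask2Former
url = https://github.com/facebookresearch/Mask2Former.git
[submodule "submodules/gsplat"]
path = submodules/gsplat
url = https://github.com/nerfstudio-project/gsplat.git
"""


def list_submodules(text: str) -> dict:
    """Map each submodule path to its remote URL."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {parser[s]["path"]: parser[s]["url"] for s in parser.sections()}


if __name__ == "__main__":
    for path, url in list_submodules(GITMODULES).items():
        print(f"{path} <- {url}")
```

This only illustrates the file layout; Git itself remains the authority on how the file is interpreted.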

README.md

Lines changed: 171 additions & 2 deletions
@@ -1,2 +1,171 @@
-# STDLoc
-[CVPR2025] From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
+<br>
+<p align="center">
+<img src="assets/logo.jpg" style="height:70px"></img>
+<h1 align="center"><strong>From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting</strong></h1>
+<p align="center">
+<a href='' target='_blank'>Zhiwei Huang<sup>1,2</sup><sup>*</sup></a>&emsp;
+<a href='' target='_blank'>Hailin Yu<sup>2</sup><sup>*</sup><sup>&dagger;</sup></a>&emsp;
+<a href='' target='_blank'>Yichun Shentu<sup>2</sup></a>&emsp;
+<a href='' target='_blank'>Jin Yuan<sup>2</sup></a>&emsp;
+<a href='' target='_blank'>Guofeng Zhang<sup>1,2</sup><sup>&dagger;</sup></a>&emsp;
+<br>
+<sup>1</sup>State Key Lab of CAD&CG, Zhejiang University&emsp;<sup>2</sup>SenseTime Research
+<br>
+<sup>*</sup> Equal Contribution
+<sup>&dagger;</sup> Corresponding Authors
+<br>
+<strong style="font-size: 20px; color:rgb(219, 39, 119);"> CVPR2025 </strong>
+</p>
+</p>
+
+<p align="center">
+<a href="" target='_blank'>
+<img src="https://img.shields.io/badge/arXiv-None-blue?">
+</a>
+<a href="" target='_blank'>
+<img src="https://img.shields.io/badge/Paper-📖-blue?">
+</a>
+<a href="https://zju3dv.github.io/STDLoc/" target='_blank'>
+<img src="https://img.shields.io/badge/Project-&#x1F680-blue">
+</a>
+<a href="" target='_blank'>
+<img src="https://visitor-badge.laobi.icu/badge?page_id=zju3dv.STDLoc">
+</a>
+</p>
+
+## 🏠 About
+<div style="text-align: center;">
+<img src="assets/STDLoc_pipeline.png" alt="STDLoc pipeline" width=100% >
+</div>
+This paper presents a novel camera relocalization method, <b>STDLoc</b>, which uses Feature Gaussian Splatting (Feature GS) as its scene representation. STDLoc is a full relocalization pipeline that achieves accurate relocalization without relying on any pose prior. Unlike previous coarse-to-fine localization methods, which require image retrieval followed by feature matching, we propose a novel sparse-to-dense localization paradigm. Based on this scene representation, we introduce a novel matching-oriented Gaussian sampling strategy and a scene-specific detector to achieve efficient and robust initial pose estimation. Starting from the initial localization result, we then align the query feature map to the Gaussian feature field by dense feature matching to obtain an accurate pose. Experiments on indoor and outdoor datasets show that <b>STDLoc outperforms current state-of-the-art localization methods in terms of localization accuracy and recall</b>.
+
+
+<!-- contents with emoji -->
+<!-- ## 📋 Contents
+- [🔍 Overview](#-overview)
+- [📦 Training and Evaluation](#-training-and-evaluation)
+- [🔗 Citation](#-citation)
+- [👏 Acknowledgements](#-acknowledgements) -->
+
+## 🔍 Performance
+
+
+The code in this repository performs better than the results reported in our paper, thanks to a few small fixes:
+1. Set ```align_corners=False``` in interpolation.
+2. Use a smaller learning rate for the outdoor dataset.
+3. Use the anti-aliasing feature of gsplat.
+
+#### 7-Scenes
+| Method | Chess | Fire | Heads | Office | Pumpkin | Redkitchen | Stairs | Avg.↓[cm/°] |
+|---|---|---|---|---|---|---|---|---|
+| STDLoc (paper) | 0.46/0.15 | 0.57/0.24 | 0.45/0.26 | 0.86/0.24 | 0.93/0.21 | 0.63/0.19 | 1.42/0.41 | 0.76/0.24 |
+| STDLoc (repo) | 0.42/0.13 | 0.49/0.2 | 0.41/0.26 | 0.74/0.21 | 0.89/0.23 | 0.57/0.14 | 1.18/0.35 | 0.67/0.22 |
+
+
+
+#### Cambridge Landmarks
+| Method | Court | King’s | Hospital | Shop | St. Mary’s | Avg.↓[cm/°] |
+|---|---|---|---|---|---|---|
+| STDLoc (paper) | 15.7/0.06 | 15.0/0.17 | 11.9/0.21 | 3.0/0.13 | 4.7/0.14 | 10.1/0.14 |
+| STDLoc (repo) | 11.3/0.05 | 15.0/0.15 | 11.3/0.21 | 2.5/0.12 | 3.6/0.12 | 8.7/0.13 |
+
+## 📦 Training and Evaluation
+### Environment Setup
+
+1. Clone this repository.
+```bash
+git clone --recursive https://github.com/zju3dv/STDLoc.git
+```
+2. Install the packages.
+```bash
+conda create -n stdloc python=3.8 -y && conda activate stdloc
+pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
+pip install -r requirements.txt
+# install gsplat
+cd submodules/gsplat
+pip install -e .
+cd ../..
+```
+
+### Data Preparation
+We use two public datasets:
+- [Microsoft 7-Scenes](https://www.microsoft.com/en-us/research/project/rgb-d-dataset-7-scenes/)
+- [Cambridge Landmarks](https://www.repository.cam.ac.uk/handle/1810/251342/)
+
+#### 7-Scenes Dataset
+1. Download the images, following HLoc.
+```bash
+export dataset=datasets/7scenes
+for scene in chess fire heads office pumpkin redkitchen stairs; \
+do wget http://download.microsoft.com/download/2/8/5/28564B23-0828-408F-8631-23B1EFF1DAC8/$scene.zip -P $dataset \
+&& unzip $dataset/$scene.zip -d $dataset && unzip $dataset/$scene/'*.zip' -d $dataset/$scene; done
+```
+
+2. Download the full reconstructions
+from [visloc_pseudo_gt_limitations](https://github.com/tsattler/visloc_pseudo_gt_limitations/tree/main?tab=readme-ov-file#full-reconstructions):
+```bash
+pip install gdown
+gdown 1ATijcGCgK84NKB4Mho4_T-P7x8LSL80m -O $dataset/7scenes_reference_models.zip
+unzip $dataset/7scenes_reference_models.zip -d $dataset
+# move sfm_gt to each dataset
+for scene in chess fire heads office pumpkin redkitchen stairs; \
+do mkdir -p $dataset/$scene/sparse && cp -r $dataset/7scenes_reference_models/$scene/sfm_gt $dataset/$scene/sparse/0 ; done
+```
+
+<!-- 3. Generate test files -->
+
+#### Cambridge Landmarks Dataset
+1. Download the dataset from the PoseNet project page:
+```bash
+export dataset=datasets/cambridge
+export scenes=( "KingsCollege" "OldHospital" "StMarysChurch" "ShopFacade" "GreatCourt" )
+export IDs=( "251342" "251340" "251294" "251336" "251291" )
+for i in "${!scenes[@]}"; do
+wget https://www.repository.cam.ac.uk/bitstream/handle/1810/${IDs[i]}/${scenes[i]}.zip -P $dataset \
+&& unzip $dataset/${scenes[i]}.zip -d $dataset ; done
+```
+
+
+2. Install Mask2Former to mask dynamic objects and sky:
+```bash
+cd submodules/Mask2Former
+pip install -r requirements.txt
+wget https://dl.fbaipublicfiles.com/maskformer/mask2former/coco/panoptic/maskformer2_swin_large_IN21k_384_bs16_100ep/model_final_f07440.pkl
+cd ../..
+```
+
+3. Preprocess data:
+```bash
+bash scripts/dataset_preprocess.sh
+```
+
+
+### Training Feature Gaussian
+For 7-Scenes:
+```bash
+bash scripts/train_7scenes.sh
+```
+For Cambridge Landmarks:
+```bash
+bash scripts/train_cambridge.sh
+```
+### Evaluation
+For 7-Scenes:
+```bash
+bash scripts/evaluate_7scenes.sh
+```
+For Cambridge Landmarks:
+```bash
+bash scripts/evaluate_cambridge.sh
+```
+
+## 🔗 Citation
+
+```bibtex
+
+```
+
+
+## 👏 Acknowledgements
+- [Feature 3DGS](https://github.com/ShijieZhou-UCLA/feature-3dgs): Our codebase is built upon Feature 3DGS.
+- [gsplat](https://github.com/nerfstudio-project/gsplat): We use gsplat as our rasterization backend.
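
The "Avg." columns in the README's Performance tables can be re-derived from the per-scene entries. The sketch below does this in plain Python, assuming "Avg." is the arithmetic mean of the per-scene values, rounded to the precision shown in each table (the README does not state this explicitly). The `average` helper and the dictionary names are hypothetical.

```python
from statistics import mean

# Per-scene (translation cm, rotation deg) pairs copied from the README tables.
SEVEN_SCENES = {
    "STDLoc (paper)": [(0.46, 0.15), (0.57, 0.24), (0.45, 0.26), (0.86, 0.24),
                       (0.93, 0.21), (0.63, 0.19), (1.42, 0.41)],
    "STDLoc (repo)": [(0.42, 0.13), (0.49, 0.2), (0.41, 0.26), (0.74, 0.21),
                      (0.89, 0.23), (0.57, 0.14), (1.18, 0.35)],
}
CAMBRIDGE = {
    "STDLoc (paper)": [(15.7, 0.06), (15.0, 0.17), (11.9, 0.21), (3.0, 0.13), (4.7, 0.14)],
    "STDLoc (repo)": [(11.3, 0.05), (15.0, 0.15), (11.3, 0.21), (2.5, 0.12), (3.6, 0.12)],
}


def average(pairs, trans_digits, rot_digits=2):
    """Return the rounded mean translation and rotation error over all scenes."""
    trans = round(mean(p[0] for p in pairs), trans_digits)
    rot = round(mean(p[1] for p in pairs), rot_digits)
    return trans, rot


if __name__ == "__main__":
    for name, pairs in SEVEN_SCENES.items():
        print("7-Scenes", name, average(pairs, trans_digits=2))
    for name, pairs in CAMBRIDGE.items():
        print("Cambridge", name, average(pairs, trans_digits=1))
```

Under that assumption, the recomputed means match the "Avg." entries listed for both the paper and the repository rows.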

arguments/__init__.py

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
+#
+# Copyright (C) 2023, Inria
+# GRAPHDECO research group, https://team.inria.fr/graphdeco
+# All rights reserved.
+#
+# This software is free for non-commercial, research and evaluation use
+# under the terms of the LICENSE.md file.
+#
+# For inquiries contact [email protected]
+#
+
+from argparse import ArgumentParser, Namespace
+import sys
+import os
+
+class GroupParams:
+    pass
+
+class ParamGroup:
+    def __init__(self, parser: ArgumentParser, name : str, fill_none = False):
+        group = parser.add_argument_group(name)
+        for key, value in vars(self).items():
+            shorthand = False
+            if key.startswith("_"):
+                shorthand = True
+                key = key[1:]
+            t = type(value)
+            value = value if not fill_none else None
+            if shorthand:
+                if t == bool:
+                    group.add_argument("--" + key, ("-" + key[0:1]), default=value, action="store_true")
+                else:
+                    group.add_argument("--" + key, ("-" + key[0:1]), default=value, type=t)
+            else:
+                if t == bool:
+                    group.add_argument("--" + key, default=value, action="store_true")
+                else:
+                    group.add_argument("--" + key, default=value, type=t)
+
+    def extract(self, args):
+        group = GroupParams()
+        for arg in vars(args).items():
+            if arg[0] in vars(self) or ("_" + arg[0]) in vars(self):
+                setattr(group, arg[0], arg[1])
+        return group
+
+class ModelParams(ParamGroup):
+    def __init__(self, parser, sentinel=False):
+        self.sh_degree = 3
+        self._source_path = ""
+        self._feature_type = ""
+        self._gaussian_type = "3dgs"
+        self._model_path = ""
+        self._images = "images"
+        self._resolution = -1
+        self._white_background = True
+        self.longest_edge = 640
+        self.data_device = "cuda"
+        self.eval = False
+        self.speedup = False ###
+        self.norm_before_render = True
+        self.render_items = ['RGB', 'Depth', 'Edge', 'Normal', 'Curvature', 'Feature Map']
+        super().__init__(parser, "Loading Parameters", sentinel)
+
+    def extract(self, args):
+        g = super().extract(args)
+        g.source_path = os.path.abspath(g.source_path)
+        return g
+
+class PipelineParams(ParamGroup):
+    def __init__(self, parser):
+        self.convert_SHs_python = False
+        self.compute_cov3D_python = False
+        self.debug = True
+        super().__init__(parser, "Pipeline Parameters")
+
+class OptimizationParams(ParamGroup):
+    def __init__(self, parser):
+        self.iterations = 30_000
+        self.position_lr_init = 0.00016
+        self.position_lr_final = 0.0000016
+        self.position_lr_delay_mult = 0.01
+        self.position_lr_max_steps = 30_000
+        self.feature_lr = 0.0025
+        self.opacity_lr = 0.05
+        self.scaling_lr = 0.005
+        self.rotation_lr = 0.001
+        #################################################
+        self.loc_feature_lr = 0.001
+        #################################################
+        self.percent_dense = 0.01
+        self.lambda_dssim = 0.2
+        self.densification_interval = 100
+        self.opacity_reset_interval = 3000 ### TRY reset to 100000 but worse
+        self.densify_from_iter = 500
+        self.densify_until_iter = 15_000 #6000 ### compare with 2-stage
+        self.densify_grad_threshold = 0.0002
+        super().__init__(parser, "Optimization Parameters")
+
+class OptimizationParams_2dgs(ParamGroup):
+    def __init__(self, parser):
+        self.iterations = 30_000
+        self.position_lr_init = 0.00016
+        self.position_lr_final = 0.0000016
+        self.position_lr_delay_mult = 0.01
+        self.position_lr_max_steps = 30_000
+        self.feature_lr = 0.0025
+        self.opacity_lr = 0.05
+        self.scaling_lr = 0.005
+        self.rotation_lr = 0.001
+        #################################################
+        self.loc_feature_lr = 0.001
+        #################################################
+        self.percent_dense = 0.01
+        self.lambda_dssim = 0.2
+        self.lambda_dist = 0.0
+        self.lambda_normal = 0.05
+        self.opacity_cull = 0.05
+        self.densification_interval = 100
+        self.opacity_reset_interval = 3000 ### TRY reset to 100000 but worse
+        self.densify_from_iter = 500
+        self.densify_until_iter = 15_000 #6000 ### compare with 2-stage
+        self.densify_grad_threshold = 0.0002
+        super().__init__(parser, "Optimization Parameters")
+
+def get_combined_args(parser : ArgumentParser):
+    cmdlne_string = sys.argv[1:]
+    cfgfile_string = "Namespace()"
+    args_cmdline = parser.parse_args(cmdlne_string)
+
+    try:
+        cfgfilepath = os.path.join(args_cmdline.model_path, "cfg_args")
+        print("Looking for config file in", cfgfilepath)
+        with open(cfgfilepath) as cfg_file:
+            print("Config file found: {}".format(cfgfilepath))
+            cfgfile_string = cfg_file.read()
+    except TypeError:
+        print("Config file not found at")
+        pass
+    args_cfgfile = eval(cfgfile_string)
+
+    merged_dict = vars(args_cfgfile).copy()
+    for k,v in vars(args_cmdline).items():
+        if v is not None:
+            merged_dict[k] = v
+    return Namespace(**merged_dict)
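
To make the conventions in `arguments/__init__.py` easier to follow, here is a minimal, self-contained sketch of the two ideas it relies on: attributes become `--flags` (with a one-letter alias when the attribute name starts with an underscore), and command-line values override stored values unless they are `None`. `DemoParams` and `merge_args` are hypothetical names used only for illustration. Unlike `get_combined_args`, the sketch merges two `Namespace` objects directly and does not read a `cfg_args` file or call `eval`.

```python
from argparse import ArgumentParser, Namespace


class DemoParams:
    """Illustrative stand-in for a ParamGroup subclass (not from the repo)."""

    def __init__(self, parser: ArgumentParser):
        self._model_path = ""      # leading underscore: also gets the alias -m
        self.iterations = 30_000   # plain attribute: only --iterations
        self.eval = False          # bool: becomes a store_true switch
        group = parser.add_argument_group("Demo Parameters")
        for key, value in vars(self).items():
            flags = []
            if key.startswith("_"):
                key = key[1:]
                flags.append("-" + key[0])
            flags.insert(0, "--" + key)
            if isinstance(value, bool):
                group.add_argument(*flags, default=value, action="store_true")
            else:
                group.add_argument(*flags, default=value, type=type(value))


def merge_args(stored: Namespace, cmdline: Namespace) -> Namespace:
    """Command-line values win over stored ones, except when they are None."""
    merged = vars(stored).copy()
    for key, value in vars(cmdline).items():
        if value is not None:
            merged[key] = value
    return Namespace(**merged)


if __name__ == "__main__":
    parser = ArgumentParser()
    DemoParams(parser)
    print(parser.parse_args(["-m", "output/chess", "--eval"]))
    print(merge_args(Namespace(model_path="old", sh_degree=3),
                     Namespace(model_path="new", iterations=None)))
```

One consequence of this pattern worth noting: a boolean attribute that defaults to `True` is still registered with `store_true`, so passing the flag cannot switch it off.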

assets/STDLoc_pipeline.png

859 KB

assets/logo.jpg

143 KB

configs/stdloc_7scenes.yaml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+sparse:
+  nms: 4
+  detect_num: 4096
+  mnn_match: False # default False, use topk match
+  dual_softmax: False
+  dual_softmax_temp: 0.1
+  topk: 1
+  threshold: 0
+  solver: poselib
+  confidence: 0.99999
+  reprojection_error: 12.0
+  max_iterations: 100000
+  min_iterations: 1000
+  detector_path: detector/30000_detector.pth
+  landmark_path: detector/sampled_idx.pkl
+
+dense:
+  iters: 4
+  coarse_dual_softmax_temp: 0.1
+  fine_dual_softmax_temp: 0.1
+  coarse_threshold: 0
+  fine_threshold: 0
+  solver: poselib
+  confidence: 0.99999
+  reprojection_error: 8.0
+  max_iterations: 1000
+  min_iterations: 100
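
The config above exposes `dual_softmax`, `dual_softmax_temp`, `mnn_match`, and `threshold`. As a rough illustration of what settings with these names usually control, the sketch below implements generic dual-softmax scoring followed by mutual-nearest-neighbour filtering in plain Python. It is not code from this repository: how STDLoc actually applies these options is an assumption, and the function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dual_softmax(sim, temperature=0.1):
    """Multiply row-wise and column-wise softmax of sim / temperature."""
    scaled = [[s / temperature for s in row] for row in sim]
    row_sm = [softmax(row) for row in scaled]
    col_sm = [softmax(list(col)) for col in zip(*scaled)]  # indexed [col][row]
    return [[row_sm[i][j] * col_sm[j][i] for j in range(len(sim[0]))]
            for i in range(len(sim))]


def mutual_nearest(conf, threshold=0.0):
    """Keep (i, j) only if each is the other's best match and conf > threshold."""
    matches = []
    for i, row in enumerate(conf):
        j = max(range(len(row)), key=row.__getitem__)
        best_i = max(range(len(conf)), key=lambda r: conf[r][j])
        if best_i == i and row[j] > threshold:
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    # Toy similarity matrix: 2 query descriptors vs. 3 scene descriptors.
    similarity = [[0.9, 0.1, 0.0],
                  [0.2, 0.8, 0.1]]
    confidence = dual_softmax(similarity, temperature=0.1)
    print(mutual_nearest(confidence))
```

A lower temperature sharpens both softmax distributions, so ambiguous matches receive lower confidence relative to clear ones.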

configs/stdloc_cambridge.yaml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+sparse:
+  nms: 4
+  detect_num: 2048
+  mnn_match: False # default False, use topk match
+  dual_softmax: False
+  dual_softmax_temp: 0.1
+  topk: 1
+  threshold: 0
+  solver: poselib
+  confidence: 0.99999
+  reprojection_error: 12.0
+  max_iterations: 100000
+  min_iterations: 1000
+  detector_path: detector/30000_detector.pth
+  landmark_path: detector/sampled_idx.pkl
+
+dense:
+  iters: 1
+  coarse_dual_softmax_temp: 0.1
+  fine_dual_softmax_temp: 0.1
+  coarse_threshold: 0
+  fine_threshold: 0
+  solver: poselib
+  confidence: 0.99999
+  reprojection_error: 12.0
+  max_iterations: 1000
+  min_iterations: 100
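
The two YAML files are nearly identical, so it can help to see exactly where they differ. The sketch below transcribes both configs as Python dictionaries and compares them key by key. The values are copied from the diffs above; `diff_configs` is a hypothetical helper, and the dictionaries are hand-written here rather than loaded with a YAML parser.

```python
CFG_7SCENES = {
    "sparse": {
        "nms": 4, "detect_num": 4096, "mnn_match": False, "dual_softmax": False,
        "dual_softmax_temp": 0.1, "topk": 1, "threshold": 0, "solver": "poselib",
        "confidence": 0.99999, "reprojection_error": 12.0,
        "max_iterations": 100000, "min_iterations": 1000,
        "detector_path": "detector/30000_detector.pth",
        "landmark_path": "detector/sampled_idx.pkl",
    },
    "dense": {
        "iters": 4, "coarse_dual_softmax_temp": 0.1, "fine_dual_softmax_temp": 0.1,
        "coarse_threshold": 0, "fine_threshold": 0, "solver": "poselib",
        "confidence": 0.99999, "reprojection_error": 8.0,
        "max_iterations": 1000, "min_iterations": 100,
    },
}

CFG_CAMBRIDGE = {
    "sparse": {
        "nms": 4, "detect_num": 2048, "mnn_match": False, "dual_softmax": False,
        "dual_softmax_temp": 0.1, "topk": 1, "threshold": 0, "solver": "poselib",
        "confidence": 0.99999, "reprojection_error": 12.0,
        "max_iterations": 100000, "min_iterations": 1000,
        "detector_path": "detector/30000_detector.pth",
        "landmark_path": "detector/sampled_idx.pkl",
    },
    "dense": {
        "iters": 1, "coarse_dual_softmax_temp": 0.1, "fine_dual_softmax_temp": 0.1,
        "coarse_threshold": 0, "fine_threshold": 0, "solver": "poselib",
        "confidence": 0.99999, "reprojection_error": 12.0,
        "max_iterations": 1000, "min_iterations": 100,
    },
}


def diff_configs(a, b, prefix=""):
    """Return {dotted.key: (value_in_a, value_in_b)} for every differing leaf."""
    changes = {}
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key), b.get(key)
        path = prefix + key
        if isinstance(left, dict) and isinstance(right, dict):
            changes.update(diff_configs(left, right, path + "."))
        elif left != right:
            changes[path] = (left, right)
    return changes


if __name__ == "__main__":
    for path, (indoor, outdoor) in diff_configs(CFG_7SCENES, CFG_CAMBRIDGE).items():
        print(f"{path}: 7-Scenes={indoor}, Cambridge={outdoor}")
```

The comparison shows three differing settings: the number of detected keypoints in the sparse stage, and the iteration count and reprojection error in the dense stage.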
