Description
Hello, thank you for this fascinating work. I’m new to this area, and I had a question while reading your paper.
In Table 2 of the main paper, the "No Depth Encoding" method seems to leave the scale ambiguity problem completely unresolved from a theoretical standpoint — is that a correct understanding?
My interpretation is that this variant still uses the Epipolar Encoder, but omits the concatenation with r(d) when computing s in Equation 1. If that's the case, the model can still leverage epipolar line information, but the epipolar line itself is invariant to the scale of the camera translation (is that right?). So even if the model accurately finds corresponding pixel pairs along epipolar lines, the scale ambiguity would remain: given only the information available to the model, there seems to be no way to recover the scale of the SfM reconstruction used for training.
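To make the point concrete (this is my own illustration, not code from the paper): a minimal NumPy sketch showing that scaling the relative translation scales the essential matrix, so each epipolar line is the same line and carries no information about scale. The pose values here are arbitrary placeholders.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Hypothetical relative pose: rotation about z by 10 degrees, translation t.
theta = np.deg2rad(10)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.2, 0.1])

# Essential matrices for the true translation and a 5x-scaled one.
E1 = skew(t) @ R
E2 = skew(5.0 * t) @ R

# Epipolar line of a pixel x (normalized coordinates) in the other view: l = E @ x
x = np.array([0.3, -0.4, 1.0])
l1 = E1 @ x
l2 = E2 @ x

# The two line vectors differ only by an overall scale factor,
# i.e. they describe the SAME epipolar line in the image:
print(np.allclose(l2, 5.0 * l1))  # True
```

So any pixel satisfying l1 also satisfies l2, which is why I'd expect correspondences along epipolar lines alone to leave the metric scale undetermined.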
That said, it is surprising that the "No Depth Encoding" variant still performs quite well in Table 2. Could it be that the scale of the training/test datasets is constrained to a fairly narrow range?
I’d really appreciate any clarification. Thank you!