@DHPO You are right. In the paper, we define M as the total number of bins. In the latest version of our code, we use the number of valid (non-empty) bins instead.
Suppose you have 100 bins and every example has the same gradient norm of 0.8 (although this is impossible in practice). According to the original equation, each example then gets a harmonizing parameter of 1/100. With 10000 bins, the parameter becomes 1/10000. In both cases, however, we would like the harmonizing parameter to be 1 for every example, since the examples should be treated equally and none of them should be down-weighted. The harmonizing parameter also should not depend on the number of bins. For these reasons we think the number of valid bins is the more reasonable choice.
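The example above can be checked with a minimal sketch. It assumes the approximate form of the harmonizing parameter, N / (R · M), where N is the number of examples, R is the number of examples in the same bin, and M is the bin count used for normalization. The function `harmonizing_params` and its `use_valid_bins` flag are hypothetical names for illustration, not code from the repository.

```python
def harmonizing_params(grad_norms, num_bins, use_valid_bins):
    """Return the per-example weight N / (R * M) for gradient norms in [0, 1].

    Illustrative only: M is the number of non-empty bins when
    use_valid_bins is True, otherwise the total number of bins.
    """
    total = len(grad_norms)
    # Map each gradient norm to a bin index; g == 1.0 is clamped into the last bin.
    bin_idx = [min(int(g * num_bins), num_bins - 1) for g in grad_norms]
    # Count how many examples fall into each occupied bin.
    counts = {}
    for j in bin_idx:
        counts[j] = counts.get(j, 0) + 1
    m = len(counts) if use_valid_bins else num_bins
    return [total / (counts[j] * m) for j in bin_idx]


# All examples share the same gradient norm, so exactly one bin is occupied.
grads = [0.8] * 1000
all_bins_100 = harmonizing_params(grads, 100, use_valid_bins=False)
all_bins_10000 = harmonizing_params(grads, 10000, use_valid_bins=False)
valid_bins_100 = harmonizing_params(grads, 100, use_valid_bins=True)
valid_bins_10000 = harmonizing_params(grads, 10000, use_valid_bins=True)
```

With all bins as the normalizer, the weights shrink as the bin count grows (1/100, then 1/10000). With only the non-empty bins counted, they stay at 1 regardless of the bin count.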
Thank you for reading the code and paper so carefully.
Why do you divide `weights` by the number of non-empty bins (`n`) rather than by all bins (`self.bins`)?
GHM_Detection/mmdetection/mmdet/core/loss/ghm_loss.py
Line 54 in 3647287
I think `M` is the number of all bins in the paper. Am I missing something?