Thanks for the interesting work.
Regarding the dynamic merging, why are the predicted coefficients averaged over the batch dimension? Although this is more efficient, shouldn't dynamic merging be sample-wise?
With averaging, won't the sample order affect the prediction? Other factors involving the batch dimension would presumably have an effect as well.
    merged_model = self._apply_tv(
        list(dataset_group),
        coefficients=dataset_coeffs[
            assigned_sample_idxs[:, None], dataset_group_idxs
        ].mean(dim=0),
mass/src/mass/pl_module/mass.py
Lines 143 to 147 in 583b35e
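
To make the concern concrete, here is a minimal toy sketch in plain Python. It is not the repository's code: the two task vectors, the per-sample coefficients, and the helper names `merge`, `batch_mean_merge`, and `sample_wise_merge` are all hypothetical, and it ignores the grouping done by `assigned_sample_idxs` in the real snippet. It only illustrates that, with batch-averaged coefficients, the merged weights used for one sample depend on which other samples share its batch.

```python
# Hypothetical toy setup: 2 task vectors over a 3-parameter "model".
# None of these names or values come from the MASS codebase.
task_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]


def merge(coeffs):
    """Weighted sum of the task vectors, one coefficient per task."""
    n_params = len(task_vectors[0])
    return [
        sum(c * tv[i] for c, tv in zip(coeffs, task_vectors))
        for i in range(n_params)
    ]


def batch_mean_merge(batch_coeffs):
    """Average the coefficients over the batch, then merge once."""
    n = len(batch_coeffs)
    n_tasks = len(batch_coeffs[0])
    mean = [sum(row[j] for row in batch_coeffs) / n for j in range(n_tasks)]
    return merge(mean)


def sample_wise_merge(batch_coeffs):
    """Merge separately for every sample (one merged model per sample)."""
    return [merge(row) for row in batch_coeffs]


# Assumed router outputs for three samples.
sample_a = [1.0, 0.0]
sample_b = [0.0, 1.0]
sample_c = [1.0, 0.0]

# Sample A is evaluated with different merged weights depending on its batch mate.
merged_with_b = batch_mean_merge([sample_a, sample_b])
merged_with_c = batch_mean_merge([sample_a, sample_c])

# Sample-wise merging gives A the same weights regardless of the rest of the batch.
per_sample = sample_wise_merge([sample_a, sample_b])

print(merged_with_b, merged_with_c, per_sample[0])
```

In this toy case the mean itself does not change if a fixed batch is permuted, so the effect would come from batch composition: a different sample order or batch size changes which samples are averaged together, and therefore which merged model each sample sees.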