Hello,
I noticed that the regressor takes only (X, E, y) as input. Why not compute the extra features the way the DiscreteDiffusion model does and pass (noisy_data, extra_features) to the regressor instead? Since the extra features are obtained from (X, E) through matrix operations, I would expect the gradients to still propagate back to (X, E). Is there a reason for using only the original features as input?
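To make the gradient point concrete, here is a minimal sketch in plain Python. It is not taken from the repository: the function names are hypothetical, and I am assuming a cycle-count style feature, tr(A^3) / 6, computed on a relaxed (continuous) adjacency matrix. It checks that the feature has a well-defined gradient with respect to every edge entry by comparing the closed-form gradient with a finite-difference estimate.

```python
# Hypothetical illustration; none of these names come from the repository.

def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def triangle_feature(adj):
    """Cycle-count style extra feature: tr(A^3) / 6."""
    a3 = matmul(matmul(adj, adj), adj)
    return sum(a3[i][i] for i in range(len(adj))) / 6.0

def analytic_grad(adj):
    """Closed-form gradient: d/dA [tr(A^3) / 6] = (A^2)^T / 2."""
    a2 = matmul(adj, adj)
    n = len(adj)
    return [[a2[j][i] / 2.0 for j in range(n)] for i in range(n)]

def numeric_grad(adj, h=1e-5):
    """Central finite-difference gradient, one entry at a time."""
    n = len(adj)
    grad = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            plus = [row[:] for row in adj]
            minus = [row[:] for row in adj]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (triangle_feature(plus) - triangle_feature(minus)) / (2 * h)
    return grad

# Assumed relaxed edge tensor: soft edge weights instead of hard 0/1 entries.
soft_adj = [[0.0, 0.9, 0.8],
            [0.9, 0.0, 0.7],
            [0.8, 0.7, 0.0]]

exact = analytic_grad(soft_adj)
approx = numeric_grad(soft_adj)
max_gap = max(abs(exact[i][j] - approx[i][j])
              for i in range(3) for j in range(3))
print("largest gap between analytic and numeric gradient:", max_gap)
```

In PyTorch the same chain of matrix products and a trace would be differentiated by autograd, so as far as I can tell nothing in this kind of feature blocks the gradient from reaching E.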