
[TorchToLinalg] Implement lowering of torch.aten.rrelu_with_noise and torch.aten.rrelu_with_noise_backward ops #3645

Closed

Conversation

bosko-syrmia
Contributor

I'm not sure whether e2e tests that exercise each of the ops individually are sufficient, since (to my understanding) rrelu_with_noise generates the noise tensor in the forward pass, while rrelu_with_noise_backward multiplies gradOutput by that same noise tensor to produce the return value in backprop. Is there a better way to test this, say with a simple NN written in PyTorch that is compiled through the MLIR infra and checked against the original PyTorch output? Would that even be necessary?

Also, I had to remove some tests for AtenRreluOp from FX_IMPORTER_XFAIL_SET because they no longer fail after the lowering of torch.aten.rrelu_with_noise. I suppose it has something to do with the way ATen implements rrelu by calling rrelu_with_noise with an empty noise tensor:

Tensor rrelu(const Tensor & self, const Scalar& lower, const Scalar& upper, bool training, std::optional<Generator> generator) {
  TORCH_CHECK(lower.to<double>() <= upper.to<double>(), "Lower bound should be less than or equal to the upper bound")
  return at::rrelu_with_noise(self, at::empty_like(self, LEGACY_CONTIGUOUS_MEMORY_FORMAT), lower, upper, training, std::move(generator));
}
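
For reference, that relationship can also be checked directly from Python. A minimal sketch (not from this PR) that compares the two ops in eval mode, where both are deterministic:

import torch

x = torch.randn(4, 4)
noise = torch.empty_like(x)
# With training=False both ops reduce to a fixed leaky_relu with negative
# slope (lower + upper) / 2, so the outputs should match exactly.
a = torch.ops.aten.rrelu(x, 0.4, 0.6, False)
b = torch.ops.aten.rrelu_with_noise(x, noise, 0.4, 0.6, False)
assert torch.allclose(a, b)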

@vivekkhandelwal1
Collaborator

vivekkhandelwal1 commented Sep 6, 2024


What's the exact issue that you have? Is it that you want the result of the forward op to be used in the test for the backward op?

Edit: If that's the only concern, then you can have a forward op in your test generate the noise value, which you can then pass to the backward op.

@bosko-syrmia
Contributor Author

Yes, that was the issue. I made a test like you proposed, but I can't seem to get close to the golden output value. I've tried many variations of the test and different dimensionalities for the input and noise, as well as playing around with the implementation itself, to no avail. I assume the randomized noise makes it hard for the two ops to match, but I'm really not sure where the issue lies: in my implementation or in the way the test is formed. Here are the test and the output:

class RreluWithNoiseForwardBackwardModule(torch.nn.Module):
    def __init__(self):
        super().__init__()

    @export
    @annotate_args(
        [
            None,
            ([-1, -1], torch.float32, True),
            ([-1, -1], torch.float32, True),
            ([-1, -1], torch.float32, True),
        ]
    )
    def forward(self, grad, input, noise):
        torch.ops.aten.rrelu_with_noise(
            input, noise, lower=0.4, upper=0.6, training=True
        )
        res = torch.ops.aten.rrelu_with_noise_backward(
            grad,
            input,
            noise,
            lower=0.4,
            upper=0.6,
            training=True,
            self_is_result=False,
        )
        return torch.mean(res), torch.std(res)


@register_test_case(module_factory=lambda: RreluWithNoiseForwardBackwardModule())
def RreluWithNoiseForwardBackwardModule_basic(module, tu: TestUtils):
    module.forward(
        tu.rand(256, 244),
        tu.rand(256, 244, low=-1.0, high=1.0),
        tu.rand(256, 244, low=0.4, high=0.6),
    )

FAIL - "RreluWithNoiseForwardBackwardModule_basic"
        @ trace item #0 - call to "forward"
        @ output of call to "forward"
        @ tuple element 0
        ERROR: value (Tensor with shape=[], dtype=torch.float32, min=+0.2504, max=+0.2504, mean=+0.2504) is not close to golden value (Tensor with shape=[], dtype=torch.float32, min=+0.3761, max=+0.3761, mean=+0.3761)
        @ trace item #0 - call to "forward"
        @ output of call to "forward"
        @ tuple element 1
        ERROR: value (Tensor with shape=[], dtype=torch.float32, min=+0.1478, max=+0.1478, mean=+0.1478) is not close to golden value (Tensor with shape=[], dtype=torch.float32, min=+0.2609, max=+0.2609, mean=+0.2609)

@vivekkhandelwal1
Collaborator


Are you sure that the decomposition you've added is functionally/numerically correct? If not, then what you can do is write PyTorch code implementing the same decomposition you added in this PR and compare its result with the PyTorch op's result.
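
For illustration, a rough sketch of such a comparison; decomposed_rrelu_with_noise below is a hypothetical stand-in for the decomposition in this PR, not code taken from it:

import torch

def decomposed_rrelu_with_noise(x, noise, lower, upper, training):
    # Hypothetical decomposition: in training mode, negative elements are
    # scaled by a per-element alpha drawn from U(lower, upper), which is
    # also written into `noise`; positive elements pass through and their
    # noise entries are set to 1. In eval mode the op is a plain leaky_relu
    # with slope (lower + upper) / 2.
    if not training:
        return torch.nn.functional.leaky_relu(x, (lower + upper) / 2)
    alpha = torch.empty_like(x).uniform_(lower, upper)
    scale = torch.where(x < 0, alpha, torch.ones_like(x))
    noise.copy_(scale)
    return x * scale

x = torch.rand(256, 244) * 2 - 1
ref = torch.ops.aten.rrelu_with_noise(x, torch.empty_like(x), 0.4, 0.6, True)
got = decomposed_rrelu_with_noise(x, torch.empty_like(x), 0.4, 0.6, True)
# The random alphas differ per call, so compare statistics rather than values.
print(ref.mean().item(), got.mean().item())
print(ref.std().item(), got.std().item())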

@vivekkhandelwal1
Collaborator

Hi @bosko-syrmia, some of the tests are now passing, so you need to update them in the xfail_set.

@bosko-syrmia
Contributor Author

bosko-syrmia commented Sep 30, 2024

Hi Vivek, that's because I hadn't updated my fork. In the process I messed up my branch, so I will probably cherry-pick this work onto a different branch and create a new PR and link it here (#3748), unless I manage to restore it properly.

@vivekkhandelwal1
Collaborator

Hi @bosko-syrmia, I think you can close this PR now.
