Implements a Loop Fusion Transformation #493
loopy/transform/loop_fusion.py
raise LoopyError(
    f"'{iname}' and '{conflict_iname}' "
    "cannot be fused as they can be nested "
    "within one another."
)
Supposing the CEESD version (see above) is the desired one, it's algorithmically wrong. If we presume the existence of a "fusable-with" relation, and also assume that this relation is transitive, then the correct thing to do upon discovering a non-fusable edge within the set of candidates is to split that set into two new sets of candidates. My understanding is that the net effect of the code there is to chuck the iname for which the conflict has been discovered out of the set and keep going, effectively (if done often enough) discarding one of the two candidate sets unnecessarily.
For specificity, this is the CEESD version I'm discussing.
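A minimal sketch of the difference described above, assuming a symmetric and transitive "fusable-with" relation. The function names, the `kind` table, and the iname names are hypothetical and are not taken from loopy or the CEESD version.

```python
def split_into_fusable_sets(candidates, fusable):
    """Partition candidates into groups of mutually fusable inames.

    Assumes `fusable` is symmetric and transitive, so comparing against
    one representative of each group is sufficient.
    """
    groups = []
    for iname in candidates:
        for group in groups:
            if fusable(iname, group[0]):
                group.append(iname)
                break
        else:
            # Conflicts with every existing group: start a new candidate set.
            groups.append([iname])
    return groups


def discard_on_conflict(candidates, fusable):
    """The criticised behaviour: drop any iname that conflicts and keep going."""
    kept = []
    for iname in candidates:
        if all(fusable(iname, other) for other in kept):
            kept.append(iname)
    return kept


# Hypothetical inames, labelled by the discretization they iterate over.
kind = {
    "idof_base_0": "base",
    "idof_quad_0": "quad",
    "idof_base_1": "base",
    "idof_quad_1": "quad",
}


def fusable(a, b):
    # Assumption for illustration: only inames of the same kind can be fused.
    return kind[a] == kind[b]


candidates = list(kind)
groups = split_into_fusable_sets(candidates, fusable)
kept = discard_on_conflict(candidates, fusable)
print(groups)
print(kept)
```

With the splitting strategy both candidate sets survive, whereas the discard strategy keeps only the set containing the first iname it sees.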
Yes, there's no need for the "is_infusible_with" relation and I got rid of it. See inducer/arraycontext#217 for a correct usage of this transformation.
@kaushikcfd Just to make sure I follow: would you expect this recent change to have an effect on the issue we ran into in inducer/meshmode#453? (Short summary of the issue, as I understand it: if the base and quadrature discretizations have the same number of DOFs, the loop fusion code initially treats indices corresponding to both discretizations as candidates for fusion. Then it encounters an instruction that sums over both of them and arbitrarily decides to eliminate one, because they are nested. It hits the `else` of the first nested check here.) I tried integrating the changes, and it still seems to trigger that check (except it's an error now instead of a warning).
Hi Matt!
> would you expect this recent change to have an effect on the issue
No, I don't expect this change to have an effect on #453. I was just cleaning up the API here.
> it's an error now instead of a warning
The idea of that check is that if loops can be potentially nested within each other they should not be fusion candidates. Additionally, in our array context we pick candidates such that they don't violate this criterion.
We are hitting one of two issues:
- The loops cannot actually be nested within each other and the check in the loop fusion implementation is buggy, or
- We are passing incorrect candidates from the FusionArrayContext.
If there's an MWE I'll be happy to take a look.
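
A minimal sketch of the criterion described above, working on plain Python sets rather than on a loopy kernel. It assumes each instruction is represented only by the set of inames it sits inside (or reduces over); the function name and the iname names are hypothetical, and this is not the check implemented in `loopy/transform/loop_fusion.py`.

```python
from itertools import combinations


def find_nesting_conflicts(candidates, insn_inames):
    """Return pairs of candidate inames that occur together in one instruction.

    Two loops that surround the same instruction are nested within one
    another there, so they should not both be fusion candidates.
    """
    conflicts = set()
    for inames in insn_inames:
        shared = sorted(candidates & inames)
        conflicts.update(combinations(shared, 2))
    return conflicts


# Hypothetical kernel: the last instruction sums over both DOF axes.
insn_inames = [
    frozenset({"iel", "idof_base"}),
    frozenset({"iel", "idof_quad"}),
    frozenset({"iel", "idof_base", "idof_quad"}),
]
candidates = frozenset({"idof_base", "idof_quad"})

conflicts = find_nesting_conflicts(candidates, insn_inames)
no_conflicts = find_nesting_conflicts(candidates, insn_inames[:2])
print(conflicts)
```

In this sketch the conflict only appears once an instruction uses both candidate inames, which mirrors the situation summarized for inducer/meshmode#453.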
Gotcha, thanks Kaushik! I think we are hitting issue 2 (so we'll just need to distinguish base and quadrature DOF axes).
FYI @kaushikcfd, while I was browsing through this code the other day trying to understand a warning that was being emitted (which turned into inducer/meshmode#453), I spotted a few opportunities to avoid recomputation and speed things up a fair amount in
@majosm: Thanks for pointing out the potential bottlenecks. I memoized those routines.
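
As a generic illustration of memoizing a repeatedly called query, here is a sketch using the standard library's `functools.lru_cache`. The routine `insns_using_iname` and its arguments are hypothetical; the routines actually memoized in the PR are not named in this thread, and loopy may use a different memoization mechanism.

```python
from functools import lru_cache

call_count = 0  # counts real (uncached) evaluations, for demonstration only


@lru_cache(maxsize=None)
def insns_using_iname(insn_inames, iname):
    """Return indices of the instructions nested inside `iname`.

    Arguments must be hashable (a tuple of frozensets and a string)
    so that lru_cache can use them as the cache key.
    """
    global call_count
    call_count += 1
    return frozenset(
        idx for idx, inames in enumerate(insn_inames) if iname in inames
    )


insn_inames = (
    frozenset({"i", "j"}),
    frozenset({"i"}),
    frozenset({"k"}),
)

first = insns_using_iname(insn_inames, "i")
second = insns_using_iname(insn_inames, "i")  # served from the cache
print(first, call_count)
```

The second call returns the cached result without re-scanning the instructions, which is the kind of recomputation such caching avoids.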
Loopy-flavored loop-fusion transformation corresponding to https://doi.org/10.1007/3-540-57659-2_18.
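
For readers unfamiliar with the term, the sketch below shows what loop fusion means in general, using plain Python loops rather than loopy kernels. The function names are hypothetical and the example does not use or represent the API added in this PR.

```python
def unfused(a):
    """Two separate loops over the same index range."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = 2 * a[i]
    for j in range(n):
        c[j] = b[j] + 1
    return b, c


def fused(a):
    """The same computation with both loop bodies merged into one loop."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = 2 * a[i]
        # Legal to fuse: c[i] only depends on b[i] from the same iteration.
        c[i] = b[i] + 1
    return b, c


result_unfused = unfused([1, 2, 3])
result_fused = fused([1, 2, 3])
print(result_fused)
```

Fusion is valid here because the second loop reads only the value produced in the matching iteration of the first; a dependence on a later iteration would prevent it.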