Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements a Loop Fusion Transformation #493

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

kaushikcfd
Copy link
Collaborator

@kaushikcfd kaushikcfd commented Oct 4, 2021

Loopy-flavored loop-fusion transformation corresponding to https://doi.org/10.1007/3-540-57659-2_18.

@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 2 times, most recently from b390c5a to 47d7fa8 Compare October 31, 2021 23:25
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 2 times, most recently from 5bef1aa to 214e6b9 Compare November 11, 2021 17:14
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 2 times, most recently from d844444 to 776ee64 Compare November 17, 2021 17:09
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 4 times, most recently from acbf9a8 to 921ea56 Compare December 10, 2021 05:33
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 2 times, most recently from a8dfaf0 to 07194cc Compare December 13, 2021 11:34
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 3 times, most recently from 4c89630 to a6171e2 Compare March 10, 2022 16:43
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 2 times, most recently from 371632d to d46bfd1 Compare May 6, 2022 17:41
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 3 times, most recently from f47d987 to 4e97120 Compare May 18, 2022 18:57
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 2 times, most recently from 38a6e02 to b57eb9d Compare October 21, 2024 02:40
Comment on lines 582 to 874
raise LoopyError(
f"'{iname}' and '{conflict_iname}' "
"cannot be fused as they can be nested "
"within one another."
)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose the CEESD version (see above) is the desired one, it's algorithmically wrong. If we presume the existence of a "fusable-with" relation, and if we also assume that that relation is transitive, then the correct thing to do upon discovering a non-fusable edge within the set of candidates would be to make two new sets of candidates. My understanding is that the net effect of the code there is to chuck the iname for which the conflict has been discovered out of the set and keep going, effectively (if done often enough) discarding one of the two candidate sets unnecessarily.

For specificity, this is the CEESD version I'm discussing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there's no need for the "is_infusible_with" relation and I got rid of it. See inducer/arraycontext#217 for a correct usage of this transformation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaushikcfd Just to make sure I follow: would you expect this recent change to have an effect on the issue we ran into in inducer/meshmode#453? (Short summary of the issue, as I understand it: if the base and quadrature discretizations have the same number of DOFs, the loop fusion code initially treats indices corresponding to both discretizations as candidates for fusion. Then it encounters an instruction that sums over both of them and arbitrarily decides to eliminate one, because they are nested. It hits the else of the first nested check here.) I tried integrating the changes, and it still seems to trigger that check (except it's an error now instead of a warning).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Matt!

would you expect this recent change to have an effect on the issue

No I don't expect this change to have an issue on #453. I was just cleaning up the API here.

it's an error now instead of a warning

The idea of that check is that if loops can be potentially nested within each other they should not be fusion candidates. Additionally, in our array context we pick candidates such that they don't violate this criterion.

We are hitting either one of the 2 issues:

  1. They cannot be nested within each other and the check in the loop fusion implementation is buggy, or,
  2. we are passing incorrect candidates from the FusionArrayContext.

If there's an MWE I'll be happy to take a look.

Copy link
Contributor

@majosm majosm Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, thanks Kaushik! I think we are hitting issue 2 (so we'll just need to distinguish base and quadrature DOF axes).

@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 7 times, most recently from 78ad97a to dbbd55c Compare March 8, 2025 01:48
@kaushikcfd kaushikcfd marked this pull request as ready for review March 8, 2025 02:04
@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 3 times, most recently from 020afc7 to 0d15c6a Compare March 8, 2025 02:27
@majosm
Copy link
Contributor

majosm commented Mar 11, 2025

FYI @kaushikcfd, while I was browsing through this code the other day trying to understand a warning that was being emitted (which turned into inducer/meshmode#453), I spotted a few opportunities to avoid recomputation and speed things up a fair amount in get_kennedy_unweighted_fusion_candidates. Specifically, the calls I noticed that were being repeated were _get_partial_loop_nest_tree_for_fusion, _get_ldg_nodes_from_loopy_insn, and (I think, need to revisit and confirm) get_insn_access_map. If I can find some time this week I'll finish my changes and push them for you take a look at.

@kaushikcfd
Copy link
Collaborator Author

@majosm: Thanks for the potential bottlenecks. I memoized those routines.

@kaushikcfd kaushikcfd force-pushed the loop_fusion branch 2 times, most recently from 60a1d38 to 78174b8 Compare March 19, 2025 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants