JIT: Merge all RETURN/THROW blocks #128515

Draft
BoyBaykiller wants to merge 6 commits into dotnet:main from BoyBaykiller:deduplicate-all-return-throw-blocks

@BoyBaykiller
Contributor

Fix #128514

tailMergePreds(nullptr) was called once, but my understanding is that it needs to be called repeatedly, as it only processes one set at a time.
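
To illustrate the "call it until nothing changes" idea, here is a minimal, self-contained sketch. It is not the JIT code: `ToyBlock`, `mergeOneSet`, and `mergeAllSets` are hypothetical names, and statements are modeled as plain strings. It assumes a merge routine that handles at most one set of identical tails per call, so a driver has to keep calling it until it reports no progress.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a basic block (hypothetical, not the JIT's BasicBlock).
struct ToyBlock
{
    std::vector<std::string> stmts;
    int                      jumpTo = -1; // successor index; -1 means the block ends in RETURN/THROW
};

// Merges at most ONE set of return/throw blocks that end with the same statement.
// Returns true if a set was merged.
bool mergeOneSet(std::vector<ToyBlock>& blocks)
{
    // Group the return/throw blocks by their last statement.
    std::map<std::string, std::vector<int>> byTail;
    for (int i = 0; i < static_cast<int>(blocks.size()); i++)
    {
        if ((blocks[i].jumpTo == -1) && !blocks[i].stmts.empty())
        {
            byTail[blocks[i].stmts.back()].push_back(i);
        }
    }

    for (auto& entry : byTail)
    {
        if (entry.second.size() < 2)
        {
            continue;
        }

        // Split the shared statement off into a new block and make every member jump to it.
        const int target = static_cast<int>(blocks.size());
        ToyBlock  shared;
        shared.stmts.push_back(entry.first);
        blocks.push_back(shared);

        for (int member : entry.second)
        {
            blocks[member].stmts.pop_back();
            blocks[member].jumpTo = target;
        }
        return true; // only one set per call
    }
    return false;
}

// Driver: keep calling until no set is left. Returns the number of sets merged.
int mergeAllSets(std::vector<ToyBlock>& blocks)
{
    int merged = 0;
    while (mergeOneSet(blocks))
    {
        merged++;
    }
    return merged;
}
```

With two distinct sets (say, two blocks ending in `return 10` and two ending in `throw`), a single `mergeOneSet` call leaves one set unmerged, while `mergeAllSets` handles both. Note that this toy rescans every block on each call.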

@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 23, 2026
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label May 23, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BoyBaykiller
Contributor Author

@AndyAyersMS PTAL.

Comment thread src/coreclr/jit/fgopt.cpp Outdated
// Avoid splitting a return away from a possible tail call
//
if (!block->hasSingleStmt())
if (block->isEmpty())
Contributor Author

This check was here before, but I don't think we actually need it, because we only accept RETURN or THROW blocks, and these should never be empty?

Member

@AndyAyersMS AndyAyersMS left a comment

Can we do this without repeatedly searching all blocks for returns and throws?

Comment thread src/coreclr/jit/fgopt.cpp Outdated
do
{
    predInfo.Reset();
    for (BasicBlock* const block : Blocks())
Member

The set of eligible return and throw blocks never changes, so do we need to repeatedly walk the entire block list here?

Contributor Author

I would also think we don't need to; however, when I tried to hoist that, it caused asserts.
I didn't look into it further because the same approach is already used in iterateTailMerge(), and there are also multiple comments around this code about improving the algorithm's efficiency.
So I'd prefer to properly understand the entire code and improve efficiency in a separate PR in the future.

Member

iterateTailMerge just walks the preds of a given block, not all blocks.

What asserts did you see?

Contributor Author

@BoyBaykiller BoyBaykiller May 25, 2026

> iterateTailMerge just walks the preds of a given block, not all blocks.

Yeah, it happens to walk only the preds there, so it is less of an issue, but the fundamental point about not needing to regenerate the set still applies, I think.

> What asserts did you see?

else // The statement is in the middle.
{
    assert(stmt->GetPrevStmt() != nullptr && stmt->GetNextStmt() != nullptr);
    Statement* prev = stmt->GetPrevStmt();
    prev->SetNextStmt(stmt->GetNextStmt());
    stmt->GetNextStmt()->SetPrevStmt(prev);
}

I took a quick look; the issue might be that we don't remove entries from predInfo after we have merged them.
So it tries to merge them a second time on the second iteration and never makes any progress.
Let me see if I can fix it...
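
A minimal sketch of that failure mode and one possible fix, under stated assumptions: `Candidate`, `takeNextGroup`, and `countMergeableSets` are hypothetical names, and a candidate is reduced to a block id plus its last statement as a string. The candidate list is built once by the caller; each iteration removes the group it has just handled, so the loop shrinks the list and cannot revisit merged entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one gathered entry: a block id and its last statement.
struct Candidate
{
    int         blockId;
    std::string lastStmt;
};

// Takes the first remaining candidate, collects every candidate with the same last
// statement, and erases that whole group from `candidates` so it is never revisited.
// A returned group of size 1 means there was nothing to merge for that statement.
std::vector<Candidate> takeNextGroup(std::vector<Candidate>& candidates)
{
    assert(!candidates.empty());
    const std::string key = candidates.front().lastStmt; // copy: the vector is modified below

    std::vector<Candidate> group;
    for (const Candidate& c : candidates)
    {
        if (c.lastStmt == key)
        {
            group.push_back(c);
        }
    }

    candidates.erase(std::remove_if(candidates.begin(), candidates.end(),
                                    [&key](const Candidate& c) { return c.lastStmt == key; }),
                     candidates.end());
    return group;
}

// The list is gathered once by the caller; this loop only ever shrinks it.
int countMergeableSets(std::vector<Candidate> candidates)
{
    int sets = 0;
    while (!candidates.empty())
    {
        const std::size_t before = candidates.size();
        if (takeNextGroup(candidates).size() >= 2)
        {
            sets++; // a real implementation would merge the group here
        }
        assert(candidates.size() < before); // every iteration makes progress
    }
    return sets;
}
```

If the erase step were skipped, the same group would be found again on the next iteration, which matches the "never making any progress" symptom described above.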

…of reinvoking and re-gathering candidates every time

* hack to suppress positive diffs
@AndyAyersMS
Member

I recommend keeping refactoring/renaming changes and functionality in separate PRs, otherwise reviews are more likely to miss important things.

Also, does tail merging returns lead to new tail merge opportunities like it does for other blocks (eg should we be populating "retry blocks")?

@BoyBaykiller
Contributor Author

BoyBaykiller commented May 27, 2026

> Also, does tail merging returns lead to new tail merge opportunities like it does for other blocks (eg should we be populating "retry blocks")?

Yes, deduplicating return blocks often does expose new opportunities to tail merge. We are already pushing merged blocks to the retryBlocks stack:

// We should try tail merging the cross jump target.
//
retryBlocks.Push(crossJumpTarget);


Here is an example (mostly to solidify my own understanding):

static int Example(bool cond1, bool cond2, ref int x, ref int y)
{
    if (cond1)
    {
        y = 8;
        x = 9;
        return 10; 
    }
    if (cond2)
    {
        y = 8;
        x = 9;
        return 10;
    }
    return 2;
}

First we pull out the return 10; statement:

A set of 2 return/throw blocks end with the same tree
STMT00005 ( 0x017[E--] ... 0x019 )
               [000017] -----------                         *  RETURN    int   
               [000016] -----------                         \--*  CNS_INT   int    10
New Basic Block BB06 [0005] created.
setting likelihood of BB02 -> BB06 to 1
Will cross-jump to newly split off BB06

unlinking STMT00005 ( 0x017[E--] ... 0x019 )
               [000017] -----------                         *  RETURN    int   
               [000016] -----------                         \--*  CNS_INT   int    10
 from BB04
setting likelihood of BB04 -> BB06 to 1
Deduplicated 1 set of return/throw blocks

After that we look at the predecessors of the new return 10; block, more specifically their last statements, and discover that they are also the same. So that statement gets sunk into the return 10; block:

All 2 preds of BB06 end with the same tree, moving
STMT00004 ( 0x013[E--] ... 0x016 )
               [000015] -A-XG------                         *  STOREIND  int   
               [000013] -----------                         +--*  LCL_VAR   byref  V02 arg2         
               [000014] -----------                         \--*  CNS_INT   int    9

unlinking STMT00004 ( 0x013[E--] ... 0x016 )
               [000015] -A-XG------                         *  STOREIND  int   
               [000013] -----------                         +--*  LCL_VAR   byref  V02 arg2         
               [000014] -----------                         \--*  CNS_INT   int    9
 from BB04

unlinking STMT00007 ( 0x006[E--] ... 0x009 )
               [000023] -A-XG------                         *  STOREIND  int   
               [000021] -----------                         +--*  LCL_VAR   byref  V02 arg2         
               [000022] -----------                         \--*  CNS_INT   int    9
 from BB02
Merged 1 set of tails going into BB06

And so we work our way through the equivalent statements one by one, regathering the predecessors at each step.
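
The sinking step from the dumps above can be modeled with a small toy. This is only a sketch, not the JIT's implementation: `Stmts` and `sinkCommonTails` are hypothetical names, statements are plain strings, and the model starts after the shared `return 10;` block has already been split off.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Stmts = std::vector<std::string>;

// While every predecessor ends with the same statement, remove it from each
// predecessor and prepend it to `target`. Returns the number of statements moved.
int sinkCommonTails(std::vector<Stmts*>& preds, Stmts& target)
{
    assert(!preds.empty());
    int moved = 0;
    while (true)
    {
        // Re-examine the last statement of every predecessor on each step.
        if (preds[0]->empty())
        {
            break;
        }
        const std::string candidate = preds[0]->back(); // copy: popped below

        bool allMatch = true;
        for (Stmts* pred : preds)
        {
            if (pred->empty() || (pred->back() != candidate))
            {
                allMatch = false;
                break;
            }
        }
        if (!allMatch)
        {
            break;
        }

        for (Stmts* pred : preds)
        {
            pred->pop_back();
        }
        target.insert(target.begin(), candidate); // keeps the original statement order
        moved++;
    }
    return moved;
}
```

For the `Example` method, the two predecessors each hold `y = 8` and `x = 9`. Running the toy moves `x = 9` first and `y = 8` second, so the shared block ends up as `y = 8`, `x = 9`, `return 10` and both predecessors become empty.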

Note: In some cases we might be able to consider tails equivalent even though their exact statement order isn't the same (?), provided they can be reordered accordingly.

Update: I just moved de-duplicating return/throw blocks before tail merging and no longer push to retryBlocks, and that has no diffs.

…f using a BitVec to sparsely mark them as processed

* move de-duplication before tail-merging and then no longer add them to the retry list, as it isn't needed
* use stl iterator tag to be able to call std::stable_partition
* and assert to vector indexer
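
As an illustration of how `std::stable_partition` can replace a "processed" bit vector, here is a hedged sketch. It assumes candidates are reduced to their last statement as a string, and `groupInPlace` is a hypothetical helper, not a function from the PR. Matching entries are moved to the front of the unprocessed range, and the range start then advances past them.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Rearranges `tails` so that equal entries are adjacent. Groups appear in order of
// first occurrence, and relative order inside each group is preserved.
// Returns the size of each group, in order.
std::vector<int> groupInPlace(std::vector<std::string>& tails)
{
    std::vector<int> groupSizes;
    auto             first = tails.begin();
    while (first != tails.end())
    {
        const std::string key = *first; // copy: elements move during partitioning

        // Move everything equal to `key` to the front of [first, end).
        auto groupEnd = std::stable_partition(first, tails.end(),
                                              [&key](const std::string& s) { return s == key; });

        groupSizes.push_back(static_cast<int>(groupEnd - first));

        // Everything before `groupEnd` is now processed, so no separate marking is needed.
        first = groupEnd;
    }
    return groupSizes;
}
```

`std::stable_partition` requires bidirectional iterators, which is presumably why a custom container's iterator would need a standard iterator tag before it can be passed in.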
…s in downstream phases because the way we choose the crossJumpVictim is order-dependent and non-optimal (for example, we'd want to avoid new BBF_NEEDS_GCPOLL)

* also remove the std::reverse - same reason

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member

Development

Successfully merging this pull request may close these issues.

JIT: Missing deduplication of RETURN block causes switch recognition to miss JTRUEs

2 participants