C++: Divide number of bounds between branches for phi nodes by paldepind · Pull Request #21329 · github/codeql

paldepind · 2026-02-16T12:56:52Z

This PR changes the heuristic for how the number of bounds is calculated for a guard phi node.

This is an improvement in a few ways:

It avoids a special case where nodes that are both normal phi nodes and guard phi nodes needed special treatment.
I think this heuristic makes sense intuitively.
It fixes a problem where a series of if-else statements with guards on the same variable caused the estimated number of bounds to grow exponentially, even though the actual range analysis did not introduce any new bounds.

The first commit adds an example of this. In the last commit, the change to the expected file around test.c:453 to test.c:463 shows how we now handle this case better.
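To make the compounding concrete, here is a toy model in Python. It is a sketch under stated assumptions, not the QL implementation: it assumes the phi node after each if-else sums the estimates of its two branches, and it uses a hypothetical baseline (`inherit_all`) in which each branch keeps the full estimate. Only the `(varBounds + 1) / 2` formula comes from this PR; all function names are made up for illustration.

```python
def estimate_after_if_else_chain(n_statements, branch_estimate, start=1.0):
    """Propagate a bounds estimate through a chain of if-else statements.

    Assumption: the phi node after each if-else sums the estimates of its
    two branches. `branch_estimate` maps the estimate before the guard to
    the estimate inside one branch.
    """
    estimate = start
    for _ in range(n_statements):
        estimate = branch_estimate(estimate) + branch_estimate(estimate)
    return estimate


def inherit_all(n):
    # Hypothetical baseline: each branch is assumed to keep every bound.
    return n


def split_between_branches(n):
    # Heuristic described in this PR: (varBounds + 1) / 2 per branch.
    return (n + 1) / 2


# 11 consecutive if-else statements, starting from a single bound.
doubling = estimate_after_if_else_chain(11, inherit_all)
divided = estimate_after_if_else_chain(11, split_between_branches)
print(doubling, divided)
```

In this model the baseline doubles at every statement, while the divided estimate grows by one per statement.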

This fixes a recent problem observed over in coding standards. See: Revert "C++: Accept test changes after github/codeql#21313." codeql-coding-standards#1041.

For reviewing, the comment inside nrOfBoundsPhiGuard should help explain what is going on.

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll


Copilot

Pull request overview

This PR improves the heuristic for calculating the number of bounds for guard phi nodes in C++ range analysis. The change addresses an exponential growth issue that occurred when analyzing series of if-else statements with guards on the same variable.

Changes:

Modified nrOfBoundsPhiGuard function to use a new heuristic: (varBounds + 1) / 2 instead of special-casing certain phi nodes
Added test case repeated_if_else_statements demonstrating the fixed scenario with 11 consecutive if-else statements
Added helper predicates countNrOfLowerBounds and countNrOfUpperBounds in SimpleRangeAnalysisInternal module
Updated test query nrOfBounds.ql to output actual bounds counts alongside estimates

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll` | Changed bounds estimation heuristic in `nrOfBoundsPhiGuard` from special-casing to dividing by 2; added helper predicates for counting actual bounds |
| `cpp/ql/test/library-tests/rangeanalysis/SimpleRangeAnalysis/test.c` | Added `repeated_if_else_statements` function demonstrating the exponential growth scenario that is now fixed |
| `cpp/ql/test/library-tests/rangeanalysis/SimpleRangeAnalysis/nrOfBounds.ql` | Enhanced query to also output actual lower and upper bounds counts for validation |
| `cpp/ql/test/library-tests/rangeanalysis/SimpleRangeAnalysis/*.expected` | Updated expected test outputs to reflect the improved bounds estimates (line numbers shifted due to new test case) |

MathiasVP

The changes look good to me! I think it would be a good idea to add a consistency check to check whether the estimated number of bounds is an upper bound on the actual number of bounds computed in the recursion. I realize this measure isn't genuinely an upper bound in all cases, but I still think it would be a good check to have. If nothing else, it makes it easy to validate changes to the measurements by running that consistency check on e.g., MRVA.
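As a rough illustration of what such a check would compare, here is a Python sketch. The real check would presumably be written in QL against the analysis itself; the row layout, the names, and the sample numbers below are all hypothetical, and comparing the estimate against the larger of the two actual counts is an assumption.

```python
from typing import NamedTuple


class BoundsRow(NamedTuple):
    # Hypothetical row shape: an expression, its estimated number of
    # bounds, and the actual lower/upper bound counts.
    expr: str
    estimate: float
    actual_lower: int
    actual_upper: int


def underestimated(rows):
    """Return rows whose estimate is below an actual bounds count."""
    return [
        row for row in rows
        if row.estimate < max(row.actual_lower, row.actual_upper)
    ]


# Made-up sample data, for illustration only.
rows = [
    BoundsRow("x", 4.0, 4, 1),  # estimate covers both counts
    BoundsRow("y", 2.0, 3, 1),  # estimate is below the lower-bound count
]
violations = underestimated(rows)
print([row.expr for row in violations])
```

Since the estimate is a heuristic, a non-empty result would be a prompt to look closer rather than a hard failure.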

MathiasVP · 2026-02-20T12:02:01Z

cpp/ql/test/library-tests/rangeanalysis/SimpleRangeAnalysis/test.c

+  if (rhs < 19) { rhs << 1; } else { rhs << 2; }
+  if (rhs < 20) { rhs << 1; } else { rhs << 2; }
+  return rhs; // rhs has 12 bounds
+}


FWIW, I would have appreciated putting this function at the bottom of the file to avoid inflating the diff in the .expected file (since these tests still aren't using inline expectations). No biggie! Just thought I would mention it.

I think it fits here as it's related to the repeated_if_statements function just above. You're right that the diff gets bigger, but at least the QL changes are in a separate commit with a clean .expected diff :)

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>

jketema

One question and one typo.

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

jketema · 2026-02-20T12:51:22Z

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

+    // by the condition. In this case all lower bounds flow to `{ e1 }` and only
+    // lower bounds that are smaller than `c` flow to `{ e2 }`.


My knowledge of the simple range analysis library is limited, so excuse my ignorance.

In this case all lower bounds flow to { e1 } and only
lower bounds that are smaller than c flow to { e2 }.

This I don't understand, and it's not clear how to relate this to the division by 2 you do below. Why do all lower bounds flow to e1, and why is the "smaller than" condition there in the case of e2?

Good question. Maybe it's easiest to explain by making things a bit more concrete.

Suppose the lower bounds for x are {2, 11, 22} and c is the constant 5.

In the true branch we know x < 5. This is an upper bound and thus doesn't affect the lower bounds, hence the "all lower bounds flow to { e1 }".

In the false branch we know x >= 5, which allows us to prevent the lower bounds 11 and 22 from flowing to x (done here). So x's bounds will be {2, 5}, hence the "only lower bounds that are smaller than c flow to { e2 }".

At the phi node after the if statement the bounds from both branches are joined and we end up with 4 lower bounds: {2, 5, 11, 22}.
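The same example can be replayed with plain sets. This is only a sketch of the reasoning in this comment, not of the QL predicates: the function name is made up, and summing the two per-branch estimates at the phi node is an assumption.

```python
def lower_bounds_after_guard(bounds, c):
    """Lower bounds of x in each branch of `if (x < c) { e1 } else { e2 }`."""
    # x < c is an upper bound, so every lower bound reaches the true branch.
    true_branch = set(bounds)
    # x >= c keeps only the lower bounds smaller than c and adds c itself.
    false_branch = {b for b in bounds if b < c} | {c}
    return true_branch, false_branch


bounds = {2, 11, 22}
true_branch, false_branch = lower_bounds_after_guard(bounds, 5)
joined = true_branch | false_branch  # join at the phi node

# Heuristic from this PR, applied once per branch (assumed to be summed).
per_branch_estimate = (len(bounds) + 1) / 2
phi_estimate = 2 * per_branch_estimate

print(sorted(true_branch), sorted(false_branch), sorted(joined))
print(len(joined), phi_estimate)
```

Here the true branch holds three bounds and the false branch two; the heuristic guesses two per branch, and the summed guess equals the size of the joined set.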

It's important to get that final estimate correct, as inaccuracies there can compound (as in the coding standards test with a gazillion if statements after each other) and dividing by 2 and adding a half gets us there.

In general we can't know how many bounds will exist inside each branch. But the number of bounds will be at most the number of bounds on x and at least 1, so guessing right in the middle reduces how wrong we can be in the worst case.
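The "right in the middle" argument can be checked numerically. The sketch below assumes only what is stated above, namely that the true count in a branch lies somewhere between 1 and n; the function names and the choice n = 9 are illustrative.

```python
def worst_case_error(guess, n):
    """Largest possible error of `guess` when the true count is in [1, n]."""
    return max(guess - 1, n - guess)


def midpoint(n):
    # Same formula as the heuristic: dividing by 2 and adding a half.
    return (n + 1) / 2


n = 9
# Candidate guesses 1.0, 1.5, ..., 9.0.
candidates = [k / 2 for k in range(2, 2 * n + 1)]
best = min(candidates, key=lambda g: worst_case_error(g, n))
print(best, midpoint(n), worst_case_error(best, n))
```

Any guess other than the midpoint is farther from one of the two extremes, so its worst-case error is larger.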

Does that make sense and should I try and expand on the comment?

Thanks for the explanation. I think it's fine to leave it as is. I clearly just don't have enough background on how the lower and upper bounds interact.

I'm happy to approve once you've fixed the typo I spotted.

Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>

github-actions bot added the C++ label Feb 16, 2026

github-advanced-security bot found potential problems Feb 16, 2026

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll: Fixed

paldepind force-pushed the cpp/simple-range-analysis-phi-divide branch from f041b27 to e5c8f38 Compare February 16, 2026 13:23

paldepind added 3 commits February 16, 2026 14:36

C++: Add simple range analysis test with repeated if-else statements

da527ff

C++: Include the actual number of lower/upper bounds for added context in expected files

032c7ea

C++: Divide nr of bounds between branches for phi nodes

d0681c6

paldepind force-pushed the cpp/simple-range-analysis-phi-divide branch from e5c8f38 to d0681c6 Compare February 16, 2026 13:36

paldepind mentioned this pull request Feb 16, 2026

Revert "C++: Accept test changes after github/codeql#21313." github/codeql-coding-standards#1041

Merged

paldepind marked this pull request as ready for review February 16, 2026 13:46

Copilot AI review requested due to automatic review settings February 16, 2026 13:46

paldepind requested a review from a team as a code owner February 16, 2026 13:46

paldepind requested a review from MathiasVP February 16, 2026 13:46

Copilot started reviewing on behalf of paldepind February 16, 2026 13:47

Copilot AI reviewed Feb 16, 2026


paldepind added the no-change-note-required This PR does not need a change note label Feb 16, 2026

MathiasVP reviewed Feb 20, 2026


C++: Improve clarity in comment

fdbd49a

Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>

jketema reviewed Feb 20, 2026


paldepind and others added 2 commits February 20, 2026 16:24

C++: Fix typo

8eed18a

Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>

Merge branch 'main' into cpp/simple-range-analysis-phi-divide

9228304

jketema approved these changes Feb 20, 2026


jketema merged commit 8947f7a into github:main Feb 20, 2026
20 of 21 checks passed

jketema merged 6 commits into github:main from paldepind:cpp/simple-range-analysis-phi-divide
