Rust: Handle Deref trait in type inference and data flow
#20987
/**
 * Index assignments like `a[i] = rhs` are treated as `*a.index_mut(i) = rhs`,
 * so they should in principle be handled by `referenceAssignment`.
 *
 * However, this would require support for [generalized reverse flow][1], which
 * is not yet implemented, so instead we simulate reverse flow where it would
 * have applied via the model for `<_ as core::ops::index::IndexMut>::index_mut`.
 *
 * The same is the case for compound assignments like `a[i] += rhs`, which are
 * treated as `(*a.index_mut(i)).add_assign(rhs)`.
 *
 * [1]: https://github.com/github/codeql/pull/18109
 */
Check warning
Code scanning / CodeQL
Predicate QLDoc style Warning
pragma[nomagic]
Type getACandidateReceiverTypeAtSubstituteLookupTraits(
  string derefChain, boolean borrow, TypePath path
Type getANonPseudoCandidateReceiverTypeAt(
Check warning
Code scanning / CodeQL
Missing QLDoc for parameter Warning
Pull request overview
This PR adds comprehensive support for Rust's Deref trait in both type inference and data flow analysis. The implementation enables proper resolution of method calls through implicit dereference chains and inserts appropriate data flow nodes for such operations.
Key changes:
- Introduces `DerefChain` class to track chains of implicit dereferences during type inference
- Adds data flow support for implicit `deref` calls with synthetic nodes
- Resolves numerous test expectations that were previously marked as `MISSING`
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `shared/util/codeql/util/UnboundList.qll` | Makes `getElement` public to support `DerefChain` implementation |
| `shared/typeinference/codeql/typeinference/internal/TypeInference.qll` | Adds performance optimizations with new predicates and pragma annotations |
| `rust/ql/lib/codeql/rust/internal/typeinference/DerefChain.qll` | New file implementing deref chain logic using `UnboundList` |
| Test expectation files | Updates reflecting improved type inference and data flow (resolving `MISSING` cases) |
| Test source files | Updates comments to reflect resolved test cases (removing `MISSING` markers) |
This PR adds general support for the `Deref` trait when resolving method calls. This covers both resolving such method calls in the type inference library and inserting the implicit `deref` calls in data flow.

As usual, commit-by-commit review is encouraged.
Type inference
When resolving method calls, a set of candidate receiver types is constructed by repeatedly dereferencing the receiver using applicable `Deref` implementations. Before this PR, we had limited support, namely `&(mut) T -> T` and `String -> str`. In order to handle arbitrary chains of dereferences, we introduce a new class `DerefChain`, based on the shared `UnboundList` library, which records the chain of `deref` calls needed to resolve a method call.

After having resolved a method call, type information may also have to flow backwards through the chain of dereferences. Example:
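As a hedged illustration (a hypothetical sketch, not code taken from the PR), the Rust program below shows a method call that only resolves after two user-defined `Deref` steps. The types `Outer`, `Middle`, and `Inner` and the functions `implicit_call` and `explicit_call` are made-up names. The chain of `deref` calls written out in `explicit_call` is the kind of information a deref chain would need to record, and the `self` type of the resolved method (`&Inner`) is what sits at the end of that chain.

```rust
use std::ops::Deref;

// Hypothetical types: `Outer` derefs to `Middle`, which derefs to `Inner`.
struct Inner {
    value: i64,
}

impl Inner {
    fn get(&self) -> i64 {
        self.value
    }
}

struct Middle(Inner);

impl Deref for Middle {
    type Target = Inner;
    fn deref(&self) -> &Inner {
        &self.0
    }
}

struct Outer(Middle);

impl Deref for Outer {
    type Target = Middle;
    fn deref(&self) -> &Middle {
        &self.0
    }
}

// The compiler tries `Outer`, then `Middle`, then `Inner` as receiver types
// and finds `get` on `Inner`, inserting two implicit `deref` calls.
fn implicit_call(x: &Outer) -> i64 {
    x.get()
}

// The same call with the chain of dereferences written out explicitly.
fn explicit_call(x: &Outer) -> i64 {
    let middle: &Middle = <Outer as Deref>::deref(x);
    let inner: &Inner = <Middle as Deref>::deref(middle);
    Inner::get(inner)
}

fn main() {
    let x = Outer(Middle(Inner { value: 42 }));
    assert_eq!(implicit_call(&x), explicit_call(&x));
    println!("{}", implicit_call(&x));
}
```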
Support for this is implemented in the `inferMethodCallTypeSelf` predicate, where the `DerefChain` is applied in reverse order, peeling off the top element until the chain becomes empty.

Data flow
A method call `x.m()` with an implicit dereference desugars to `(*Deref::deref(&x)).m()`, so we need to add data flow nodes for `&x`, `Deref::deref(&x)`, and `*Deref::deref(&x)`, as well as the implicit call to `Deref::deref`. This means we will have a reference store-step from `x` to `&x` and a reference read-step from `Deref::deref(&x)` to `*Deref::deref(&x)`.

The three different data flow nodes are represented by a state called `ImplicitDerefNodeState`, and since we need to support arbitrary dereference chains, each synthetic node is additionally tagged with a `DerefChain` as well as an index into that chain.

A small, but important, performance improvement is made when the targeted `deref` method is one of the two built-in implementations; in this case, we can add a local flow step directly from `x` to `Deref::deref(&x)`, which avoids the need for inter-procedural flow.

Evaluation
The changes in this PR resolve a lot of `MISSING` test expectations. DCA looks really great: `Percentage of calls with call target` increases by 2 percentage points, and as a consequence, we gain quite a lot of new results, all without regressing on performance. I also conducted a very positive QA experiment, which confirms the increase in alerts, e.g. 10 % more `cleartext-logging` results, 25 % more `log-injection` results, and a staggering 100 % more `path-injection` results.
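To make the desugaring described under Data flow concrete, here is a minimal Rust sketch. The names `Holder`, `Payload`, `sugared`, and `desugared` are hypothetical and do not come from the PR. The comments map each intermediate value to one of the three synthetic nodes; how the CodeQL library actually represents those nodes is not shown here.

```rust
use std::ops::Deref;

// Hypothetical types: `Holder` derefs to `Payload`, which defines `m`.
struct Payload {
    data: String,
}

impl Payload {
    fn m(&self) -> &str {
        &self.data
    }
}

struct Holder {
    inner: Payload,
}

impl Deref for Holder {
    type Target = Payload;
    fn deref(&self) -> &Payload {
        &self.inner
    }
}

// The call as written in source: `m` is found via an implicit dereference.
fn sugared(x: Holder) -> String {
    x.m().to_string()
}

// The same call with the implicit steps spelled out.
fn desugared(x: Holder) -> String {
    // `&x`: corresponds to the reference store-step from `x` to `&x`.
    let borrowed: &Holder = &x;
    // `Deref::deref(&x)`: the implicit call to `deref`.
    let derefed: &Payload = Deref::deref(borrowed);
    // `*Deref::deref(&x)`: corresponds to the reference read-step.
    (*derefed).m().to_string()
}

fn main() {
    let a = Holder { inner: Payload { data: String::from("tainted") } };
    let b = Holder { inner: Payload { data: String::from("tainted") } };
    assert_eq!(sugared(a), desugared(b));
}
```

Both functions return the same value, which is the sense in which data flowing into `x` should also reach the result of `x.m()` through the implicit nodes.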