-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query: remove nav expansion as a way to tackle the pending selector problem #32957
Comments
Just to note that a main reason to solve the pending selector problem is performance: because we defer Select(), we end up duplicating the same SQL (e.g. subqueries) across multiple operators rather than having it just once in the query. It may be possible to keep nav expansion in its current form but to stop deferring selectors, but we know we already want to try to get rid of nav expansion as a separate phase anyway (for the above other reasons).
Examples of these include the inability to properly process ExecuteUpdate/Delete since they're relational (see #32493). Similarly, provider-specific LINQ operators such as DistinctBy (directly translatable on PostgreSQL) cannot be handled. Finally, it's worth noting that nav expansion was written in a perf-suboptimal way: in order to avoid all visitor state, it visits the tree multiple times, first wrapping everything in "state nodes", doing its job, and then unwrapping them back again to get a normal LINQ expression tree. This adversely affects query compilation performance, which we're starting to pay a bit more attention to.
I'm personally not really worried about this - I think it's a question of factoring the logic correctly into the translation phase. In fact, I believe that the splitting up of navigation handling into the separate pre-processing phase adds much more complexity than it saves, and doing it in the right way inside translation could potentially make it much simpler. Breaking a thing into two passes really isn't necessarily a great way to make that thing simpler.
IMHO we should not have any sort of model awareness in preprocessing - preprocessing really should be concerned with basic normalization and operations working only the LINQ expression tree shape; all model knowledge should happen at the translation phase only. This is because of provider extensibility, and also because tracking which node corresponds to which model thing (e.g. binding properties) is very non-trivial, and we shouldn't need to do it twice (both in preprocessing and translation). So at least in my ideal mental model, any processing that needs to be aware of the model should move to translation, just like nav expansion - compared to the complexity of nav expansion, I don't necessarily foresee a huge amount of complexity there. One motivating factor for having nav expansion in pre-processing, was that providers get this logic for free rather than needing to implement it (e.g. conversion of enumerable LINQ operators to queryable, query filter integration...). We should try to move this universal logic to core via other means during translation (e.g. do enumerable->queryable in QueryableRelationalExpressionVisitor), rather than as a preprocessing pass as we currently do. |
When this is done, see if #33621 is also solved |
Another problematic query tree mangling that nav expansion performs... Am trying to make the following test pass; the important part is ElementAt(0) over a list of Orders (structural types): public virtual Task ElementAt_over_owned_collection(bool async)
=> AssertQuery(
async,
ss => ss.Set<OwnedPerson>().Where(p => p.Orders.ElementAt(0).Id == -11)); p.Orders itself translates well, to a ShapedQueryExpression that wraps what I call a "bare array"; at this point of the translation, no operation (e.g. Where) has been applied to the array, which means that we can do various specialized translations. For example, the above should translate to But of course, nav expansion helpfully decides to move the .Id before the ElementAt() (or the ElementAt() after the Id?), so the fragment coming out of nav expansion is: Since Select() has been applied before ElementAt(), we no longer have a bare array. |
Another one... Given a simple SelectMany without a result selector: public virtual Task Column_collection_SelectMany(bool async)
=> AssertQuery(
async,
ss => ss.Set<PrimitiveCollectionsEntity>().SelectMany(c => c.Ints)); Nav expansion produces the following tree:
In other words, a result selector is added, projecting out to a TransparentIdentifier, only to then compose a Select() which then gets rid of the TransparentIdentifier. It's a transformation that produces a mathematically equivalent but more complex tree. The above can be pattern-matched and simplified back, but of course there's further interference. For example: public virtual Task SelectMany_without_result_selector_over_owned_collection(bool async)
=> AssertQuery(
async,
ss => ss.Set<OwnedPerson>().SelectMany(p => p.Orders).Where(o => o.Id > -30)); Gets transformed into:
i.e. the Select() is moved forward, making it impossible to simplify back. |
See #20291 for additional context.
Basically the idea is to retire nav expansion as a way of dealing with navigations in our query pipeline. Instead, the work would be done in translation phase. This is good because:
NavigationExpansionExtensibilityHelper
),On the flipside, translation phase is already very complex, and this will add even more complexity, UNLESS some things can be simplified.
Also, nav expansion currently is is performing other tasks, e.g. reasons about primitive collections - this is because it's the only place in pre-processing where we have access to the model. So all preprocessing needing model information is performed there.
We either need to keep some pre-processing based on model where all those tasks can be performed (and maybe make it extensible for providers?) or move some of that to translation. @roji
The text was updated successfully, but these errors were encountered: