perf: setQueryData O(n²) performance bottleneck and optimize replaceEqualDeep for large datasets #9392
Conversation
https://github.com/TanStack/query/blob/main/CONTRIBUTING.md#commit-message-format IMO, the docs above can help you name the PR title!
@@ -250,11 +250,28 @@ export function partialMatchKey(a: any, b: any): boolean {
 * If not, it will replace any deeply equal children of `b` with those of `a`.
 * This can be used for structural sharing between JSON values for example.
 */
+// WeakMap for memoization to avoid processing the same objects multiple times
+const replaceCache = new WeakMap<object, WeakMap<object, any>>();
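To make the proposal concrete, here is a minimal sketch of how a two-level WeakMap memoization wrapper could work. The `memoizedReplace` helper and its shape are invented for illustration; this is not the actual code from the PR, only an assumption about how such a cache would be consulted.

```typescript
// Sketch (assumed design, not the PR's actual code): memoize a deep-replace
// function by the identity of the (a, b) object pair using nested WeakMaps.
const pairCache = new WeakMap<object, WeakMap<object, unknown>>();

function memoizedReplace(
  a: unknown,
  b: unknown,
  replace: (a: unknown, b: unknown) => unknown,
): unknown {
  // Only objects can be WeakMap keys; primitives skip the cache entirely.
  if (typeof a !== 'object' || a === null || typeof b !== 'object' || b === null) {
    return replace(a, b);
  }
  let inner = pairCache.get(a);
  if (!inner) {
    inner = new WeakMap();
    pairCache.set(a, inner);
  }
  if (inner.has(b)) {
    // Cache hit: this exact (a, b) pair was compared before.
    return inner.get(b);
  }
  const result = replace(a, b);
  inner.set(b, result);
  return result;
}
```

Note that this only pays off when the same two object references are compared more than once, which is exactly the assumption the review discussion below questions.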
This might be a misguided point due to my limited understanding, but I have some concerns about using replaceCache as you suggested:

1. I understand that replaceEqualDeep is a utility function that returns a copy of `b`, but returns `a` if `a` and `b` are deeply equal.
2. From what I know, replaceEqualDeep is used when replacing existing data in a query with new data, so it would be rare for existing data to change to different values multiple times. For example, if existing data changes from A to B as a result of queryFn, in most cases the reference to the previous A would be lost, making caching meaningless.
3. As mentioned in point 1, replaceEqualDeep replaces existing data in a query with new data. Even when fetching returns values with the same structure, the two values being compared would usually have different references, so it seems rare for `get(b)` on the WeakMap returned from `replaceCache.get(a)` to ever hit.
4. I believe utility functions should maintain purity. If multiple QueryClients are running, replaceCache operations from one QueryClient might affect the behavior of another (though, with the current implementation, this doesn't seem to cause major issues). So rather than caching inside a utility function, I think it would be better to create a separate replaceCache per QueryClient or QueryObserver.

If my understanding is incorrect or if you have different thoughts, please let me know your feedback :)
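The replaceEqualDeep behavior described above (return `a` when deeply equal, otherwise a copy of `b` that reuses equal children of `a`) can be sketched as a simplified reimplementation. This is illustrative only, assuming plain JSON-like values; it is not the library's actual code.

```typescript
// Simplified sketch of replaceEqualDeep semantics (illustrative, not the
// library's implementation): structural sharing between `a` and `b`.
function replaceEqualDeepSketch(a: any, b: any): any {
  if (a === b) return a;
  const bothArrays = Array.isArray(a) && Array.isArray(b);
  const bothObjects =
    !bothArrays &&
    typeof a === 'object' && a !== null && !Array.isArray(a) &&
    typeof b === 'object' && b !== null && !Array.isArray(b);
  if (bothArrays || bothObjects) {
    const bKeys = Object.keys(b);
    const copy: any = bothArrays ? [] : {};
    let equalChildren = 0;
    for (const key of bKeys) {
      // Recurse: each child of the copy reuses `a`'s child when deeply equal.
      copy[key] = replaceEqualDeepSketch(a[key], b[key]);
      if (copy[key] === a[key]) equalChildren++;
    }
    // Every child reused and no keys added/removed: `a` and `b` are deeply
    // equal, so keep the old reference.
    const aKeys = Object.keys(a);
    return aKeys.length === bKeys.length && equalChildren === bKeys.length
      ? a
      : copy;
  }
  return b;
}
```

This is why observers that do `result === previousResult` checks can skip re-renders: unchanged subtrees keep their old references.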
Thanks for the thoughtful feedback! You’re absolutely right about the purity concerns and global state issues.
However, I believe there’s still value here based on real usage patterns - optimistic updates, form synchronization, and incremental data changes are extremely common and would benefit from this optimization. The performance impact in issue #1633 (500ms → 50ms) demonstrates real user pain points.
Let me revise the approach: remove the global WeakMap and implement QueryClient-scoped memoization as an opt-in feature. This addresses your architectural concerns while still providing performance benefits for users who need them.
What do you think about this compromise?
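A possible shape for that compromise is sketched below. Everything here is hypothetical: the `memoizeStructuralSharing` option and the `HypotheticalQueryClient` class are invented names for illustration and do not exist in TanStack Query.

```typescript
// Hypothetical sketch of client-scoped, opt-in memoization. The option name
// `memoizeStructuralSharing` is invented; this is not a real library API.
class ScopedReplaceCache {
  private cache = new WeakMap<object, WeakMap<object, unknown>>();

  lookup(a: object, b: object): unknown | undefined {
    return this.cache.get(a)?.get(b);
  }

  store(a: object, b: object, result: unknown): void {
    let inner = this.cache.get(a);
    if (!inner) {
      inner = new WeakMap();
      this.cache.set(a, inner);
    }
    inner.set(b, result);
  }
}

// Each client owns its own cache, so memoization in one client can never
// leak into or affect another client's results.
class HypotheticalQueryClient {
  readonly replaceCache?: ScopedReplaceCache;

  constructor(options: { memoizeStructuralSharing?: boolean } = {}) {
    if (options.memoizeStructuralSharing) {
      this.replaceCache = new ScopedReplaceCache();
    }
  }
}
```

Scoping the cache to the client instance addresses the purity concern, and WeakMap keys keep entries collectable once the data they refer to is discarded.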
I think the issue you mentioned has already been resolved; setQueryData already uses get instead of find.
(PR that removed find from setQueryData)
If test cases were included, we could verify that the code in this PR actually achieves the claimed performance improvements.
Also, I don't think the WeakMap-based cache would work even with optimistic updates, form synchronization, and incremental data changes.
As I mentioned in my previous comment, it's hard to imagine fetching an object with the same reference value as a discarded object that has already been updated.
(Though I suppose it might be meaningful when applying the same reference-valued object created on the client to replaceEqualDeep multiple times)
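The reference-identity point above can be shown in a few lines: a fetched payload is re-parsed on every response, so even byte-identical payloads produce fresh object references, and an identity-keyed WeakMap cache never gets a second hit.

```typescript
// Sketch: structurally identical fetch results have distinct references,
// so a WeakMap keyed on object identity cannot produce cache hits for them.
const cache = new WeakMap<object, unknown>();

const payload = '{"id":1,"name":"item"}';
const first = JSON.parse(payload);  // new object reference
const second = JSON.parse(payload); // structurally equal, different reference

cache.set(first, 'memoized result');
const hit = cache.get(second); // undefined: identity differs
```

Only client-created objects passed to replaceEqualDeep repeatedly by reference (the parenthetical case above) could ever hit such a cache.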
@ss0526100 You're absolutely right - the core performance issues were already resolved in PR #6918, and my proposed WeakMap optimizations would add complexity without meaningful real-world benefits given typical object reference patterns.
Optimized setQueryData performance by eliminating O(n) cache lookups and improving the replaceEqualDeep algorithm for large datasets. (Fixes #6783, #1633)

Scope of changes:
- QueryCache optimization
- replaceEqualDeep algorithm improvements
- Memory management
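The "O(n) cache lookup" claim refers to the difference between scanning all queries and indexing them by hashed key. A sketch with assumed shapes (`SlowCache`/`FastCache` are invented names, not the library's internals):

```typescript
// Sketch of the lookup difference: a linear `find` over every stored query
// is O(n) per setQueryData call, while a Map keyed by the hashed query key
// resolves in O(1). Shapes here are assumptions for illustration.
interface QueryEntry {
  queryHash: string;
  data: unknown;
}

class SlowCache {
  queries: QueryEntry[] = [];
  find(queryHash: string): QueryEntry | undefined {
    // Scans the whole array: O(n) per lookup.
    return this.queries.find((q) => q.queryHash === queryHash);
  }
}

class FastCache {
  queriesMap = new Map<string, QueryEntry>();
  get(queryHash: string): QueryEntry | undefined {
    // Hash lookup: O(1) per lookup.
    return this.queriesMap.get(queryHash);
  }
}
```

With thousands of cached queries and frequent setQueryData calls, replacing the linear scan with the keyed lookup is what removes the quadratic blow-up described in the PR title.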