feat: enhance MongoDB index checker with detailed sharding guidelines and common patterns #25
Open
mfori wants to merge 2 commits into
… and common patterns
mtrunkat
approved these changes
May 29, 2026
mvolfik
reviewed
May 29, 2026
Member
mvolfik
left a comment
I have no idea how to review this. Nothing here seems factually wrong, but I don't know what else is expected from a prompt review.
Member
Author
I think that's enough. We'll see how it performs and can modify it later. You're on the review mainly so that you're aware of this :)
valekjo
approved these changes
Jun 3, 2026
Member
valekjo
left a comment
Looks good, some tiny suggestions, and I'm not even very sure about those :D
| 1. **Multi-shard collections** — the same collection's data is split across multiple physical shards via a shard key. Identify these by reading `src/packages/mongo-connection/src/mongo_connection.ts`: any field typed as `ShardAwareCollection<TSchema, TShardKeys>` is multi-shard. The second generic argument lists the shard-key fields. | ||
| 2. **Single-shard placement** — the collection lives on a non-default physical shard (no shard key, no chunks across shards, just placed on a different machine). In `mongo_connection.ts` these fields are typed `MovedCollection<TSchema, 'shard-N'>` (the shard tag is the second generic argument); the authoritative placement map is `SHARD_PLACEMENT` in `src/packages/mongo-connection/src/shard_placement.ts`. Collections absent from that map live on the default shard (`shard-0`). | ||
| Read those two files to determine each collection's sharding kind. The prompt deliberately doesn't list specific collections — the set evolves over time, and the source files are authoritative. |
Member
Nit: This seems extra.
Suggested change
| Read those two files to determine each collection's sharding kind. The prompt deliberately doesn't list specific collections — the set evolves over time, and the source files are authoritative. | |
| Read those two files to determine each collection's sharding kind. The set evolves over time, and the source files are authoritative. |
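To make the two sharding kinds in this hunk concrete, here is a minimal sketch of the lookup the prompt asks for. Everything in it is an assumption for illustration: the two maps are hypothetical stand-ins for what `mongo_connection.ts` and `SHARD_PLACEMENT` declare, and the collection names and the `shardingKind` helper are invented, not taken from the repository.

```typescript
// Hypothetical result type: either split across shards by a shard key,
// or placed whole on one physical shard.
type ShardingKind =
    | { kind: 'multi-shard'; shardKeys: readonly string[] }
    | { kind: 'single-shard'; shard: string };

// Stand-in for fields typed ShardAwareCollection<TSchema, TShardKeys>
// (assumed example entry, not a real collection).
const MULTI_SHARD_KEYS = new Map<string, readonly string[]>([
    ['exampleSharded', ['userId']],
]);

// Stand-in for SHARD_PLACEMENT (assumed example entry, not a real collection).
const SHARD_PLACEMENT = new Map<string, string>([
    ['exampleMoved', 'shard-1'],
]);

function shardingKind(collectionName: string): ShardingKind {
    const shardKeys = MULTI_SHARD_KEYS.get(collectionName);
    if (shardKeys !== undefined) {
        return { kind: 'multi-shard', shardKeys };
    }
    // Collections absent from the placement map live on the default shard.
    const shard = SHARD_PLACEMENT.get(collectionName) ?? 'shard-0';
    return { kind: 'single-shard', shard };
}
```

The point of the sketch is only the order of the checks: multi-shard first, then explicit placement, then the `shard-0` default.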
| Flag only when ALL of the following hold: | ||
| - The collection is multi-shard (per `mongo_connection.ts`). | ||
| - At least one shard-key field is *entirely absent* from the filter object — not just `undefined`, *missing*. |
Member
Question: Isn't this already covered by the types?
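For reference, the "absent, not just `undefined`" condition from the rule can be sketched as below. This is a hedged illustration, not the checker's implementation: `missingShardKeys` is a hypothetical helper, and it only inspects top-level filter keys, so shard keys nested inside `$and` / `$or` would need extra handling.

```typescript
// Returns the shard-key fields that do not appear in the filter at all.
// A key that is present with the value `undefined` counts as present.
function missingShardKeys(
    filter: Record<string, unknown>,
    shardKeys: readonly string[],
): string[] {
    return shardKeys.filter(
        (key) => !Object.prototype.hasOwnProperty.call(filter, key),
    );
}

// Present but undefined: nothing is reported as missing.
const presentButUndefined = missingShardKeys({ userId: undefined }, ['userId']);

// Entirely absent: the shard key is reported.
const entirelyAbsent = missingShardKeys({ status: 'READY' }, ['userId']);
```

Whether the types already rule out the second case depends on how the filter type is declared in `ShardAwareCollection`, which this sketch does not assume.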
| Flag when these run on a multi-shard collection without `readConcern: 'available'`: | ||
| - `.skip(N)` / `$skip: N`. Suggest cursor pagination on the sort key as the better alternative; `readConcern: 'available'` is a fallback. 🟠 high. | ||
| - `.countDocuments(...)` on `.rawCollection` or on the underlying `Collection`. Note: `ShardAwareCollection.approximateCountDocuments()` already wraps `readConcern: 'available'` — **do not flag it**. 🟠 high. | ||
| - `aggregate` ending in `$count` or with wide `$group` over many keys. 🟠 high. |
Member
Nit: I'm not sure that "wide $group" is specific enough. Maybe "low cardinality"?
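As an illustration of the "cursor pagination on the sort key" alternative to `.skip(N)` mentioned in the first bullet, here is a minimal sketch that builds the query parts for the next page. The `nextPageQuery` helper and `PageQuery` shape are hypothetical; the sketch assumes an ascending sort on a unique field (such as `_id`) that the base filter does not already constrain.

```typescript
interface PageQuery {
    filter: Record<string, unknown>;
    sort: Record<string, 1 | -1>;
    limit: number;
}

// Builds a query that resumes after the last value seen on the sort key,
// instead of skipping N documents on every page.
function nextPageQuery(
    baseFilter: Record<string, unknown>,
    sortKey: string,
    lastSeen: unknown,
    pageSize: number,
): PageQuery {
    const filter: Record<string, unknown> = { ...baseFilter };
    if (lastSeen !== undefined) {
        // Only documents strictly after the previous page's last item.
        filter[sortKey] = { $gt: lastSeen };
    }
    const sort: Record<string, 1 | -1> = {};
    sort[sortKey] = 1;
    return { filter, sort, limit: pageSize };
}
```

With the Node.js driver, the result would typically be passed along the lines of `collection.find(q.filter).sort(q.sort).limit(q.limit)`. Keeping the shard key in `baseFilter` is what lets the query stay targeted.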
There are false positives in the evaluation of the sharding rules, for example here: https://github.com/apify/apify-core/pull/28206#pullrequestreview-4372775926
This should hopefully perform better.
Note: waiting for https://github.com/apify/apify-core/pull/28238