fix: optimize mongo query for user vote list #19
jcapphelix wants to merge 3 commits into release-ulmo from
Conversation
| val["threads"], | ||
| val["responses"], | ||
| val["replies"], | ||
| val["username"], |
Was this the unrelated test change? I'm guessing this was a flaky test that you're fixing here. If so, can you ensure this ends up in the squash commit's message?
Yes, this was an unrelated test case. I will make sure to add it to the squash commit message.
| content_query: dict[str, Any] = {} | ||
| if course_id: | ||
| content_query["course_id"] = str(course_id) | ||
| content_query[f"votes.{vote}"] = {"$in": [user_id, str(user_id)]} |
Can you help me understand why this is [user_id, str(user_id)]? I see that user_id is already a string, so it looks like this would be equivalent to [user_id, user_id].
My understanding of MongoDB query syntax is a little hazy, but it seems like we could simplify this:
| content_query[f"votes.{vote}"] = {"$in": [user_id, str(user_id)]} | |
| content_query[f"votes.{vote}"] = user_id |
Let me check it in that manner.
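For reference, here is a minimal pure-Python sketch of the matching semantics in question. It assumes standard MongoDB behavior: an equality match on an array field succeeds when any element equals the value, $in succeeds when any element equals any listed value, and a string never matches a number. The helper names and sample documents are hypothetical, and nothing here touches a real database or this repo's code.

```python
from typing import Any


def matches_equal(array_field: list[Any], value: Any) -> bool:
    """Emulate {"votes.up": value}: true if any array element equals value."""
    return any(element == value for element in array_field)


def matches_in(array_field: list[Any], candidates: list[Any]) -> bool:
    """Emulate {"votes.up": {"$in": candidates}}: true if any element equals any candidate."""
    return any(matches_equal(array_field, candidate) for candidate in candidates)


# Hypothetical stored vote arrays: one holds string IDs, the other integer IDs.
votes_as_strings = ["42", "7"]
votes_as_ints = [42, 7]

user_id = "42"  # already a str, as in the method signature

# With a str user_id, the $in form and the plain equality form agree on both arrays.
same_on_strings = matches_in(votes_as_strings, [user_id, str(user_id)]) == matches_equal(votes_as_strings, user_id)
same_on_ints = matches_in(votes_as_ints, [user_id, str(user_id)]) == matches_equal(votes_as_ints, user_id)

# Neither form matches integer IDs; that would need int(user_id) among the candidates.
matches_int_ids = matches_in(votes_as_ints, [user_id, int(user_id)])
```

Under these assumptions the two forms are interchangeable whenever user_id is a str. Whether any stored votes hold integer IDs is not stated in this PR, so the last line only shows what covering that case would take.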
|
|
||
| contents = content_model.get_list(**content_query) | ||
| voted_ids = [] | ||
| for content in contents: |
Can any of this be removed now that the query is more refined?
| def get_user_voted_ids(cls, user_id: str, vote: str) -> list[str]: | ||
| def get_user_voted_ids( | ||
| cls, user_id: str, vote: str, course_id: Optional[str] = None | ||
| ) -> list[str]: |
Is there a plan or ticket to optimize this by including course ID in the query?
(To be clear, I'm not saying we have to do this. If 2U isn't using the MySQL backend, we could just pass this info along to someone who is.)
| if vote not in ["up", "down"]: | ||
| raise ValueError("Invalid vote type") | ||
|
|
||
| content_model = Contents() |
Can we add test coverage for these changes? (Not sure if this repo is set up with a mock DB for unit tests.)
Description
Removing the ENABLE_FORUM_V2 flag from edx-platform routes all discussion/forum calls to this repo. This started giving errors in production with this PR. It gave rise to "Slow Queries" in production. (DD reference link)
Reason: The forum repo loads the entire Mongo collection and iterates over it to find the IDs of the content the user has upvoted. (Forum repo call link)
Proof:
We added monitoring in another forum branch. Changes are in branch dd-query-count.
Datadog dashboard link: Forum dashboard
Traces:
Stage Mongo DB SS:
Prior to this optimization, the method called .get_list() without any query params, which means it fetched the entire contents collection in the cs_comments_service Mongo DB. It then traversed all the results, checking each one to see whether the current user had upvoted that thread.
This PR optimizes that call. Instead of fetching everything and traversing it, we pass course_id, the vote type (up or down), and user_id in the query, so only the IDs we are looking for are returned.
Jira ticket
https://2u-internal.atlassian.net/browse/COSMO2-847
Relevant Slack Thread
https://twou.slack.com/archives/C048NH9K5BN/p1773415479461749
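The filter described in the Description can be sketched as a standalone function, assembled from the diff fragments quoted in the conversation. The name build_vote_query and the sample course ID are hypothetical. The sketch only builds the filter dict and does not call content_model.get_list(), so it is an illustration rather than the PR's actual code.

```python
from typing import Any, Optional


def build_vote_query(
    user_id: str, vote: str, course_id: Optional[str] = None
) -> dict[str, Any]:
    """Build a filter for content the given user has voted on (hypothetical helper)."""
    if vote not in ("up", "down"):
        raise ValueError("Invalid vote type")

    query: dict[str, Any] = {}
    if course_id:
        # Narrow the scan to one course when the caller provides it.
        query["course_id"] = str(course_id)
    # Match documents whose votes.up / votes.down array contains the user.
    query[f"votes.{vote}"] = {"$in": [user_id, str(user_id)]}
    return query


# Example with made-up values: filter for upvotes by user "42" in one course.
example_query = build_vote_query("42", "up", course_id="course-v1:Demo+X+2024")
```

Passing a filter like this to the database keeps the matching on the server side, so the application no longer has to load and traverse every document in the collection.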