Skip to content

fix: optimize mongo query for user vote list#19

Open
jcapphelix wants to merge 3 commits intorelease-ulmofrom
cjoshi/fix-query
Open

fix: optimize mongo query for user vote list#19
jcapphelix wants to merge 3 commits intorelease-ulmofrom
cjoshi/fix-query

Conversation

@jcapphelix
Copy link
Collaborator

@jcapphelix jcapphelix commented Mar 16, 2026

Description

Removal of ENABLE_FORUM_V2 flag from edx-platform makes all discussion / forum calls route to this repo. This started giving errors in production with this PR.

It gave a rise to "Slow Queries" in production. (DD reference link)

Reason :- Forum repo calls for entire mongo DB to iterate and find list of upvoted list of ids :- (Forum repo call link)

Proof :-

We added monitoring in another forum branch :- changes in branch :- dd-query-count

Datadog dashboard link :- Forum dashboard

image

Traces :-

image

Stage Mongo DB SS :-

image

Prior to this optimization, the method called .get_list() without any query params, that means it will call entire mongo collection of contents in cs_comments_service mongo db. And then traverse all the results, look into each of them if current user upvoted that thread or not.

This PR optimizes that call. Instead of querying entire thing and traversing, what we are doing is passing course_id and passing vote method as up or down and also passing user_id thus, only getting the IDs that we are looking for.

Jira ticket

https://2u-internal.atlassian.net/browse/COSMO2-847

Relevant Slack Thread

https://twou.slack.com/archives/C048NH9K5BN/p1773415479461749

@jcapphelix jcapphelix changed the title Cjoshi/fix query fix: optimize mongo query for user vote list Mar 16, 2026
@jcapphelix jcapphelix marked this pull request as ready for review March 16, 2026 10:48
val["threads"],
val["responses"],
val["replies"],
val["username"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this the unrelated test change? I'm guessing this was a flaky test that you're fixing here. If so, can you ensure this ends up in the squash commit's message?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was unrelated test-case. Yes, will make sure to add it in squash commit message.

content_query: dict[str, Any] = {}
if course_id:
content_query["course_id"] = str(course_id)
content_query[f"votes.{vote}"] = {"$in": [user_id, str(user_id)]}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand why this is [user_id, str(user_id)]? I see that user_id is already a string so it looks like this would be equivalent to [user_id, userId].

My understanding of MongoDB query syntax is a little hazy, but it seems like we could simplify this:

Suggested change
content_query[f"votes.{vote}"] = {"$in": [user_id, str(user_id)]}
content_query[f"votes.{vote}"] = user_id

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check it in that manner.


contents = content_model.get_list(**content_query)
voted_ids = []
for content in contents:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can any of this be removed now that the query is more refined?

def get_user_voted_ids(cls, user_id: str, vote: str) -> list[str]:
def get_user_voted_ids(
cls, user_id: str, vote: str, course_id: Optional[str] = None
) -> list[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a plan or ticket to optimize this by including course ID in the query?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(To be clear, I'm not saying we have to do this. If 2U isn't using the MySQL backend, we could just pass this info along to someone who is.)

if vote not in ["up", "down"]:
raise ValueError("Invalid vote type")

content_model = Contents()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add test coverage for these changes? (Not sure if this repo is set up with a mock DB for unit tests.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants