Pull Request Overview
This PR adds AI-powered spam moderation to the forum platform. The implementation introduces a waffle flag-controlled feature that automatically classifies new posts and comments as spam using an external AI service, and flags detected spam content for moderation.
- Adds is_spam field to Comment and CommentThread models with database migration
- Implements AI moderation service that integrates with an external API for spam classification
- Creates audit logging infrastructure via ModerationAuditLog model to track all AI moderation decisions
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| forum/toggles.py | Adds new waffle flag namespace and toggle for AI moderation feature |
| forum/serializers/contents.py | Adds is_spam field to content serializer |
| forum/migrations/0005_comment_is_spam_commentthread_is_spam_and_more.py | Creates database migration for spam flag and audit log table |
| forum/backends/mysql/models.py | Adds is_spam field to Content model and creates ModerationAuditLog model |
| forum/backends/mysql/api.py | Implements MySQL backend methods for flagging/unflagging spam |
| forum/backends/mongodb/threads.py | Adds is_spam parameter to thread insert/update operations |
| forum/backends/mongodb/comments.py | Adds is_spam parameter to comment insert/update operations |
| forum/backends/mongodb/api.py | Implements MongoDB backend methods for flagging/unflagging spam |
| forum/api/threads.py | Integrates AI moderation into thread creation flow |
| forum/api/comments.py | Integrates AI moderation into comment creation flow |
| forum/ai_moderation.py | Core AI moderation service implementation with API integration |
| forum/admin.py | Adds admin interface for spam flag and moderation audit logs |
| forum/__init__.py | Version bump to 0.3.9 |
mraman-2U
left a comment
Test cases need to be added for the new changes.
forum/ai_moderation.py
Outdated
| self.system_message = getattr( | ||
| settings, "AI_MODERATION_SYSTEM_MESSAGE", self.DEFAULT_SYSTEM_MESSAGE | ||
| ) | ||
| self.timeout = getattr(settings, "AI_MODERATION_TIMEOUT", 30) # seconds |
There was a problem hiding this comment.
Get both the read timeout and the connection timeout from config, and use them in the POST request.
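A minimal sketch of what this could look like, assuming two hypothetical settings, `AI_MODERATION_CONNECT_TIMEOUT` and `AI_MODERATION_READ_TIMEOUT` (only `AI_MODERATION_TIMEOUT` appears in the diff), and a `SimpleNamespace` standing in for Django settings:

```python
from types import SimpleNamespace
from typing import Tuple

# Assumed fallbacks; the 30 s read value mirrors the fallback in the diff.
DEFAULT_CONNECT_TIMEOUT = 5.0  # seconds
DEFAULT_READ_TIMEOUT = 30.0  # seconds


def get_timeouts(settings) -> Tuple[float, float]:
    """Return (connect_timeout, read_timeout) read from a settings object."""
    # Both setting names are hypothetical, used for illustration only.
    connect = getattr(settings, "AI_MODERATION_CONNECT_TIMEOUT", DEFAULT_CONNECT_TIMEOUT)
    read = getattr(settings, "AI_MODERATION_READ_TIMEOUT", DEFAULT_READ_TIMEOUT)
    return (float(connect), float(read))


# Stand-in for django.conf.settings so the sketch runs on its own.
example_settings = SimpleNamespace(
    AI_MODERATION_CONNECT_TIMEOUT=3,
    AI_MODERATION_READ_TIMEOUT=20,
)
timeouts = get_timeouts(example_settings)

# requests accepts a (connect, read) tuple as its timeout argument, e.g.:
# requests.post(api_url, json=payload, headers=headers, timeout=timeouts)
```

Keeping the two values separate lets a slow connection fail fast while still giving the model time to respond.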
forum/ai_moderation.py
Outdated
| def __init__(self): # type: ignore[no-untyped-def] | ||
| """Initialize the AI moderation service.""" | ||
| self.api_url = getattr(settings, "AI_MODERATION_API_URL", self.DEFAULT_API_URL) |
There was a problem hiding this comment.
Can we use None as the default instead of test data? Also use None for the system message.
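One way this might look, as a sketch only: the setting names come from the diff, while `ModerationConfig` and `is_configured` are hypothetical names used for illustration.

```python
from types import SimpleNamespace
from typing import Optional


class ModerationConfig:
    """Reads moderation settings, defaulting to None rather than test values."""

    def __init__(self, settings) -> None:
        self.api_url: Optional[str] = getattr(settings, "AI_MODERATION_API_URL", None)
        self.system_message: Optional[str] = getattr(
            settings, "AI_MODERATION_SYSTEM_MESSAGE", None
        )

    def is_configured(self) -> bool:
        """Moderation is only usable when both values are set."""
        return bool(self.api_url) and bool(self.system_message)


# Stand-ins for django.conf.settings so the sketch runs on its own.
unconfigured = ModerationConfig(SimpleNamespace())
configured = ModerationConfig(
    SimpleNamespace(
        AI_MODERATION_API_URL="https://example.invalid/moderate",
        AI_MODERATION_SYSTEM_MESSAGE="Classify the post.",
    )
)
```

With None defaults, a missing setting shows up as "not configured" instead of silently calling a test endpoint.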
| } | ||
| payload = { | ||
| "messages": [{"role": "user", "content": content}], |
There was a problem hiding this comment.
For content volume or size validation: is there a maximum character limit enforced when input is submitted in the forums? If not, we have to validate against the maximum token count before calling the XPert API. The maximum token count should be configurable.
There was a problem hiding this comment.
Will take this up post-MVP; I have added the point to https://2u-internal.atlassian.net/browse/COSMO2-770
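For when this is picked up, a rough sketch of a configurable size check. It uses character count as a crude proxy for tokens (real token counts depend on the model's tokenizer), and `AI_MODERATION_MAX_CHARS` is a hypothetical setting name, not one from this PR.

```python
from types import SimpleNamespace

DEFAULT_MAX_CHARS = 10_000  # assumed fallback, not taken from the PR


def within_size_limit(content: str, settings) -> bool:
    """Return True when content is short enough to send for moderation."""
    # AI_MODERATION_MAX_CHARS is a hypothetical, configurable limit.
    max_chars = getattr(settings, "AI_MODERATION_MAX_CHARS", DEFAULT_MAX_CHARS)
    return len(content) <= max_chars


# Stand-in for django.conf.settings so the sketch runs on its own.
example_settings = SimpleNamespace(AI_MODERATION_MAX_CHARS=5)
short_ok = within_size_limit("hello", example_settings)
long_ok = within_size_limit("hello!", example_settings)
```

Whether oversized content should be truncated, skipped, or flagged for manual review is a separate decision this sketch does not make.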
forum/ai_moderation.py
Outdated
| def _make_api_request(self, content: str) -> Optional[Dict[str, Any]]: | ||
| """ | ||
| Make API request to AI moderation service. |
There was a problem hiding this comment.
Update this to "API request to XPert Service".
| classification = moderation_result.get("classification", "not_spam") | ||
| reasoning = moderation_result.get("reasoning", "No reasoning provided") | ||
| is_spam = classification in ["spam", "spam_or_scam"] |
There was a problem hiding this comment.
Ideally yes, but as of now we only have spam_or_scam, and I kept spam as another option.
When we add more classification types, we can make this an Enum. I'll add this as a note here: https://2u-internal.atlassian.net/browse/COSMO2-770
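A sketch of how the Enum could look once more types exist. The three values are the ones in the diff; `Classification`, `SPAM_CLASSIFICATIONS` and `is_spam_classification` are hypothetical names.

```python
from enum import Enum


class Classification(str, Enum):
    """Classification labels seen in the diff."""

    NOT_SPAM = "not_spam"
    SPAM = "spam"
    SPAM_OR_SCAM = "spam_or_scam"


# Labels that should cause content to be flagged.
SPAM_CLASSIFICATIONS = frozenset({Classification.SPAM, Classification.SPAM_OR_SCAM})


def is_spam_classification(raw: str) -> bool:
    """Return True when the raw API value maps to a spam classification."""
    try:
        return Classification(raw) in SPAM_CLASSIFICATIONS
    except ValueError:
        # Unknown labels are treated as not spam, mirroring the "not_spam" default.
        return False
```

Adding a new type then means one new Enum member and, if it should trigger flagging, one new entry in the set.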
| if not self.ai_moderation_user_id: | ||
| raise ValueError("AI_MODERATION_USER_ID setting is not configured.") | ||
| backend.flag_content_as_spam(content_type, content_id) | ||
| backend.flag_as_abuse( |
There was a problem hiding this comment.
If the AI_MODERATION_USER_ID config is missing, None is used as the user ID. Will the model accept None?
There was a problem hiding this comment.
No, it won't support that, hence the ValueError.
Posting of content will continue without any issue; moderation just won't happen if the user is not defined.
There was a problem hiding this comment.
Are we not storing None as the user ID in the DB?
There was a problem hiding this comment.
We could allow it, but since flagging abuse also needs a user, we raise a ValueError.
The post flow is allowed to continue, but the AI moderation flow is not.
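The behaviour described in this thread can be sketched roughly as below. `flag_content` and `create_post` are simplified, hypothetical stand-ins for the real backend and API calls; only the error message is taken from the diff.

```python
import logging
from typing import Callable, Dict, Optional

log = logging.getLogger(__name__)


def flag_content(user_id: Optional[str], flag: Callable[[str], None]) -> None:
    """Flag content on behalf of the moderation user, which must be configured."""
    if not user_id:
        raise ValueError("AI_MODERATION_USER_ID setting is not configured.")
    flag(user_id)


def create_post(body: str, user_id: Optional[str], flag: Callable[[str], None]) -> Dict[str, object]:
    """Create a post; moderation failures never block the post itself."""
    post: Dict[str, object] = {"body": body, "moderated": False}
    try:
        flag_content(user_id, flag)
        post["moderated"] = True
    except ValueError as exc:
        # Moderation is skipped, but the post flow carries on.
        log.warning("AI moderation skipped: %s", exc)
    return post


flagged_by = []
unmoderated = create_post("hello", None, flagged_by.append)
moderated = create_post("hello", "42", flagged_by.append)
```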
| return None | ||
| except ( | ||
| requests.RequestException, | ||
| requests.Timeout, |
There was a problem hiding this comment.
For timeouts, we may need to build retry logic. That will slow down the synchronous request, so moving to async with retry would serve better.
There was a problem hiding this comment.
Yes, that's a good point.
I've added connection and read timeouts for now.
Adding retry logic and moving the calls to async can be done post-MVP; the work is logged at https://2u-internal.atlassian.net/browse/COSMO2-770
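For the post-MVP work, a minimal retry helper with exponential backoff might look like this. `call_with_retry` is a hypothetical name, and the built-in `TimeoutError` stands in for `requests.Timeout` so the sketch runs without network access.

```python
import time
from typing import Callable, List, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call func, retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                return None  # give up; the caller treats None as "no verdict"
            sleep(base_delay * (2 ** attempt))
    return None


# Simulated flaky call: fails twice, then succeeds.
attempts_seen = {"count": 0}


def flaky_call() -> str:
    attempts_seen["count"] += 1
    if attempts_seen["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays: List[float] = []  # record delays instead of actually sleeping
result = call_with_retry(flaky_call, (TimeoutError,), sleep=delays.append)
```

In a synchronous request path the total backoff adds directly to response time, which is the reason for pairing retries with the move to async.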
| TODO:- | ||
| - Add content check for images | ||
| """ | ||
| return ai_moderation_service.moderate_and_flag_content( |
There was a problem hiding this comment.
Cache all flagged messages so that there is no need to hit the XPert API for duplicate messages.
There was a problem hiding this comment.
That's a good point; it will reduce our calls to the Xpert API and the associated overhead.
I have added this as a work item: https://2u-internal.atlassian.net/browse/COSMO2-770
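A rough sketch of the idea, using an in-memory set of content hashes. `FlaggedContentCache` is a hypothetical name, and a real deployment would more likely need a cache shared across processes.

```python
import hashlib
from typing import Callable, List, Set


class FlaggedContentCache:
    """Remembers content already flagged as spam, keyed by a content hash."""

    def __init__(self) -> None:
        self._flagged: Set[str] = set()

    @staticmethod
    def _key(content: str) -> str:
        # Normalise whitespace and case so trivial variations share one entry.
        normalised = " ".join(content.split()).lower()
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def is_spam(self, content: str, classify: Callable[[str], bool]) -> bool:
        """Return the cached verdict for known spam, else ask the classifier."""
        key = self._key(content)
        if key in self._flagged:
            return True
        spam = classify(content)
        if spam:
            self._flagged.add(key)
        return spam


# Fake classifier standing in for the external API call.
api_calls: List[str] = []


def fake_classify(text: str) -> bool:
    api_calls.append(text)
    return "buy now" in text.lower()


cache = FlaggedContentCache()
first = cache.is_spam("BUY NOW!!!", fake_classify)
second = cache.is_spam("buy   now!!!", fake_classify)
```

Only spam verdicts are cached here, matching the suggestion; caching clean verdicts as well would save more calls but risks missing content the model would later classify differently.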
| } | ||
| # Check if AI moderation is enabled | ||
| # pylint: disable=import-outside-toplevel | ||
| from forum.toggles import ( |
There was a problem hiding this comment.
Why isn't this imported at the top?
There was a problem hiding this comment.
Importing forum.toggles at the top would pull in the openedx imports in the toggles file, which would fail our test cases. Hence it is written here. This is fairly common for such imports, though not so common that we should be comfortable leaving it here.
This is a known issue, and we plan to move the toggle call elsewhere.
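To illustrate the deferred-import pattern in general terms, here is a sketch in which the module is only imported when the flag is actually checked. `flag_enabled` and the fake module are hypothetical, and falling back to False on ImportError is an assumption, not behaviour from this PR.

```python
import importlib
import sys
from types import ModuleType


def flag_enabled(module_name: str, attr: str) -> bool:
    """Import the toggles module only when called, then evaluate the flag."""
    try:
        toggles = importlib.import_module(module_name)
    except ImportError:
        # Platform dependencies are unavailable (e.g. in unit tests): treat as off.
        return False
    return bool(getattr(toggles, attr)())


# Register a fake toggles module so the sketch runs without the platform.
fake_toggles = ModuleType("fake_toggles")
fake_toggles.is_enabled = lambda: True
sys.modules["fake_toggles"] = fake_toggles

enabled = flag_enabled("fake_toggles", "is_enabled")
missing = flag_enabled("module_that_does_not_exist_xyz", "is_enabled")
```

Because nothing is imported at module load time, test suites that lack the platform packages can still import the calling module.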
| updated_at: models.DateTimeField[datetime, datetime] = models.DateTimeField( | ||
| auto_now=True | ||
| ) | ||
| is_spam: models.BooleanField[bool, bool] = models.BooleanField( |
There was a problem hiding this comment.
Is DB indexing required for this attribute? Assess this against the existing indexes.
Pull Request Overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.
Comments suppressed due to low confidence (1)
forum/ai_moderation.py:1
- [nitpick] Comment in toggles.py mentions 'discussions' but the implementation refers to 'forum' content. The terminology should be consistent with the codebase.
"""
Description
Spammers have been spamming the edX forums in partner courses. This PR adds an AI moderation feature to reduce the manual effort needed to remove spam.
As of now it is just an MVP.
Various further changes are planned for it.
Test Instructions
discussions.enable_ai_moderation
Related PRs and order of merging:-
Needs Migration command to be run
This PR adds new migration files, so the migration commands need to be run if they are not run automatically.
Verification
This should list 0.3.9 as forum version
ToDo Check list:
Known issues :-
Deadline :- ASAP
Jira Link
Apt GIF