-
Notifications
You must be signed in to change notification settings - Fork 348
[ENG-5862] SPAM - Fix Wiki Spamming #11171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENG-5862] SPAM - Fix Wiki Spamming #11171
Conversation
The issue described in the ticket is not entirely accurate. I found the domain "xawk1.com" on staging, and content containing this domain will not be banned even if the project is public (here is an example). And on the other side, if you repeat the steps from the ticket using some other domains, everything works as expected. However, the spam check will always be triggered because the
Probably changes made by |
…eck_resource_for_domains_postcommit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antkryt I'm a little confused exactly what this accomplishes, could you write some test cases to illustrate how you are changing the current behavior. I see that the tests are changed, but I don't see new test cases. A good test case here should show the spam content or spam domain of a wiki is checked when a project is made public.
@Johnetordoff I'm not sure that it's possible to write a test case to illustrate what I'm changing here, so I'll try to explain better. Both As you can see, tasks don't know anything about each other and changes that are made (it's two different parallel processes). So, if Also, I've changed signature of the As for checking spam during privacy change tests, we have bunch of those (see |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not quite the behavior we are looking for, but correct techniques.
|
||
@mock.patch.object(settings, 'SPAM_SERVICES_ENABLED', True) | ||
@mock.patch('osf.external.spam.tasks._check_resource_for_domains') | ||
def test_check_resource_for_spam_postcommit_with_spammy_domains(self, mock_check_domains, project, user): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THe ticket instructions to reproduce the bug are:
- Create project
- Update wiki with spammy domain
- Make project public
These are the steps the tests should follow, you are testing behavior that is too specific we know checking for spam domans works, when a project is saved when public, but we need to check a project's wiki content too, not just it's desciption etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good new tests, thanks!
5af4c80
into
CenterForOpenScience:feature/pbs-25-13
…cience/osf.io into add-brand-to-collection-provider * 'feature/pbs-25-13' of https://github.com/CenterForOpenScience/osf.io: (25 commits) [ENG-8224] Fixed force archive template with registration addons (CenterForOpenScience#11210) API: Allow /v2/users/me/preprints list view to filter by title [ENG-8325] Public column does not display the visibility status of child nodes on the Nodes page in the Admin App [ENG-8246] Fixed deletion of maintenance alerts in admin (CenterForOpenScience#11226) add exception handling to /review_actions/ endpoint added a route to download node metadata (CenterForOpenScience#11215) [ENG-7929] Ability to move registrations to draft state (CenterForOpenScience#11153) switch to new UI when user views draft registration file (CenterForOpenScience#11144) [ENG-5862] SPAM - Fix Wiki Spamming (CenterForOpenScience#11171) improved displaying of stashed urls and approval state in admin (CenterForOpenScience#11193) [ENG-7962] Fix User Setting Response Payload async mailchimp perference change issues (CenterForOpenScience#11136) fixed children/parent fields in admin templates (CenterForOpenScience#11156) don't add multiple group perms for preprint provider (CenterForOpenScience#11159) fix content overflow for node page (CenterForOpenScience#11182) [ENG-8096] Admins on projects are unable to reject user access requests (CenterForOpenScience#11163) added retry to avoid race condition (CenterForOpenScience#11179) upgrade django to 4.2.17 (CenterForOpenScience#11173) add additional information to user admin (CenterForOpenScience#11184) [ENG-8192] Ability to force archive registrations when OSFS Folders have changed (CenterForOpenScience#11194) [ENG-8193] Fix issues with Preprint submission via API (CenterForOpenScience#11185) ...
…cience/osf.io into refactor-notifications * 'feature/pbs-25-13' of https://github.com/CenterForOpenScience/osf.io: (33 commits) added academiaInstitution in social-schema, fixed True value of 'ongoing', fixed/added tests (CenterForOpenScience#11239) [ENG-7979] Registrations pending moderation that have components also pending moderation do not display the children (CenterForOpenScience#11222) [ENG-8216] Fixed children deletion on a node page in admin (CenterForOpenScience#11237) [ENG-8401] Fixed preprint downloading (CenterForOpenScience#11238) fix categories for sendgrid emails (CenterForOpenScience#11236) [ENG-8936] API: Allow /v2/users/me/preprints list view to filter by tags (CenterForOpenScience#11232) Add collections scopes to FULL_READ and FULL_WRITE add brand relationship to collectionprovider [ENG-8224] Fixed force archive template with registration addons (CenterForOpenScience#11210) API: Allow /v2/users/me/preprints list view to filter by title [ENG-8325] Public column does not display the visibility status of child nodes on the Nodes page in the Admin App [ENG-8246] Fixed deletion of maintenance alerts in admin (CenterForOpenScience#11226) add exception handling to /review_actions/ endpoint added a route to download node metadata (CenterForOpenScience#11215) [ENG-7929] Ability to move registrations to draft state (CenterForOpenScience#11153) switch to new UI when user views draft registration file (CenterForOpenScience#11144) [ENG-5862] SPAM - Fix Wiki Spamming (CenterForOpenScience#11171) improved displaying of stashed urls and approval state in admin (CenterForOpenScience#11193) [ENG-7962] Fix User Setting Response Payload async mailchimp perference change issues (CenterForOpenScience#11136) fixed children/parent fields in admin templates (CenterForOpenScience#11156) ... # Conflicts: # tests/test_user_profile_view.py
…cience/osf.io into add-new-notifications-data-model * 'feature/pbs-25-13' of https://github.com/CenterForOpenScience/osf.io: (35 commits) [ENG-8462] Institution setup fixes (CenterForOpenScience#11241) [ENG-8401] Earlier preprint versions download the current file (CenterForOpenScience#11245) added academiaInstitution in social-schema, fixed True value of 'ongoing', fixed/added tests (CenterForOpenScience#11239) [ENG-7979] Registrations pending moderation that have components also pending moderation do not display the children (CenterForOpenScience#11222) [ENG-8216] Fixed children deletion on a node page in admin (CenterForOpenScience#11237) [ENG-8401] Fixed preprint downloading (CenterForOpenScience#11238) fix categories for sendgrid emails (CenterForOpenScience#11236) [ENG-8936] API: Allow /v2/users/me/preprints list view to filter by tags (CenterForOpenScience#11232) Add collections scopes to FULL_READ and FULL_WRITE add brand relationship to collectionprovider [ENG-8224] Fixed force archive template with registration addons (CenterForOpenScience#11210) API: Allow /v2/users/me/preprints list view to filter by title [ENG-8325] Public column does not display the visibility status of child nodes on the Nodes page in the Admin App [ENG-8246] Fixed deletion of maintenance alerts in admin (CenterForOpenScience#11226) add exception handling to /review_actions/ endpoint added a route to download node metadata (CenterForOpenScience#11215) [ENG-7929] Ability to move registrations to draft state (CenterForOpenScience#11153) switch to new UI when user views draft registration file (CenterForOpenScience#11144) [ENG-5862] SPAM - Fix Wiki Spamming (CenterForOpenScience#11171) improved displaying of stashed urls and approval state in admin (CenterForOpenScience#11193) ... # Conflicts: # admin/nodes/views.py # api_tests/users/views/test_user_settings_detail.py # framework/email/tasks.py # tests/framework_tests/test_email.py # tests/test_user_profile_view.py
* handle django core validation error in drf view * add unit tests * add registration_word for registration provider * [ENG-8290] Allow collection search POST with token scope (#11201) Purpose Add scope for POST collection search * [ENG-8247] Ability to delete draft preprints from database (#11191) ## Purpose User should be able to delete draft preprints, so that we don't save them in database ## Changes Added `delete` method. User can delete a preprint only if it's in initial state, so it means this preprint is a draft ## Ticket https://openscience.atlassian.net/browse/ENG-8247 * [ENG-8193] Fix issues with Preprint submission via API (#11185) ## Purpose handle django core ValidationError in drf view ## Ticket https://openscience.atlassian.net/browse/ENG-8193 * [ENG-8192] Ability to force archive registrations when OSFS Folders have changed (#11194) ## Purpose fix force archive for some actions ## Changes - divide revert_log_actions into separate functions - include trashed children when build a file tree - add additional retrieval logic for osf_storage_folder_created and osf_storage_file_removed - add generic retrieval fallback ## Ticket https://openscience.atlassian.net/browse/ENG-8192 * add additional information to user admin (#11184) ## Purpose add more information to user admin ## Changes Added fields: - id - Deactivation request - Change password last attempt - Change password invalid attempts - Socials - User is staff - Groups Added dates to: - Disabled - Registered - Confirmed ## Ticket https://openscience.atlassian.net/browse/ENG-8190 * upgrade django to 4.2.17 (#11173) ## Purpose Because Django 4.2.15 version has a vulnerability, it was upgraded to 4.2.17 ## Changes Updated pyproject.toml and lock file ## Ticket https://openscience.atlassian.net/browse/ENG-8176 * added retry to avoid race condition (#11179) ## Purpose User received an email of archive failure regardless of it was successful. It may happen because `archive_success` is run asynchronously https://github.com/CenterForOpenScience/osf.io/blob/2328dd60f55e9c1281dcb29dcd45a78a7fd2cc5f/website/archiver/listeners.py#L33-L49 and may be finished before the main thread finishes archiving process. So at first `archive_success` is processed and no files found, thus email is sent, then the main thread finishes files processing and this archiving is successful actually. Also it's possible that the celery queue had too many tasks to process and when the main thread finishes archiving, user sees his registration and when celery processes `archive_success` tasks that fails, user receives this email. ## Changes Added a one-time retry. ## Ticket https://openscience.atlassian.net/browse/ENG-8175?atlOrigin=eyJpIjoiMjg4MWM1YWI1ZTE3NDMyZmEyODk2Y2QxZjlhNjFlOGQiLCJwIjoiaiJ9 * [ENG-8096] Admins on projects are unable to reject user access requests (#11163) ## Purpose Add default value when reject access request ## Ticket https://openscience.atlassian.net/browse/ENG-8096 * fix content overflow for node page (#11182) ## Purpose fix content overflow on the node admin page ## Ticket https://openscience.atlassian.net/browse/ENG-8063 * don't add multiple group perms for preprint provider (#11159) ## Purpose don't add multiple group perms for preprint provider ## Changes - check if user already belongs to some provider group - change checkbox to radio select ## Ticket https://openscience.atlassian.net/browse/ENG-8016 * fixed children/parent fields in admin templates (#11156) ## Purpose Admin templates used nonexistent fields to display children and parent of a node (potentially new fields were added but the old fields weren't replaced by new ones). The templates used `parent` and `children` fields. However an endpoint that adds children and parent to a node uses fields `descendants` (or through `NodeRelation` using `get_nodes` method) and `parent_node` property through `_parents` field ## Changes Use correct fields ## Notes Deletion of children nodes that are displayed now is broken. Together with Mark decided to create a separate ticket for this issue ## Ticket https://openscience.atlassian.net/browse/ENG-7969 * [ENG-7962] Fix User Setting Response Payload async mailchimp perference change issues (#11136) ## Purpose Correct issues where PATCH responses wouldn't show preferences as updated due to async mailchimp task execution. Overrides instance model to return changed value, even though db value is unchanged until mailchimp confirms. ## Changes - adds special case for returning desired value synchronously - removes `update_osf_help_mails_subscription` because it's not used, but tested. - adds test cases ## Ticket https://openscience.atlassian.net/browse/ENG-7962 * improved displaying of stashed urls and approval state in admin (#11193) ## Purpose Admins feel discomfort while looking for tokens in stashed urls and approval state fields on a registration page in admin ## Changes Format fields in a json-like way ## Ticket https://openscience.atlassian.net/browse/ENG-6614?atlOrigin=eyJpIjoiMTdkMTM0YzBmNjljNDcxYTkxNjYzZDRjYTY3NjEyODMiLCJwIjoiaiJ9 * [ENG-5862] SPAM - Fix Wiki Spamming (#11171) <!-- Before submit your Pull Request, make sure you picked the right branch: - For hotfixes, select "master" as the target branch - For new features, select "develop" as the target branch - For release feature fixes, select the relevant release branch (release/X.Y.Z) as the target branch --> ## Purpose verify if spammy domains are detected ## Changes - merge check_resource_for_domains_postcommit and check_resource_with_spam_services tasks to avoid race condition - compare note to value not enum - log detected domains to sentry ## QA Notes You can test it with domain [xakw1.com](https://admin.staging3.osf.io/admin/osf/notabledomain/1180/change/?_changelist_filters=q%3Dxakw1.com) on staging3. Currently project won't be banned with this domain, regardless of whether it's public or not. ## Ticket https://openscience.atlassian.net/browse/ENG-5862 * switch to new UI when user views draft registration file (#11144) ## Purpose Throughout registration creation user can view attached files. However `resolve_guid` doesn't handle this case and renders the legacy UI that shows error because `get_rdf_type` raises `NotImplementedError`. So this case has no be handled by ember. However when we switch to the new UI, we get `draftnode is not a supported target type`. Also there will be some work for FE because for now ember relies on `node` relationship that draft node/registration don't have. Taking into account Futa's words, it's an unusual flow for ember ## Changes Added redirect to ember, updated `view_map` that referenced to the non-existing view `draft_nodes:node-detail` to use `draft_nodes:draft-node-detail` view ## Notes 1. I'm not confident that `draft-node` key is still used in `view_map`. From the `resolve` method of `TargetField` I see that keys are either model name in lowercase or `referent._name` that I couldn't find how is created (maybe automatically). However if it existed, there would be some errors that this view does not exist. So I left both keys but with the correct view name. In case you know the answer, I can remove the original key and leave only the correct one ## Ticket https://openscience.atlassian.net/jira/software/c/projects/ENG/boards/145?selectedIssue=ENG-5810 * [ENG-7929] Ability to move registrations to draft state (#11153) ## Purpose Admins should be able to revert registration to draft state for the support purposes ## Changes Added functionality to revert registration back to draft state Added a button in admin to revert a registration Added tests ## Notes: Currently we are discussing with Mark what to do with a registration after admin reverts it, so this ticket cannot be merged. The issue is that when admin reverts registration, it's still displayed for admin, just with deletion date and when user re-registers a draft, the system creates a new registration and the draft is linked to this newest version and admin sees both registrations. So if admin tries to revert the previous version, he'll get an error because the version doesn't have the linked draft. What I've suggested: 1. Fully delete previous registered version so that you won't be able to see it in admin. In this way if the registered version had guid 123ab and user registered the restored draft again, the draft would have another registration with another guid, like 999ss . Basically for user it'll be the same registration, just with another url to access it 2. Add another (or the same) hint for Revert to Draft button that you can't revert this previous version (the easiest solution) 3. One more solution, that I think will be the most complicated between these three, is that we can try to save the guid of an original registered version and whenever admin reverts registration and user registers it again from the same draft, we assign the saved guid to a new registration **DECISION**: Option 2 is the most reliable. In case we want #1 or #3 behavior, a new ticket will be created ## Ticket https://openscience.atlassian.net/browse/ENG-7929 * added a route to download node metadata (#11215) ## Purpose So right now, our firewall rules aren't advanced enough to be able to let requests from `/<guid>/metadata` go through to osf on staging4. Instead, we have to start with a stable string, and do the dynamic part afterward, hence doing `/metadata/<guid>/` which we can match with `/metadata/*` in the firewall. ## Changes Added a corresponding route ## Ticket https://openscience.atlassian.net/browse/ENG-8261 * add exception handling to /review_actions/ endpoint * [ENG-8246] Fixed deletion of maintenance alerts in admin (#11226) What When you try to delete the active maintenance banner from the admin app, it 502s. The traceback says that there is no url to redirect to and to provide a success url. Acceptance Criteria A user will be able to add a maintenance banner and then delete it. When the banner is added, it will show on the OSF dashboard page (among other places). When deleted, it will no longer show. * [ENG-8325] Public column does not display the visibility status of child nodes on the Nodes page in the Admin App Public column does not display the visibility status of child nodes on the Nodes page in the Admin App * API: Allow /v2/users/me/preprints list view to filter by title Allow the User Preprints endpoint (/v2/users/me/preprints) to be filterable by title (?filter[title]=example), similar to how the User Nodes endpoint (/v2/users/me/nodes) supports this filter capability * [ENG-8224] Fixed force archive template with registration addons (#11210) Enable Product Team to Force Archive Registrations in the Admin App * add brand relationship to collectionprovider * Add collections scopes to FULL_READ and FULL_WRITE * [ENG-8936] API: Allow /v2/users/me/preprints list view to filter by tags (#11232) * fix categories for sendgrid emails (#11236) * [ENG-8401] Fixed preprint downloading (#11238) * fixed preprint downloading * fixed nonetype * [ENG-8216] Fixed children deletion on a node page in admin (#11237) * field children deletion in admin * improved performance and removed unused attributes * [ENG-7979] Registrations pending moderation that have components also pending moderation do not display the children (#11222) * show only public nodes for non-authorized users in the node queryset * add custom filters to can_view(); include pending nodes for moderators in get_node_count() * add test * added academiaInstitution in social-schema, fixed True value of 'ongoing', fixed/added tests (#11239) ## Purpose V2 API doesn't allow setting `True` value for `ongoing` property in employment/education tabs and set `academiaInstitution` property in social tab ## Changes Added `academiaInstitution` field in social-schema Fixed ignored required properties for `ongoing` property in employment/education-schema files Added new tests and fixed the old ones ## QA Notes Can be tested only via API Updates can be viewed in user settings ## Ticket https://openscience.atlassian.net/browse/ENG-8455 * [ENG-8401] Earlier preprint versions download the current file (#11245) * fixed earlier version download the newest version file * fixed tests * [ENG-8462] Institution setup fixes (#11241) * create monthly reports when institution is created * better exception handling; async generate report * fix test * handle deactivated institutions; minor fixes * [Reopen] [ENG-8192] Ability to force archive registrations when OSF Folders have changed (#11247) * break force_archive into logical parts; support trashed children * add unit tests; minor fixes * fix name collisions issue and general fallback * add test * minor fixes --------- Co-authored-by: Anton Krytskyi <[email protected]> Co-authored-by: ihorsokhanexoft <[email protected]> Co-authored-by: John Tordoff <[email protected]> Co-authored-by: futa-ikeda <[email protected]>
Purpose
verify if spammy domains are detected
Changes
QA Notes
You can test it with domain xakw1.com on staging3. Currently project won't be banned with this domain, regardless of whether it's public or not.
Documentation
Side Effects
Ticket
https://openscience.atlassian.net/browse/ENG-5862