[Repository] Limit/prevent spam issues/PRs #2258

@GuillaumeDua

Description

@GuillaumeDua

Motivation

  • Limit/constrain the number of new irrelevant issues and PRs
  • Thus avoid unnecessary notifications and emails received by the ~2k watchers of this repository
  • Yet ensure the solution won't be an impediment for legit contributors

Context

Roughly checking on:

Some recurring characteristics emerge:

  • The user account was just created
  • Short issue/PR name: most likely, a single word
    • contributor name, possibly shortened
    • generic ("Git", "practice", "projects", "Cpp", "ccc", "Hello World", "Hi", "Halo", etc.)
  • Short issue/PR body
  • The issue/PR is the first one of this user, ever.
  • There's neither previous nor further activity on the user's account.

And sometimes:

  • The contributor name is composed like <some_indian_name>_?\d+, but this cannot constitute a criterion in any way.
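
A minimal sketch of how the text-based characteristics above could be turned into checks. It is written in Python for readability only; an actual GitHub action would more likely run JavaScript. The word list reuses the generic titles quoted above, while the thresholds, flag names, and the function name `text_red_flags` are hypothetical.

```python
from typing import Optional

# Generic titles quoted in the list above, lower-cased for comparison.
GENERIC_TITLES = {"git", "practice", "projects", "cpp", "ccc", "hello world", "hi", "halo"}
# Hypothetical thresholds, to be tuned so legit contributors are not caught.
MAX_SHORT_TITLE_WORDS = 1
MIN_BODY_CHARS = 30


def text_red_flags(title: str, body: Optional[str]) -> list[str]:
    """Return the names of the text-based characteristics that match."""
    flags = []
    # Collapse whitespace and ignore case before comparing titles.
    normalized = " ".join(title.lower().split())
    if len(normalized.split()) <= MAX_SHORT_TITLE_WORDS:
        flags.append("short-title")
    if normalized in GENERIC_TITLES:
        flags.append("generic-title")
    # A missing body is treated like an empty one.
    if len((body or "").strip()) < MIN_BODY_CHARS:
        flags.append("short-body")
    return flags
```

The function only reports which characteristics match; what to do with them (and how many matches are too many) is a separate decision.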

Suggested solutions

Turn on moderation settings

Disable blank issue/PR template

From my perspective, a (bunch of) dedicated issue/PR template(s) with many non-empty fields is a good way to avoid unnecessary issues and/or PRs.

Add a dedicated anti-spam github-action

Such a github-action would perform several checks, as acceptance criteria.

Technically, it could rely on actions/github-script@v7 (and Octokit) to use the GitHub REST API.

The repo's example automatically adds a comment to PRs created by new contributors.

We can then implement other criteria, such as the ones listed in the Context section above.

What if the checks do not pass?

  • Either add a dedicated label suspicious to the issue or PR.
  • Or immediately reject the issue/PR by closing it,
    possibly adding an automated comment describing the reason, and pointing to the "how to contribute" guideline.
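
A rough sketch of the two outcomes described above, again in Python for illustration rather than as the action's real implementation. Combining them by count (label when few checks fail, close when many do) is an assumption made here, not something decided in this issue; the threshold, the `Outcome` type, and the comment wording are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SUSPICIOUS_LABEL = "suspicious"  # label name taken from the proposal above
CLOSE_THRESHOLD = 3              # hypothetical: failed checks needed before closing


@dataclass(frozen=True)
class Outcome:
    action: str                    # "accept", "label", or "close"
    label: Optional[str] = None    # label to add, if any
    comment: Optional[str] = None  # automated comment to post, if any


def decide_outcome(failed_checks: list[str], close_threshold: int = CLOSE_THRESHOLD) -> Outcome:
    """Map the list of failed check names to one of the outcomes."""
    if not failed_checks:
        return Outcome(action="accept")
    if len(failed_checks) < close_threshold:
        # Few failures: only flag the issue/PR so a human can look at it.
        return Outcome(action="label", label=SUSPICIOUS_LABEL)
    # Many failures: close, and explain why in an automated comment.
    reasons = ", ".join(failed_checks)
    comment = (
        f"Closed automatically because these checks did not pass: {reasons}. "
        "Please read the contribution guidelines before opening a new issue or PR."
    )
    return Outcome(action="close", comment=comment)
```

Keeping the decision separate from the API calls makes the acceptance criteria easy to review and test before anything is allowed to close a real issue.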

Additional thoughts

  • With either solution, spammers might keep finding new ways to meet the acceptance criteria and keep on spamming, over and over again?

Activity

jwakely

jwakely commented on Feb 22, 2025

@jwakely
Contributor

It would be great if we can stop nonsense like #2257, #2241, #2254, #2252, #2249, #2238, #2225 etc. etc. etc.

added a commit that references this issue on Feb 23, 2025
GuillaumeDua

GuillaumeDua commented on Feb 23, 2025

@GuillaumeDua
Author

I made some experiments, available with the draft PR #2259 (disclaimer, I'm not a JS developer).

From that point, I think we need to:

  • Consider using first the non-CI options mentioned above
  • Be really clear on what the acceptance criteria must be, so we don't impede legit users.

From my perspective, either suggested solution would prevent nonsense like #2255.

GuillaumeDua

GuillaumeDua commented on Aug 9, 2025

@GuillaumeDua
Author

Bump 😉, as with #2283.

Based on the authors' public information available on GitHub,
my theory is that this repo is mentioned in some course at universities located in India that deliver a master's degree in computer science:

  • Chitkara University, Punjab, India
  • Gauhati University, Guwahati, Assam, India

3 options then:

  • Consider non-CI options, as mentioned above in the Suggested solutions section 🙏
  • Consider a CI option, but we need some guideline
  • We can contact such issue authors to gather more info.
jwakely

jwakely commented on Aug 9, 2025

@jwakely
Contributor

We could also just contact the universities and tell them to educate their students properly so they don't create nonsense tickets here.

GuillaumeDua

GuillaumeDua commented on Aug 9, 2025

@GuillaumeDua
Author

Agreed. Who do you include in "We"? Shall I do it?

jwakely

jwakely commented on Aug 9, 2025

@jwakely
Contributor

If you want to, otherwise I can do it late next week

BenjamenMeyer

BenjamenMeyer commented on Aug 11, 2025

@BenjamenMeyer

We could also just contact the universities and tell them to educate their students properly so they don't create nonsense tickets here.

They'd have to educate the professors too. I think too many of these are universities in China/India where a professor tells them to contribute to open source so they go and create a PR/Issue on something popular to fulfill the assignment.

GuillaumeDua

GuillaumeDua commented on Aug 11, 2025

@GuillaumeDua
Author

@BenjamenMeyer Totally agree. However, it sounds like quite an unrealistic thing to do right now, which is why I created this issue, so we can dig into technical solutions.

@jwakely Sure thing, I'll do it and update this issue if I get any feedback.

GuillaumeDua

GuillaumeDua commented on Aug 12, 2025

@GuillaumeDua
Author

@jwakely Done.
Disclaimer: I used ChatGPT to produce a formal, professional email which focuses on education rather than blame, while making clear the problem and its impact.

The email was sent to 3 universities + their local variants.

  • Chitkara University
  • Graphic Era Hill University
  • Gauhati University

To whom it may concern,

I am writing on behalf of the C++ Core Guidelines project on GitHub, an active open-source initiative that welcomes constructive contributions from developers worldwide, including students.

Recently, our project has seen a significant increase in low-value or spam contributions, often in the form of pull requests or issues with empty or irrelevant titles and descriptions (e.g., “my first contribution to open-source”), or unrelated changes with no meaningful improvement to the project.

In many cases, we suspect such contributions to be created by students completing coursework assignments to “contribute to open source.” While we appreciate the intent to encourage students to engage with the open-source community, such submissions cause several problems:

  • They create noise that makes it harder for maintainers to address real issues and review genuine contributions.

  • They produce spam notification emails to about 2000 active watchers

  • They disrupt ongoing technical work and delay legitimate changes.

  • They risk harming the reputation of the student and the university in the global developer community.

We respectfully request your assistance in educating students and faculty on the importance of making meaningful, relevant, and respectful contributions to open-source projects. For example:

  • Before creating an issue or pull request, students should read the project’s contribution guidelines and code of conduct.

  • Contributions should address a real problem, bug, or enhancement relevant to the project.

  • Provide clear, detailed explanations in titles and descriptions.

  • For practice assignments, consider creating internal or sandbox repositories, rather than submitting to unrelated projects.

Open source thrives on collaboration, mutual respect, and purposeful work. By preparing students to participate meaningfully, we can ensure a positive experience for them and for the projects they engage with.

We appreciate your understanding and cooperation in this matter, and we welcome any questions or discussions on how to better guide students toward valuable open-source participation.

Sincerely,

GuillaumeDua

GuillaumeDua commented on Sep 23, 2025

@GuillaumeDua
Author

FYI, none of the universities has replied to the email so far ... 🤷

LegalizeAdulthood

LegalizeAdulthood commented on Sep 23, 2025

@LegalizeAdulthood

It seems they don't care, in which case IMO they should be blacklisted from submitting PRs

jwakely

jwakely commented on Sep 24, 2025

@jwakely
Contributor

It seems they don't care, in which case IMO they should be blacklisted from submitting PRs

How? The clowns opening dumb issues and PRs don't belong to some GitHub org, they're just users on this site. We don't have a list of their usernames.

BenjamenMeyer

BenjamenMeyer commented on Sep 24, 2025

@BenjamenMeyer

It seems they don't care, in which case IMO they should be blacklisted from submitting PRs

How? The clowns opening dumb issues and PRs don't belong to some GitHub org, they're just users on this site. We don't have a list of their usernames.

And further, their accounts tend to be new accounts (<30 days) with this being their first/only PR/issue (#2285). @GuillaumeDua's PR looks decent for checks that way.
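
As an illustration of the "new account, first contribution" check mentioned here, a small Python sketch (not the code from the PR). It assumes the account creation time is available as an ISO 8601 timestamp with a timezone, and that the caller already knows how many earlier issues/PRs the account has opened; both function names and the `prior_contributions` parameter are hypothetical.

```python
from datetime import datetime, timezone

NEW_ACCOUNT_DAYS = 30  # the "<30 days" figure mentioned above


def account_age_days(created_at: str, now: datetime) -> int:
    """Whole days between an ISO 8601 creation timestamp and `now`."""
    # A trailing "Z" is rewritten so older Python versions can parse it.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return (now - created).days


def looks_like_throwaway(created_at: str, prior_contributions: int, now: datetime) -> bool:
    """True when the account is new AND this is its first issue/PR."""
    return account_age_days(created_at, now) < NEW_ACCOUNT_DAYS and prior_contributions == 0


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(looks_like_throwaway("2020-01-01T00:00:00Z", 0, now))
```

Requiring both conditions together is a deliberate choice in this sketch: a new account alone would also catch legit first-time contributors.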

FYI, none of the university replied to the email so far ... 🤷

Perhaps another method would be to start publicizing the issue in various media (LinkedIn, etc.) to get wider attention, especially if we could reach some in academic circles such as IEEE or ACM; ACM seems to have a sub-council for member activities there.

GuillaumeDua

GuillaumeDua commented on Sep 24, 2025

@GuillaumeDua
Author

From my perspective, there are two approaches, which can be done in parallel:

  • As engineers, we might lean toward a technical solution, something like what I POCed in my PR. We still need to establish a clear set of rules though, making sure such a solution doesn't become an impediment for legit users.
    Also, as we're speaking of engineering students, there's always a risk that they'll find some technical workaround to whatever solution we put in place. I don't want this to become a game.

    BTW, did you try Settings -> Moderation options -> Temporary interaction limits -> Limit to existing users? Perhaps it's enough?

Image

As a side note, there are always worse issues than this one.

jwakely

jwakely commented on Sep 24, 2025

@jwakely
Contributor

Yeah, I've had about six of those coin scams this week; they create new ones as soon as GitHub deletes them.

Sqeaky

Sqeaky commented on Oct 7, 2025

@Sqeaky

Did Apna College, or a similar YouTube channel, use this repo in one of their example videos? I do not speak the language that the hosts and staff speak, perhaps Hindi. Do we have anyone who can reach out and ask, in their language, if they used this repo as an example? There is an outside chance they will be more responsive than universities, and if not, YouTube may have options.

For background Brodie Robertson described a possible source of these issues (20m video) for another repo, ExpressJS, with a similar spam problem, with people adding their names or gibberish to their Readme.md.

Apna College has 119 videos on C++ and plenty more on git. I will look through them and see if I can find any specifically actionable section and I will poke around looking for others who might have used this repo as an example. But that is days or weeks of videos, I can't promise I will find anything.

EDIT - To be clear, I am hoping that video creators will want to be helpful if the spam is originating from students learning from them. Newer videos from a lot of these creators redact parts of repo and org names, so I think they know that they might cause problems but may not have fixed older videos. Some videos even include demo repos set up just for students to experiment with.

jwakely

jwakely commented on Oct 7, 2025

@jwakely
Contributor

Well apparently the disreputable Apna College are well aware of the problem, and just delete critical comments on their video that is the source of the Express.js spam. So they clearly don't care and don't want to do anything to stop it.

It's probably a different but similar course that uses this repo for their example, but I suspect you're close to the truth.

LegalizeAdulthood

LegalizeAdulthood commented on Oct 7, 2025

@LegalizeAdulthood

So they clearly don't care and don't want to do anything to stop it.

This is why we can't have nice things.

Sqeaky

Sqeaky commented on Oct 8, 2025

@Sqeaky

I am glad you feel this isn't wasted effort. I will continue around my other tasks. Also considering how these videos are spread socially and how they are likely to be shared when courses spin up, it would explain the periodic ebb and flow of spam.

Comments probably aren't the best way to reach them. Those are frequently handled by staffers with little authority or by automated systems, particularly considering that English is a second language for them. Even if they want to be good digital citizens, the signal-to-noise ratio of comment sections makes serious discussion there impractical.

I found one example of Apna College using ExpressJS about 1:12 in this video: https://www.youtube.com/watch?v=Ez8F0nW6S-w and I have gone through their git videos mostly by skimming youtube timeline screenshots and looking for things that are recognizably Github. Now for the 119 dubiously autotranslated C++ videos. I suspect most won't touch git but I will go through them.

Sqeaky

Sqeaky commented on Oct 17, 2025

@Sqeaky

I have not reviewed every video, but I have reviewed at least a few from each section of their main C++ playlist, any referring to git, any referring to external tools, any with logos in the thumbnail, and any with visually distinct thumbnails.

There were some odd, perhaps IP-infringing, uses of Facebook's, Google's, Amazon's, and Visa's logos when referring to the companies' interview questions.

Lots of leetcode (data structure and algorithm lessons), a few Google Sheets, Google search, Mingw.org, and VsCode web pages including the VsCode marketplace, but nothing even slightly problematic on the main C++ playlist. (For reference: https://www.youtube.com/watch?v=0Cu9Kg7RJYg&list=PLfqMhTWNBTe0b2nM6JHVCnAkhQRGiZMSJ&index=240 ) They had 5 other videos with git in the title, and they had the expressjs thing, but they redacted other GitHub repos except where they provided them themselves.

Is there another educational group or something online that might be using us as an example? I will take no further action until we have another lead.

          [Repository] Limit/prevent spam issues/PRs · Issue #2258 · isocpp/CppCoreGuidelines