Participatory Approaches to Building Datasets on Abuse #594

Open
tarunima opened this issue Jul 1, 2024 · 3 comments
Labels
enhancement New feature or request stale

Comments

@tarunima
Collaborator

tarunima commented Jul 1, 2024

Description:

Automated approaches to abuse detection rely on annotated datasets. At least at present, unsupervised machine learning alone cannot detect abuse across languages. To fill the gap in abuse detection datasets for Indian languages, Tattle started the Uli project specifically to create datasets on gendered abuse in Indian languages. The focus is also on taking a survivor-centered perspective on abuse. The datasets were created with people of marginalized genders who are at the receiving end of abuse. The first dataset, on abusive tweets, helped us develop a methodology for participatory datasets that we would now like to extend to more languages and modalities.

The Scope of This Task:

  1. Review literature on datasets for abuse detection in images, videos, and audio.
  2. Create a dataset of images from social media that can be annotated by the existing community of researchers, survivors, and activists.
  3. Expand the community of annotators.
  4. Conduct qualitative research to define abuse in multimodal datasets.
  5. Organize annotations.
  6. Release the dataset.

This ticket should be treated as a statement of intent for a multi-year project. If you're interested in collaborating on this project, please leave a comment.

tarunima added the enhancement label Jul 1, 2024

github-actions bot commented Aug 1, 2024

This issue is stale because it has been open for 30 days with no activity.

@callmesanfornow

Hi! Is this task still open to participants? I am interested in volunteering.

I research Online Hate Speech in low-resource settings. I have experience in curating datasets for gender-based stereotypes and I have worked on Multi-Modal Audio Abuse Detection in Low Resource Settings.

This issue is stale because it has been open for 30 days with no activity.

github-actions bot added the stale label Oct 16, 2024