Automated approaches to abuse detection rely on annotated datasets. At least at present, unsupervised machine learning alone cannot detect abuse across languages. To fill the gap in abuse detection datasets for Indian languages, Tattle started the Uli project to create datasets on gendered abuse in Indian languages. The project also takes a survivor-centered perspective on abuse: the datasets were created with people of marginalized genders who are at the receiving end of abuse. The first dataset, on abusive tweets, helped us develop a methodology for participatory datasets that we would now like to extend to more languages and modalities.
The Scope of This Task:
Review the literature on datasets for abuse detection in images, video, and audio.
Create a dataset of images from social media that can be annotated by the existing community of researchers, survivors, and activists.
Expand the community of annotators.
Conduct qualitative research to define abuse in multimodal datasets.
Organize annotations.
Release the dataset.
This ticket should be treated as a statement of intent for a multi-year project. If you're interested in collaborating on this project, please leave a comment.
Hi! Is this task still open to participants? I am interested in volunteering.
I research online hate speech in low-resource settings. I have experience curating datasets for gender-based stereotypes, and I have worked on multimodal audio abuse detection in low-resource settings.