Social Media Bot Detection

This project implements a machine learning pipeline that classifies tweets as bot-generated or human-generated using tweet-level linguistic and structural features.

Problem Statement

Automated bot accounts spread spam and misinformation on social media platforms. The goal of this project is to detect bot-generated tweets with machine learning, without relying on temporal metadata.

Dataset

Source: Public bot detection dataset
Each row represents a single tweet
Labels:
- 0 → Human
- 1 → Bot

Approach

Performed exploratory data analysis to understand class distribution and feature variance
Cleaned tweet text by removing noise such as URLs, mentions, and hashtags
Engineered tweet-level and account-level features, including:
- Tweet length
- Hashtag count
- Follower-based metrics
- Sentiment polarity
Trained an XGBoost classifier and evaluated performance using ROC-AUC and F1-score
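
The cleaning and feature-engineering steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names, the regular expressions, and the `follower_ratio` definition (followers divided by following plus one) are assumptions chosen for the example. Sentiment polarity uses TextBlob when it is installed and falls back to 0.0 otherwise.

```python
import re

# Patterns for the noise the README says is removed (assumed definitions).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def clean_tweet(text: str) -> str:
    """Remove URLs, mentions, and hashtags, then collapse whitespace."""
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


def sentiment_polarity(text: str) -> float:
    """Return TextBlob polarity in [-1, 1]; 0.0 if TextBlob is unavailable."""
    try:
        from textblob import TextBlob
    except ImportError:
        return 0.0  # neutral fallback so the sketch runs without TextBlob
    return float(TextBlob(text).sentiment.polarity)


def extract_features(text: str, followers: int = 0, following: int = 0) -> dict:
    """Build one feature row for a tweet (hypothetical feature names)."""
    cleaned = clean_tweet(text)
    return {
        # Length of the cleaned text, in characters.
        "tweet_length": len(cleaned),
        # Hashtags are counted on the raw text, before they are stripped.
        "hashtag_count": len(HASHTAG_RE.findall(text)),
        # One possible follower-based metric; +1 avoids division by zero.
        "follower_ratio": followers / (following + 1),
        "sentiment_polarity": sentiment_polarity(cleaned),
    }


if __name__ == "__main__":
    example = "Check this out https://t.co/abc #free #win @user now"
    print(extract_features(example, followers=10, following=4))
```

Counting hashtags before cleaning keeps the structural signal even though the hashtags themselves are dropped from the text used for length and sentiment.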

Technologies Used

Python
Pandas, NumPy
Scikit-learn
XGBoost
TextBlob
Matplotlib

Results

The results suggest that combining text-based features with numerical metadata is effective for bot detection.

LaneAsade/Social_Media_Bot_Detection_System
