A platform developed for the Police Department that lets police gather publicly available data about an individual from different social media sites.
| Register Page | Login Page |
|---|---|
| ![]() | ![]() |

| Home Page | Data Fetched |
|---|---|
| ![]() | ![]() |
- **Phone Number Scraping**
  - Scrape data associated with a mobile number in India.
  - Get details of the SIM location with a proper address.
  - Receive data in JSON format.
  - Download data as a CSV file.
- **Screen Sharing**
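Before a number is looked up, the backend would typically normalize the user's input. The sketch below is a hypothetical normalizer (not taken from the repository) based on the stated scope of Indian mobile numbers: 10 digits, optionally prefixed with `+91` or `0`:

```python
# Hypothetical input normalization for the phone-number scraper.
# Assumption: Indian mobile numbers are 10 digits starting with 6-9,
# optionally prefixed with +91 or a leading 0.
import re
from typing import Optional

def normalize_indian_mobile(raw: str) -> Optional[str]:
    """Return the 10-digit national number, or None if invalid."""
    digits = re.sub(r"\D", "", raw)          # strip spaces, dashes, '+'
    if digits.startswith("91") and len(digits) == 12:
        digits = digits[2:]                   # drop the +91 country code
    elif digits.startswith("0") and len(digits) == 11:
        digits = digits[1:]                   # drop the trunk prefix
    if len(digits) == 10 and digits[0] in "6789":
        return digits
    return None

print(normalize_indian_mobile("+91 98765 43210"))  # -> 9876543210
```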
- **Tweets Toxicity Detection (X-factor / Flagship)**
  - Archive an individual's tweets and report toxicity details about them.
  - Toxicity is scored on 7 different parameters.
  - Uses a TensorFlow model to detect the toxicity category.
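Once the TensorFlow model emits its 7 scores, they still need to be paired with category names and summarized for the UI. A minimal post-processing sketch is shown below; the category names and the 0.5 flagging threshold are assumptions, not taken from the repository:

```python
# Sketch: summarizing the 7-parameter toxicity output of the model.
# CATEGORIES and the 0.5 threshold are assumed, not from the repo.
CATEGORIES = [
    "toxic", "severe_toxic", "obscene", "threat",
    "insult", "identity_hate", "sexual_explicit",
]

def summarize_toxicity(scores):
    """Pair each of the 7 scores with its category; flag those above 0.5."""
    labelled = dict(zip(CATEGORIES, scores))
    flagged = [name for name, s in labelled.items() if s > 0.5]
    return {"scores": labelled, "flagged": flagged}

result = summarize_toxicity([0.9, 0.1, 0.7, 0.0, 0.2, 0.05, 0.0])
print(result["flagged"])  # -> ['toxic', 'obscene']
```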
- **Twitter Scraper**
  - Search Twitter account details by providing a username.
  - Extract all the tweets of the individual.
  - Receive data in JSON format.
  - Download data as a CSV file.
- **LinkedIn Scraper**
  - Extract data from a user's LinkedIn profile.
  - Receive data in JSON format.
  - Download data as a CSV file.
- **Instagram Scraper**
  - Extract data from a user's Instagram profile.
  - Obtain the profile image, number of followers and following, and much more.
  - Receive data in JSON format.
  - Download data as a CSV file.
- **Facebook Scraper**
  - Extract data from a user's Facebook profile.
  - Obtain the profile image, number of followers and following, and much more.
  - Receive data in JSON format.
  - Download data as a CSV file.
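Each scraper offers the same JSON-in, CSV-out pair of formats. A sketch of how a JSON response could be flattened into a downloadable CSV is shown below; the record fields are illustrative, as the real payload depends on the scraper:

```python
# Sketch: converting a scraper's JSON response into CSV text.
# The field names in `sample` are illustrative, not the real payload.
import csv
import io
import json

def json_to_csv(json_text: str) -> str:
    """Flatten a list of flat JSON records into CSV text."""
    records = json.loads(json_text)
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

sample = json.dumps([
    {"username": "example_user", "followers": 120, "following": 80},
])
print(json_to_csv(sample))
```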
- **Front End / Client Side**
  - ReactJS
  - Bootstrap - CSS and other components
- **Backend Servers** (two backend servers are used to distribute the load)
  - **Flask Backend**
    - Facebook Scraper - takes a Facebook username or profile ID and scrapes data from the user profile by dismantling the site.
    - Twitter Scraper - takes a Twitter username and scrapes data from the user profile by dismantling the site.
  - **MongoDB Backend**
    - Phone Number Scraper - scrapes data associated with a mobile number.
    - Instagram Scraper - takes an Instagram username and scrapes data from the user profile by dismantling the site.
    - LinkedIn Scraper - takes a LinkedIn username and scrapes data from the user profile by dismantling the site.
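From the frontend's point of view, the two-backend split above amounts to a routing table: each scraper maps to one of the two local servers listed under "Run Locally". The sketch below assumes the Flask server is the one on port 5000 (Flask's default) and the second server is on 8000, and the endpoint paths are hypothetical:

```python
# Sketch of frontend-side routing across the two backend servers.
# Assumptions: Flask backend on :5000, the other backend on :8000,
# and the /<scraper>?q= endpoint shape - all hypothetical.
FLASK_BASE = "http://127.0.0.1:5000"  # Facebook & Twitter scrapers
OTHER_BASE = "http://127.0.0.1:8000"  # phone, Instagram & LinkedIn scrapers

ROUTES = {
    "facebook": FLASK_BASE,
    "twitter": FLASK_BASE,
    "phone": OTHER_BASE,
    "instagram": OTHER_BASE,
    "linkedin": OTHER_BASE,
}

def scraper_url(kind: str, query: str) -> str:
    """Build the request URL for a given scraper and query."""
    return f"{ROUTES[kind]}/{kind}?q={query}"

print(scraper_url("twitter", "someuser"))
```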
- **Data Management (Databases)**
  - MongoDB Atlas - data management and user details
- Install Git version control [ https://git-scm.com/ ]
- Install the latest version of Python [ https://www.python.org/downloads/ ]
- Install Pip (package manager) [ https://pip.pypa.io/en/stable/installing/ ]
- Install MongoDB Compass and connect it to localhost:27017 [ the Atlas connection is quite slow and may not work every time ]
Clone the project

```shell
git clone https://github.com/rajprem4214/ScrapperOP.git
```

Go to the project directory

```shell
cd ScraperOP
```

Backend Server:

Go to the backend folder

```shell
cd backend-python
```

Install virtualenv

```shell
pip install virtualenv
```

Create a virtual environment

```shell
virtualenv venv
```

Go to the venv folder and activate the virtual environment

```shell
cd venv
.\Scripts\activate.ps1
```

Go back to the backend folder and install the requirements from 'requirements.txt'

```shell
cd ..
pip install -r requirements.txt
```

Start the backend server

```shell
flask run
```

Start the other backend server: go to the FinalKSP folder, navigate to the Server folder, and run the following command

```shell
cd FinalKSP
cd Server
nodemon index.js
```

Frontend Server:

Go to the frontend folder and install all dependencies

```shell
cd instaSCRAP
npm install
```

Start the frontend server

```shell
npm run start
```

- Frontend is running on http://localhost:3000
- Backend is running on http://127.0.0.1:5000 and http://127.0.0.1:8000
- Reduced scraping time by distributing the server load across 2 servers.
- In toxicity detection, a sorted subset of tweets, rather than all of them, is passed to the model for faster detection.
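The "sorted subset" optimization above can be sketched as a simple pre-filter: sort the archived tweets and score only the top N. The sort key (newest first) and field names below are assumptions for illustration:

```python
# Sketch of the sorted-subset optimization: score only the `limit`
# most recent tweets instead of the full archive.
# The newest-first sort key and field names are assumptions.
def select_tweets_for_scoring(tweets, limit=50):
    """Sort tweets newest-first and keep only `limit` of them."""
    recent = sorted(tweets, key=lambda t: t["created_at"], reverse=True)
    return recent[:limit]

tweets = [
    {"id": 1, "created_at": "2022-01-01", "text": "a"},
    {"id": 2, "created_at": "2022-03-01", "text": "b"},
    {"id": 3, "created_at": "2022-02-01", "text": "c"},
]
print([t["id"] for t in select_tweets_for_scoring(tweets, limit=2)])  # -> [2, 3]
```

Scoring cost then grows with `limit` rather than with the size of the archive.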
- Increase API rate limits to extract data seamlessly.
- Use a cloud services architecture to improve the platform's performance.
- Prem Raj
- [Saishwar Anand]
- [Utsav Sinha]





