The goal of this project is to analyze real-time questions from Stack Overflow and cluster them based on their title, body, and tags. The results are then displayed on dashboards.

Quezal17/StackOverflow_Analyzer

StackOverflow - Analyzer

Developed by Simone Torrisi, Computer Science student at the University of Catania

Project Goal

The goal of this project is to analyze real-time questions from Stack Overflow and cluster them based on the title, body, and tags associated with each question. The results are then displayed on dashboards.

More information is available in the docs, Kafka, and Spark directories.

Technologies used

Project Structure

How to execute the project

Downloads

  • Apache Kafka: download it from here and place the tgz file in the Kafka/Setup directory.
  • Apache Spark: download it from here and place the tgz file in the Spark/Setup directory.
In addition, Docker and Apache Maven must already be installed.
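
The prerequisites above can be checked before going further. The following is a minimal sketch, not part of the repository: the function names are hypothetical, mvn is assumed to be the Maven executable on the PATH, and the *.tgz pattern is an assumption because the exact archive names depend on the versions downloaded.

```shell
# Sketch of a prerequisite check; function names are hypothetical.

# Print the name of every given command that is not found on PATH.
missing_commands() (
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
)

# Succeed if the given directory contains at least one .tgz archive.
# The *.tgz pattern is an assumption; archive names vary by version.
has_tgz() {
  ls "$1"/*.tgz >/dev/null 2>&1
}

# Check Docker, Maven, and both archives; run from the main directory.
check_prerequisites() {
  status=0
  missing=$(missing_commands docker mvn)
  if [ -n "$missing" ]; then
    echo "Missing commands:" $missing >&2
    status=1
  fi
  for dir in Kafka/Setup Spark/Setup; do
    if ! has_tgz "$dir"; then
      echo "No .tgz archive found in $dir" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling check_prerequisites from the main directory returns a non-zero status and prints what is missing if any requirement is not met.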

Initial setup

To run the initial setup, execute the script initial-setup.sh from the main directory.

There are two options:

  • Using the bash command: bash initial-setup.sh
  • Making the script executable: chmod +x initial-setup.sh and then ./initial-setup.sh
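
Either option assumes the current directory is the main directory. As a small sketch of the first option, the hypothetical wrapper below (run_initial_setup is not part of the repository) checks that the script is present before invoking it with bash.

```shell
# Sketch: run initial-setup.sh with bash from a given directory.
# run_initial_setup is a hypothetical helper, not part of the project.
# The body runs in a subshell, so the caller's working directory is unchanged.
run_initial_setup() (
  cd "${1:-.}" || exit 1
  if [ ! -f initial-setup.sh ]; then
    echo "initial-setup.sh not found in $(pwd); run this from the main directory" >&2
    exit 1
  fi
  bash initial-setup.sh
)
```

Calling run_initial_setup with no argument uses the current directory; passing a path runs the script from that directory instead.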

Start project

Once the initial setup has completed, start the project by running the command docker-compose up from the main directory.
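
On some newer Docker installations Compose is provided as the docker compose plugin rather than as a separate docker-compose binary. The sketch below assumes that either may be present; the helper names compose_cmd and start_project are hypothetical, and the fallback to the plugin is an assumption that this README does not cover.

```shell
# Sketch: start the project with whichever Compose command is installed.
# compose_cmd and start_project are hypothetical helper names.

# Print the Compose command to use. The docker-compose binary named in this
# README is preferred; the "docker compose" fallback is an assumption.
compose_cmd() {
  if command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  elif docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  else
    echo "Docker Compose not found" >&2
    return 1
  fi
}

# Run "up" from the current directory (expected to be the main directory).
start_project() {
  cmd=$(compose_cmd) || return 1
  # $cmd is left unquoted on purpose so "docker compose" splits into two words.
  $cmd up
}
```

Running start_project from the main directory is then equivalent to docker-compose up on systems that have the standalone binary.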
