Frequent Pattern Mining Implementation

Project Overview

This project implements the Apriori algorithm for frequent itemset mining on Twitter data related to flu shots. The implementation analyzes keyword co-occurrence patterns in tweets to discover meaningful associations between terms.
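
For readers unfamiliar with Apriori, the level-wise search described above can be sketched as follows. This is a minimal illustration and not the code in this repository: the function name `apriori`, its parameters, and the sample tweets are assumptions made for the example. It counts single keywords first, then repeatedly joins frequent (k-1)-itemsets into k-item candidates, prunes any candidate that has an infrequent subset, and counts the survivors.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep those meeting the threshold.
    item_counts = Counter(item for t in transactions for item in t)
    current = {
        frozenset([item]): n for item, n in item_counts.items() if n >= min_support
    }
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {
            a | b for a, b in combinations(list(current), 2) if len(a | b) == k
        }
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Made-up sample data, not taken from the flu shot dataset.
tweets = [
    ["flu", "shot", "vaccine"],
    ["flu", "shot"],
    ["flu", "vaccine"],
    ["shot", "clinic"],
]
result = apriori(tweets, min_support=2)
```

With a minimum support count of 2, the pair `{shot, vaccine}` occurs only once and is dropped, so the three-keyword candidate is pruned without being counted. That pruning is what keeps the candidate space manageable on larger inputs.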

Features

Implementation of the Apriori algorithm for frequent pattern mining
Support for processing large-scale Twitter datasets
Configurable minimum support threshold
Output ranking system for discovered patterns
Performance optimizations for handling large datasets

Key Components

Pattern Mining Algorithm: Core implementation of the Apriori algorithm
Data Processing: Parses text input in which the keywords of each tweet are delimited by a separator
Performance Monitoring: Ensures efficient processing within specified time constraints
Results Generation: Creates formatted output of discovered patterns with support counts
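
The data processing and results generation components listed above might look roughly like the sketch below. The separator (`;`), the function names `parse_transactions` and `format_patterns`, and the `count: keywords` output layout are all assumptions for illustration; the actual input and output formats used by this project are not specified here.

```python
def parse_transactions(lines, sep=";"):
    """Split each line into a set of keywords.

    sep is an assumed delimiter; blank lines and empty keywords are skipped.
    """
    transactions = []
    for line in lines:
        keywords = {kw.strip() for kw in line.split(sep) if kw.strip()}
        if keywords:
            transactions.append(keywords)
    return transactions


def format_patterns(patterns):
    """Rank patterns by descending support count and render one per line.

    patterns maps a frozenset of keywords to its support count. Ties are
    broken alphabetically so the output order is deterministic.
    """
    ranked = sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    return [f"{count}: {' '.join(sorted(items))}" for items, count in ranked]
```

Using sets for transactions means a keyword repeated within one tweet is counted once, which matches the usual definition of support as the number of transactions containing an itemset.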

Technical Details

The program accepts three command-line parameters:

Input dataset filename
Minimum support count threshold
Output filename
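
One way to read these three parameters is with `argparse` from the Python standard library, as sketched below. The argument names (`input_file`, `min_support`, `output_file`) and the assumption that the parameters are positional and given in the order listed are illustrative; the script in this repository may parse them differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the three parameters; names and order are assumptions."""
    parser = argparse.ArgumentParser(
        description="Mine frequent itemsets with the Apriori algorithm."
    )
    parser.add_argument("input_file", help="input dataset filename")
    parser.add_argument(
        "min_support", type=int, help="minimum support count threshold"
    )
    parser.add_argument("output_file", help="output filename")
    return parser.parse_args(argv)
```

Under these assumptions, an invocation would look like `python Apriori.py tweets.txt 500 output.txt`, where `tweets.txt` and `500` are placeholder values.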

Top 20 Patterns

Author

Name: Jinghan (Summer) Sun
Email: jinghan.sun@emory.edu
