srijanravisankar/ThreadFlow

 
 

ThreadFlow 🧵

Visual AI pipeline for Reddit analysis: drag and drop nodes to build investigation workflows that analyze social sentiment using Gemini AI.

ThreadFlow Demo

🎯 What It Does

ThreadFlow lets you visually investigate Reddit discussions. You can:

  • Drag and drop nodes to build analysis pipelines (n8n-style)
  • Filter data by score, keywords, or custom criteria
  • Run AI-powered analysis: sentiment, bot detection, evidence extraction, summarization
  • Explore 3D visualizations: Canada map, political party breakdown, bar/pie charts

⚡ Quick Setup (5 minutes)

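Prerequisites

  • Python with venv and pip (for the backend)
  • Node.js with npm or pnpm (for the frontend)
  • A Gemini API key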

1️⃣ Clone & Set Up Backend

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies (includes DuckDB)
pip install -r requirements.txt

# Create .env file with your Gemini API key
echo "GEMINI_API_KEY=your_gemini_api_key_here" > .env
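
How the backend actually reads this file is not shown here. As a minimal sketch, assuming a plain KEY=VALUE file and using only the standard library, the key could be resolved like this. The function names load_env_file and get_gemini_api_key are hypothetical and not taken from the project.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_api_key(env_path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(env_path).get(
        "GEMINI_API_KEY", ""
    )
    # Treat the placeholder from the setup command as "not set".
    if not key or key == "your_gemini_api_key_here":
        raise RuntimeError("GEMINI_API_KEY not set; add it to backend/.env")
    return key
```

Rejecting the placeholder value makes a forgotten edit fail early with a clear message instead of surfacing later as an authentication error.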

2️⃣ Build the Database (DuckDB)

⚠️ IMPORTANT: The app uses DuckDB to query Reddit data. You must build the database first:

# Still in backend folder, with venv activated
python ingest.py

This creates reddit_data.duckdb from the CSV files in the archive/ folder. It takes about 30 seconds.

Verify it worked:

python -c "import duckdb; db = duckdb.connect('reddit_data.duckdb'); print(f'Comments: {db.execute(\"SELECT COUNT(*) FROM comments\").fetchone()[0]}')"

3️⃣ Start Backend Server

# Still in backend folder
uvicorn main:app --reload --port 8000

The backend runs at http://localhost:8000. To test it, run: curl http://localhost:8000/health
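
The same check can be scripted, for example to wait for the backend before starting the frontend. This is a minimal sketch using only the standard library. It assumes the /health endpoint answers a plain GET with a 2xx status, and backend_is_healthy is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def backend_is_healthy(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("backend up" if backend_is_healthy() else "backend not reachable")
```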

4️⃣ Set Up & Start Frontend

# Open new terminal
cd frontend

# Install dependencies
npm install  # or: pnpm install

# Start dev server
npm run dev

Frontend runs at http://localhost:3000


🎮 How to Use

  1. Open http://localhost:3000
  2. Drag nodes from the sidebar onto the canvas
  3. Connect nodes by dragging from output (right) to input (left)
  4. Configure nodes - enter search queries, set filters, etc.
  5. Click "Run Pipeline" to execute
  6. View results in the nodes or click "View AI Visualization" for 3D charts
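
Because connections run from an output to an input, a pipeline is a directed graph, and each node can only run once the nodes feeding it have finished. How ThreadFlow schedules nodes internally is not documented here. The sketch below shows one common approach, a topological sort, with a hypothetical execution_order function and made-up node ids.

```python
from graphlib import TopologicalSorter


def execution_order(edges: list[tuple[str, str]]) -> list[str]:
    """Order nodes so every node runs after the nodes feeding into it.

    Each edge is (source_node_id, target_node_id), mirroring an
    output-to-input connection drawn on the canvas.
    """
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)  # target depends on source
    # static_order() raises graphlib.CycleError if the connections form a loop.
    return list(sorter.static_order())


# Illustrative node ids only; they are not the app's real identifiers.
edges = [
    ("reddit_source", "score_filter"),
    ("score_filter", "sentiment"),
    ("sentiment", "data_table"),
]
print(execution_order(edges))
```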

Node Types

Category | Nodes                                             | Description
---------|---------------------------------------------------|------------------------
Source   | Reddit Source, Thread Loader                      | Load data from Reddit
Filter   | Score Filter, Keyword Sieve                       | Filter comments
AI       | Sentiment, Evidence, Bot Hunter, Summarizer       | Gemini-powered analysis
Viz      | Data Table, Canada Map, Political, Bar/Pie Charts | Visualize results
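
To make the Filter category concrete, here is a rough sketch of what a Score Filter and a Keyword Sieve could do to a list of comments. The Comment shape, its field names, and both function signatures are assumptions for illustration. The real nodes may use different fields and matching rules.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    # Hypothetical shape; the real column names may differ.
    body: str
    score: int


def score_filter(comments: list[Comment], min_score: int) -> list[Comment]:
    """Keep comments whose score is at least min_score."""
    return [c for c in comments if c.score >= min_score]


def keyword_sieve(comments: list[Comment], keywords: list[str]) -> list[Comment]:
    """Keep comments containing any keyword (case-insensitive substring match)."""
    lowered = [k.lower() for k in keywords]
    return [c for c in comments if any(k in c.body.lower() for k in lowered)]


comments = [
    Comment("Housing prices are out of control", 42),
    Comment("Nice weather today", 3),
    Comment("New housing policy announced", 1),
]
# Chain the two filters the way connected nodes would pass data along.
result = keyword_sieve(score_filter(comments, min_score=2), ["housing"])
print([c.body for c in result])
```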

📁 Project Structure

├── backend/
│   ├── main.py          # FastAPI server
│   ├── ai_analyzer.py   # Gemini AI integration
│   ├── ingest.py        # DuckDB database builder
│   ├── requirements.txt
│   └── .env             # GEMINI_API_KEY goes here
├── frontend/
│   ├── src/
│   │   ├── app/         # Next.js pages
│   │   ├── components/  # React components (nodes, canvas, viz)
│   │   └── lib/         # API client, pipeline logic
│   └── package.json
└── archive/             # Source CSV data
    ├── canada_subreddit_comments.csv
    └── canada_subreddit_threads.csv

🔧 Troubleshooting

Issue                        | Solution
-----------------------------|-----------------------------------------------------------------
reddit_data.duckdb not found | Run python ingest.py in backend folder
GEMINI_API_KEY not set       | Create .env file in backend with your key
Connection refused           | Make sure backend is running on port 8000
Rate limit errors            | Gemini has 15 req/min limit - the app handles this automatically
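
How the app handles the rate limit is not described here. One common way to stay under a limit such as 15 requests per minute is a sliding-window limiter that waits before a call when the window is full. The sketch below is an illustration under that assumption, not the project's implementation, and the SlidingWindowLimiter class is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling window of period seconds."""

    def __init__(self, max_calls=15, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period = period
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._calls and now - self._calls[0] >= self._period:
                self._calls.popleft()
            if len(self._calls) < self._max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call expires.
            self._sleep(self._period - (now - self._calls[0]))
```

Calling limiter.acquire() before each AI request would pause the sixteenth call in a minute until the oldest one leaves the window.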

🛠 Tech Stack

  • Frontend: Next.js 15, React Flow, Three.js, TailwindCSS
  • Backend: FastAPI, DuckDB, Google Gemini AI
  • Data: Reddit r/Canada subreddit (comments + threads)

📜 License

MIT - Built for AI Collective Hackathon 2026
