Visual AI Pipeline for Reddit Analysis - Drag-and-drop nodes to build investigation workflows that analyze social sentiment using Gemini AI.
ThreadFlow lets you visually investigate Reddit discussions. Key features:
- Drag & drop nodes to build analysis pipelines (n8n-style)
- Filter data by score, keywords, or custom criteria
- AI-powered analysis - sentiment, bot detection, evidence extraction, summarization
- 3D visualizations - Canada map, political party breakdown, bar/pie charts
- Python 3.9+
- Node.js 18+
- Gemini API Key (Get one free)
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies (includes DuckDB)
pip install -r requirements.txt
# Create .env file with your Gemini API key
echo "GEMINI_API_KEY=your_gemini_api_key_here" > .env
# Still in backend folder, with venv activated
python ingest.py
This creates reddit_data.duckdb from the CSV files in the archive/ folder. Takes ~30 seconds.
Verify it worked:
python -c "import duckdb; db = duckdb.connect('reddit_data.duckdb'); print(f'Comments: {db.execute(\"SELECT COUNT(*) FROM comments\").fetchone()[0]}')"
# Still in backend folder
uvicorn main:app --reload --port 8000
Backend runs at http://localhost:8000. Test: curl http://localhost:8000/health
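The same health check can be scripted, for example to wait for the backend before starting the frontend. This is a minimal sketch using only the standard library; the function name `backend_is_up` is hypothetical, and it assumes only that `/health` answers with HTTP 200 when the server is ready (the response body is not inspected).

```python
import urllib.error
import urllib.request


def backend_is_up(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `backend_is_up()` with no arguments checks the default address used in this README.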
# Open new terminal
cd frontend
# Install dependencies
npm install # or: pnpm install
# Start dev server
npm run dev
Frontend runs at http://localhost:3000
- Open http://localhost:3000
- Drag nodes from the sidebar onto the canvas
- Connect nodes by dragging from output (right) to input (left)
- Configure nodes - enter search queries, set filters, etc.
- Click "Run Pipeline" to execute
- View results in the nodes or click "View AI Visualization" for 3D charts
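Conceptually, running a pipeline means executing each node only after every node connected to its input has finished. The sketch below illustrates that idea with Python's standard `graphlib` module; it is not ThreadFlow's actual executor. The `run_pipeline` function, the node ids, and the sample comment fields (`body`, `score`) are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical node signature: receives the outputs of its upstream nodes.
NodeFn = Callable[[List[Any]], Any]


def run_pipeline(nodes: Dict[str, NodeFn], edges: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Run each node after all of its upstream nodes; return outputs by node id."""
    upstream: Dict[str, List[str]] = {node_id: [] for node_id in nodes}
    for src, dst in edges:  # an edge goes from an output (src) to an input (dst)
        upstream[dst].append(src)

    results: Dict[str, Any] = {}
    # static_order() yields a node only after all of its predecessors and
    # raises graphlib.CycleError if the connections form a loop.
    for node_id in TopologicalSorter(upstream).static_order():
        inputs = [results[src] for src in upstream[node_id]]
        results[node_id] = nodes[node_id](inputs)
    return results


# Made-up sample data; the real comment schema is not shown in this README.
comments = [
    {"body": "housing costs are too high", "score": 12},
    {"body": "nice weather today", "score": 1},
]
nodes = {
    "source": lambda _inputs: comments,
    "score_filter": lambda inputs: [c for c in inputs[0] if c["score"] >= 5],
    "table": lambda inputs: [c["body"] for c in inputs[0]],
}
edges = [("source", "score_filter"), ("score_filter", "table")]
outputs = run_pipeline(nodes, edges)
```

Keeping the graph as a plain mapping of node id to upstream ids makes cycles easy to detect before any node runs.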
| Category | Nodes | Description |
|---|---|---|
| Source | Reddit Source, Thread Loader | Load data from Reddit |
| Filter | Score Filter, Keyword Sieve | Filter comments |
| AI | Sentiment, Evidence, Bot Hunter, Summarizer | Gemini-powered analysis |
| Viz | Data Table, Canada Map, Political, Bar/Pie Charts | Visualize results |
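To make the Filter category concrete, here is a rough sketch of what a score filter and a keyword filter could do to a list of comments. The function names, the `body` and `score` field names, and the case-insensitive substring matching are assumptions; the actual Score Filter and Keyword Sieve nodes may behave differently.

```python
from typing import Any, Dict, Iterable, List

Comment = Dict[str, Any]  # assumed shape: {"body": str, "score": int}


def score_filter(comments: Iterable[Comment], min_score: int) -> List[Comment]:
    """Keep comments whose score is at least min_score (missing score counts as 0)."""
    return [c for c in comments if int(c.get("score", 0)) >= min_score]


def keyword_sieve(comments: Iterable[Comment], keywords: Iterable[str]) -> List[Comment]:
    """Keep comments whose body contains any keyword, ignoring case."""
    needles = [k.lower() for k in keywords if k]
    if not needles:
        # Design choice in this sketch: no keywords means no filtering.
        return list(comments)
    return [
        c for c in comments
        if any(n in str(c.get("body", "")).lower() for n in needles)
    ]


sample = [
    {"body": "Housing prices in Toronto", "score": 42},
    {"body": "Hockey night", "score": 3},
    {"body": "housing policy debate", "score": 2},
]
popular_housing = keyword_sieve(score_filter(sample, min_score=5), ["housing"])
```

Chaining the two functions mirrors connecting a Score Filter node to a Keyword Sieve node on the canvas.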
├── backend/
│   ├── main.py              # FastAPI server
│   ├── ai_analyzer.py       # Gemini AI integration
│   ├── ingest.py            # DuckDB database builder
│   ├── requirements.txt
│   └── .env                 # GEMINI_API_KEY goes here
├── frontend/
│   ├── src/
│   │   ├── app/             # Next.js pages
│   │   ├── components/      # React components (nodes, canvas, viz)
│   │   └── lib/             # API client, pipeline logic
│   └── package.json
└── archive/                 # Source CSV data
    ├── canada_subreddit_comments.csv
    └── canada_subreddit_threads.csv
| Issue | Solution |
|---|---|
| reddit_data.duckdb not found | Run python ingest.py in backend folder |
| GEMINI_API_KEY not set | Create .env file in backend with your key |
| Connection refused | Make sure backend is running on port 8000 |
| Rate limit errors | Gemini has 15 req/min limit - the app handles this automatically |
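This README does not describe how the app stays under the 15 req/min limit. One common approach is a sliding-window limiter that delays a call once the window is full. The sketch below shows that technique under stated assumptions: the class name `SlidingWindowLimiter` and its parameters are hypothetical, and it is not taken from `ai_analyzer.py`.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any period-second window."""

    def __init__(
        self,
        max_calls: int = 15,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._calls: Deque[float] = deque()

    def acquire(self) -> float:
        """Block until a call is allowed; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return waited
            # Window is full: wait until the oldest call expires.
            delay = self.period - (now - self._calls[0])
            self._sleep(delay)
            waited += delay
```

A caller would invoke `limiter.acquire()` immediately before each Gemini request, so bursts are spread out instead of failing with rate limit errors.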
- Frontend: Next.js 15, React Flow, Three.js, TailwindCSS
- Backend: FastAPI, DuckDB, Google Gemini AI
- Data: Reddit r/Canada subreddit (comments + threads)
MIT - Built for AI Collective Hackathon 2026