🤖 Reddit AI Curator

Reddit AI Curator is an advanced, AI-powered information retrieval system designed to find high-quality, relevant Reddit discussions. It combines professional Boolean search logic with Large Language Model (LLM) analysis to sift through thousands of posts and deliver the most impactful results.

🚀 Key Features

🏆 Multi-Query Tournament

Rather than running a single search, it generates multiple query variations (Broad, Specific, Narrative, Jargon) and runs a "tournament" on a small sample of results to see which variation performs best before committing to a full search.
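The tournament idea can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `run_tournament`, `fetch_sample`, and `score_post` are hypothetical names, and ranking by mean sample score is an assumption about how "performs best" might be measured.

```python
from typing import Callable, Dict, List


def run_tournament(
    queries: Dict[str, str],
    fetch_sample: Callable[[str], List[dict]],
    score_post: Callable[[dict], float],
) -> List[str]:
    """Rank query variants by the mean score of a small result sample."""
    ranking = []
    for name, query in queries.items():
        sample = fetch_sample(query)
        # An empty sample counts as 0 so that variant ranks last.
        mean = sum(score_post(p) for p in sample) / len(sample) if sample else 0.0
        ranking.append((mean, name))
    # Highest mean first; the runner-ups remain available for later fallback.
    ranking.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in ranking]
```

The first name in the returned list would drive the full search, and the rest could serve as fallback variants.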

🌊 Smart Search Cascade

If it doesn't find enough high-quality posts, it automatically triggers a tiered fallback system:

Sort Variation: Retries with relevance, top, hot, and comments sorts.
Variant Fallback: Uses the runner-up queries from the tournament.
Expansion: Increases fetch limits to 500 posts per sub and expands the time filter.
Adaptive Scoring: Relaxes the quality threshold (from 80 to 70, then 60) if the quota is still unmet.
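The four tiers above could be arranged as nested loops, as in this hedged sketch. The `cascade` function, the `search` callable, and the starting limit of 100 are assumptions made for illustration; the time-filter expansion is left out for brevity.

```python
from typing import Callable, Dict, Iterable, List, Tuple

SORTS = ("relevance", "top", "hot", "comments")
THRESHOLDS = (80, 70, 60)  # strict first, then the relaxed values


def cascade(
    queries: List[str],  # tournament winner first, then runner-ups
    search: Callable[[str, str, int], Iterable[Tuple[str, float]]],
    target: int,
    base_limit: int = 100,  # assumed starting fetch limit
    expanded_limit: int = 500,
) -> List[Tuple[str, float]]:
    best: Dict[str, float] = {}

    def accepted(threshold: float) -> List[Tuple[str, float]]:
        hits = [(pid, s) for pid, s in best.items() if s >= threshold]
        return sorted(hits, key=lambda h: h[1], reverse=True)

    for limit in (base_limit, expanded_limit):  # tier 3: expansion
        for query in queries:  # tier 2: variant fallback
            for sort in SORTS:  # tier 1: sort variation
                for post_id, score in search(query, sort, limit):
                    best[post_id] = max(score, best.get(post_id, 0.0))
                if len(accepted(THRESHOLDS[0])) >= target:
                    return accepted(THRESHOLDS[0])[:target]
    for threshold in THRESHOLDS[1:]:  # tier 4: adaptive scoring
        if len(accepted(threshold)) >= target:
            return accepted(threshold)[:target]
    # Quota still unmet: return whatever clears the lowest threshold.
    return accepted(THRESHOLDS[-1])
```

Each tier only runs when the cheaper ones have not filled the quota, so a query that succeeds on the first sort never pays for the larger fetch.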

🧠 Intent-Based Search (New!)

An advanced search mode that understands user intent through interactive clarification:

Semantic Decomposition: Breaks requests into core requirements, bonus criteria, and preferences.
Interactive Clarification: AI asks targeted questions to resolve ambiguities.
Multi-Stage Scoring: 5-stage scoring algorithm (Disqualifiers → Core → Base → Bonus → Preference).
Parallel Execution: Runs alongside standard keyword search.
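As an illustration of the five-stage order, the sketch below applies each stage in turn. The `Intent` dataclass, the predicate-based checks, and the point values (base 50, +10 per bonus, +5 per preference) are all assumptions chosen for the example; the real weights are not documented here.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Check = Callable[[str], bool]


@dataclass
class Intent:
    disqualifiers: List[Check] = field(default_factory=list)
    core: List[Check] = field(default_factory=list)
    bonus: List[Check] = field(default_factory=list)
    preferences: List[Check] = field(default_factory=list)


def score_post(text: str, intent: Intent, base: float = 50.0) -> float:
    # Stage 1: any disqualifier rejects the post outright.
    if any(check(text) for check in intent.disqualifiers):
        return 0.0
    # Stage 2: every core requirement must hold.
    if not all(check(text) for check in intent.core):
        return 0.0
    # Stage 3: posts that pass start from a base score.
    score = base
    # Stage 4: each matched bonus criterion adds points.
    score += 10.0 * sum(check(text) for check in intent.bonus)
    # Stage 5: each matched preference adds a smaller amount.
    score += 5.0 * sum(check(text) for check in intent.preferences)
    return min(score, 100.0)
```

Putting the disqualifier and core checks first means the cheaper rejections happen before any additive scoring.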

📚 Continual Learning System

Tag Extraction: Automatically extracts semantic tags from high-scoring results to learn the "vocabulary" of successful matches.
Favorites: Save your favorite posts to train the AI. It will use your favorites to prioritize themes and keywords in future searches.
Auto-Blacklist: Automatically blacklists posts scoring >85 to ensure every new search provides fresh content.
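A rough sketch of how tag learning and auto-blacklisting might fit together is shown below. The function names and the post dictionary shape (`id`, `score`, `tags`) are hypothetical, and reusing the blacklist threshold as the "high-scoring" cutoff for tag extraction is an assumption.

```python
from collections import Counter
from typing import Iterable, List, Set

BLACKLIST_THRESHOLD = 85  # from the Auto-Blacklist rule above


def learn_from_results(
    results: Iterable[dict], tag_counts: Counter, blacklist: Set[str]
) -> None:
    """Record tags from high-scoring posts and blacklist those posts."""
    for post in results:
        if post["score"] > BLACKLIST_THRESHOLD:
            tag_counts.update(post["tags"])  # learn the vocabulary
            blacklist.add(post["id"])  # keep future searches fresh


def drop_blacklisted(posts: Iterable[dict], blacklist: Set[str]) -> List[dict]:
    """Filter out posts that an earlier search already surfaced."""
    return [post for post in posts if post["id"] not in blacklist]
```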

🛠️ Setup & Installation

Python 3.12+

Install Dependencies:

pip install praw mistralai google-generativeai flask python-dotenv PyJWT

Configure Environment: Create a .env file with your credentials:

REDDIT_CLIENT_ID=your_id
REDDIT_CLIENT_SECRET=your_secret
REDDIT_USER_AGENT=your_agent
MISTRAL_API_KEY=your_key  # or GOOGLE_API_KEY
JWT_SECRET_KEY=your_jwt_secret
JWT_ALGORITHM=HS256
JWT_EXPIRATION_HOURS=24
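One way to catch a missing credential early is to validate the environment at startup. The sketch below uses only the standard library and assumes the variables have already been loaded into the process environment (for example by python-dotenv's `load_dotenv()`). The `load_settings` helper and the choice of which variables are mandatory are assumptions, not the project's code.

```python
import os
from typing import Dict, Mapping

REQUIRED = (
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "JWT_SECRET_KEY",
)
LLM_KEYS = ("MISTRAL_API_KEY", "GOOGLE_API_KEY")  # at least one is needed


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    missing = [name for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in LLM_KEYS):
        missing.append(" or ".join(LLM_KEYS))
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "jwt_algorithm": env.get("JWT_ALGORITHM", "HS256"),
        "jwt_expiration_hours": int(env.get("JWT_EXPIRATION_HOURS", "24")),
        # Name of the first LLM key that is actually set.
        "llm_key_name": next(name for name in LLM_KEYS if env.get(name)),
    }
```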

🔐 JWT Authentication

The V2 API requires JWT authentication for all protected endpoints. Generate a token using the auth endpoints:

# Get access token (POST /api/v2/auth/token)
curl -X POST http://localhost:5000/api/v2/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "your_password"}'

# Use the token in requests (the search endpoint expects POST; request body omitted here)
curl -X POST http://localhost:5000/api/v2/search \
  -H "Authorization: Bearer <your_jwt_token>"
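The same two calls can be prepared from Python with the standard library. This sketch only builds the request objects; the helper names are hypothetical, and the search payload is left to the caller because its schema is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"


def token_request(username: str, password: str) -> urllib.request.Request:
    """Build the POST request for /api/v2/auth/token."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v2/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authorized_request(path: str, token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that carries the JWT as a Bearer token."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )
```

Either request can then be sent with `urllib.request.urlopen(req)`. The name of the field that carries the token in the response is not specified in this document, so inspect the JSON returned by the token endpoint before extracting it.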

🖥️ Usage

CLI Commands

1. AI-Powered Curation (Best Results): Generates queries, runs a tournament, and performs the search cascade automatically.

python app.py curate --description "First person stories of car accidents in heavy rain" --target_posts 10

2. Direct Boolean Search

python app.py search --keywords "car accident AND (rain OR storm)" --criteria "High detail stories only"

3. Discover Subreddits: Finds subreddits that are most likely to contain the content you are looking for.

python app.py discover --keywords "adventure travel and hiking"

Optional Flags:

--exhaustive: Try all sorts and variants for maximum recall.
--no-fallback: Disable the automatic cascade.
--json: Output raw data only.
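For orientation, the commands and flags above could be modeled with `argparse` along these lines. This is a sketch and not the contents of `app.py`: which subcommands accept the optional flags, and the defaults shown, are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Optional flags shared by every subcommand (an assumption).
    shared = argparse.ArgumentParser(add_help=False)
    shared.add_argument("--exhaustive", action="store_true")
    shared.add_argument("--no-fallback", action="store_true")
    shared.add_argument("--json", action="store_true")

    parser = argparse.ArgumentParser(prog="app.py")
    sub = parser.add_subparsers(dest="command")

    curate = sub.add_parser("curate", parents=[shared])
    curate.add_argument("--description", required=True)
    curate.add_argument("--target_posts", type=int, default=10)

    search = sub.add_parser("search", parents=[shared])
    search.add_argument("--keywords", required=True)
    search.add_argument("--criteria", default="")

    discover = sub.add_parser("discover", parents=[shared])
    discover.add_argument("--keywords", required=True)
    return parser
```

With no subcommand, `command` is `None`, which matches the documented behavior of `python app.py` starting the web interface instead of a CLI action.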

🌐 Web Interface

Launch the interactive dashboard to run searches, view live results, manage subreddits, and browse your favorites.

python app.py
# Open http://localhost:5000 in your browser

🏗️ Architecture & DI Container

Reddit AI Curator uses a Dependency Injection (DI) container for service management.

Dependency Injection Container

The DI container (app/core/container.py) manages all service dependencies:

from app.core.container import container

# Get services from container
llm_provider = container.llm_provider
search_engine = container.search_engine
reddit_engine = container.reddit_engine
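To show what attribute-style access like this implies, here is a minimal lazy container. It is a generic sketch of the pattern, not the code in app/core/container.py; the `Container` class and its `register` method are hypothetical.

```python
from typing import Any, Callable, Dict


class Container:
    """Resolve services lazily by attribute name and cache the instances."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[["Container"], Any]] = {}
        self._instances: Dict[str, Any] = {}

    def register(self, name: str, factory: Callable[["Container"], Any]) -> None:
        self._factories[name] = factory
        self._instances.pop(name, None)  # drop any cached instance

    def __getattr__(self, name: str) -> Any:
        # Only reached when normal attribute lookup fails.
        if name.startswith("_") or name not in self._factories:
            raise AttributeError(name)
        if name not in self._instances:
            # Factories receive the container so they can pull dependencies.
            self._instances[name] = self._factories[name](self)
        return self._instances[name]
```

In this sketch, replacing a service after its dependents have been built does not rebuild those dependents, so swaps such as a mock provider should be registered before the first lookup.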

Service Registration

Services are registered in app/core/service_registration.py:

| Service | Interface | Description |
| --- | --- | --- |
| `llm_provider` | `LLMProvider` | LLM interface (Mistral, Gemini, or Mock) |
| `reddit_engine` | `RedditSearchEngine` | Reddit API client |
| `search_engine` | `SearchEngine` | Main search orchestration |

V2 API Endpoints

V2 API endpoints require JWT authentication, except the token endpoint (which issues the token) and the health check:

| Endpoint | Method | Description |
| --- | --- | --- |
| `/api/v2/auth/token` | POST | Get JWT access token |
| `/api/v2/search` | POST | Execute standard AI-powered search |
| `/api/v2/search/intent/analyze` | POST | Analyze intent & get clarification questions |
| `/api/v2/search/intent/clarify` | POST | Submit clarification answers |
| `/api/v2/search/intent/execute` | POST | Execute search with finalized intent |
| `/api/v2/search/intent/quick` | POST | One-shot intent search (no clarification) |
| `/api/v2/llm/generate-queries` | POST | Generate query variants |
| `/api/v2/llm/score` | POST | Score a post with LLM |
| `/api/v2/health` | GET | Health check (no auth required) |

Testing with MockLLMProvider

Use MockLLMProvider for testing without external API calls:

from app.core.container import container

# Replace with mock for testing
container.register_mock_llm_provider()

# Tests run without real LLM API calls
results = container.search_engine.search(...)
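A deterministic stand-in of this kind might look like the sketch below. It is not the project's `MockLLMProvider`; the class name `FakeLLMProvider` and the method names `generate_queries` and `score_post` are hypothetical, chosen to mirror the query-generation and scoring endpoints listed above.

```python
from typing import Dict, List


class FakeLLMProvider:
    """Deterministic LLM stand-in for tests; makes no network calls."""

    def __init__(self, fixed_score: float = 75.0) -> None:
        self.fixed_score = fixed_score
        self.calls: List[str] = []  # lets tests assert what was invoked

    def generate_queries(self, description: str) -> Dict[str, str]:
        self.calls.append("generate_queries")
        words = description.lower().split()
        # Build predictable Boolean variants from the description words.
        return {"broad": " OR ".join(words), "specific": " AND ".join(words)}

    def score_post(self, post_text: str, criteria: str) -> float:
        self.calls.append("score_post")
        return self.fixed_score
```

Because the outputs depend only on the inputs, a test can assert exact queries and scores without API keys or rate limits.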

📂 Project Structure

app.py: Main application entry point (CLI & Web).
app/core/: Core architecture (DI container, service registration).
app/services/: LLM providers and search services.
app/routes_v2.py: V2 API endpoints (JWT authenticated).
tag_learning.py: The "brain" that manages favorites and tag-based refinement.
report_generator.py: Generates standalone HTML reports.
config/: Centralized folder for all JSON data (favorites, learning DB, queries, blacklist).
static/ & templates/: Responsive frontend assets.
results/: Output folder for JSON and HTML findings.
tests/integration/: Integration tests that make no external API calls (they use MockLLMProvider).

License: MIT
