🚀 AI Lead Scoring Engine - Cleardeals Assignment

An end-to-end, real-time lead scoring system that helps brokers prioritize high-intent leads. It predicts an intent score with a machine learning model and can optionally rerank the top leads by semantic similarity.


📌 Problem Statement

Brokers spend too much time on low-intent leads. Goal: Predict high-intent leads and improve conversion efficiency.

This solution delivers real-time intent scores via API using:

  • A trained Gradient Boosted Tree (XGBoost) model
  • Optional LLM re-ranking using MiniLM embeddings
  • Redis for fast caching
  • FastAPI to expose a scoring endpoint
  • Async simulation of CRM push integration

🧰 Tech Stack

| Component | Technology |
|---|---|
| ML Model | XGBoost |
| API Server | FastAPI + Uvicorn |
| Re-Ranker (LLM) | SentenceTransformers (MiniLM) |
| Cache | Redis |
| Async Requests | httpx |
| Dataset | Synthetic (1000 leads) |

⚙️ Features

  • /score → Accepts lead info, returns predicted intent score in <300 ms
  • Redis caching for repeated leads
  • /rerank → Reranks top leads using semantic relevance
  • Async CRM push simulation
  • Handles edge cases and missing fields gracefully
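One way the missing-field handling could look is sketched below. It assumes a fixed feature order like the one stored in feature_columns.json; the column names and default values here are hypothetical, since the actual schema is not listed in this README.

```python
# Hypothetical feature order; in the project this would be read from feature_columns.json.
FEATURE_COLUMNS = ["age", "income", "site_visits", "time_on_site"]

# Hypothetical fallback values used when a field is missing or cannot be parsed.
DEFAULTS = {"age": 0.0, "income": 0.0, "site_visits": 0.0, "time_on_site": 0.0}


def build_feature_row(lead: dict) -> list[float]:
    """Return feature values in the fixed column order, filling gaps with defaults."""
    row = []
    for col in FEATURE_COLUMNS:
        value = lead.get(col)
        try:
            row.append(float(value))
        except (TypeError, ValueError):
            # None, empty strings, and non-numeric text fall back to the default.
            row.append(DEFAULTS[col])
    return row


row = build_feature_row({"age": "34", "income": None, "site_visits": 5})
print(row)  # [34.0, 0.0, 5.0, 0.0]
```

Keeping the column order in one place means the row passed to the model always matches the order used at training time, even when a request omits fields.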

📁 Project Structure

cdt/
├── app.py                 # FastAPI server
├── model_utils.py         # Model loading & prediction
├── llm_reranker.py        # LLM-based lead reranking
├── train_model.py         # Training script (XGBoost)
├── generate_fake_leads.py # Creates 1000-lead dataset
├── leads.csv              # Simulated dataset
├── lead_score_model.pkl   # Trained XGBoost model
├── feature_columns.json   # Feature order mapping
├── requirements.txt
├── .env                   # Optional Redis config
└── README.md

🚀 Running the Project

1. Install dependencies

pip install -r requirements.txt

2. Start Redis

redis-server

3. Run the API

uvicorn app:app --reload

4. Access API Docs

Visit: http://127.0.0.1:8000/docs


🧪 Endpoints

/score - POST

Accepts lead data and returns a predicted intent score in real time.
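As a rough illustration of calling this endpoint, the sketch below builds a JSON POST request with the standard library. The URL assumes the default Uvicorn host and port from the run instructions; the field names in the payload are hypothetical, because the request schema is not documented here (the /docs page of a running server shows the real one).

```python
import json
import urllib.request

# Assumes the server was started with `uvicorn app:app` on the default host and port.
SCORE_URL = "http://127.0.0.1:8000/score"


def build_score_request(lead: dict) -> urllib.request.Request:
    """Wrap a lead payload in a JSON POST request for the /score endpoint."""
    body = json.dumps(lead).encode("utf-8")
    return urllib.request.Request(
        SCORE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical lead fields, for illustration only.
request = build_score_request({"lead_id": "L-001", "site_visits": 5})

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(json.load(response))
```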

/rerank - POST

Accepts a list of top leads and a target_intent string. Returns the leads reranked by their semantic similarity to target_intent, computed with MiniLM embeddings.
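The reranking idea can be sketched as sorting leads by cosine similarity to the target intent. To stay dependency-free, the example below swaps the MiniLM embeddings for a simple word-count vector, so it only illustrates the ranking step, not true semantic matching. The `notes` field and the function names are assumptions, not taken from llm_reranker.py.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: word counts. The project itself uses MiniLM vectors."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(leads: list[dict], target_intent: str, text_key: str = "notes") -> list[dict]:
    """Return leads sorted by similarity of their text to target_intent, best first."""
    target = embed(target_intent)
    return sorted(
        leads,
        key=lambda lead: cosine(embed(lead.get(text_key) or ""), target),
        reverse=True,
    )


leads = [
    {"id": 1, "notes": "just browsing listings"},
    {"id": 2, "notes": "wants to buy a 2bhk flat this month"},
]
ranked = rerank(leads, "ready to buy a flat")
print([lead["id"] for lead in ranked])  # [2, 1]
```

With real sentence embeddings, `embed` would return dense vectors and the same sort would also pick up paraphrases that share no words with the target.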


🧠 How It Works

  • XGBoost model scores leads from tabular features
  • Optional LLM (MiniLM) computes semantic similarity for top leads
  • Redis caches past results
  • Async CRM webhook simulated with httpx
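The flow above can be sketched end to end with stand-ins: a dict in place of Redis, a placeholder function in place of the XGBoost model, and an in-memory list in place of the CRM webhook. Every name here is hypothetical, and the push is awaited inline for simplicity; a real server might schedule it in the background instead.

```python
import asyncio
import hashlib
import json

cache: dict[str, float] = {}  # stand-in for Redis
crm_log: list[dict] = []      # stand-in for the CRM webhook receiver


def cache_key(lead: dict) -> str:
    """Hash the lead with sorted keys so equal payloads map to the same key."""
    payload = json.dumps(lead, sort_keys=True).encode("utf-8")
    return "score:" + hashlib.sha256(payload).hexdigest()


def predict(lead: dict) -> float:
    """Placeholder for the trained model; returns a made-up score in [0, 1]."""
    return min(1.0, lead.get("site_visits", 0) / 10)


async def push_to_crm(lead: dict, score: float) -> None:
    """Simulated webhook call; the project uses httpx for the actual request."""
    await asyncio.sleep(0)
    crm_log.append({"lead": lead, "score": score})


async def score_lead(lead: dict) -> float:
    key = cache_key(lead)
    if key in cache:
        return cache[key]  # repeated lead: skip the model and the CRM push
    score = predict(lead)
    cache[key] = score
    await push_to_crm(lead, score)
    return score


async def main() -> list[float]:
    lead = {"lead_id": "L-001", "site_visits": 5}
    return [await score_lead(lead), await score_lead(lead)]


scores = asyncio.run(main())
print(scores, len(crm_log))  # [0.5, 0.5] 1
```

The second call is served from the cache, which is why only one CRM push is recorded.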

🔐 Compliance

  • No PII is stored or exposed
  • All sensitive fields are excluded from scoring pipeline
  • Designed to be DPDP-ready
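Excluding sensitive fields from the scoring pipeline could be as simple as filtering the payload before feature extraction. This is a minimal sketch; the list of field names is hypothetical and would need to match the actual lead schema.

```python
# Hypothetical sensitive field names; the actual schema is not shown in this README.
PII_FIELDS = {"name", "email", "phone", "address"}


def strip_pii(lead: dict) -> dict:
    """Return a copy of the lead without sensitive fields (case-insensitive match)."""
    return {key: value for key, value in lead.items() if key.lower() not in PII_FIELDS}


clean = strip_pii({"name": "A. Buyer", "Email": "a@example.com", "site_visits": 5})
print(clean)  # {'site_visits': 5}
```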

✍️ Author

Dharmik Sompura

📧 Email: [your_dharmiksompura1212@gmail.com]

🔗 GitHub: [https://github.com/Dharmik0712]

🔗 LinkedIn: [https://www.linkedin.com/in/dharmik-sompura/]