UniMatch Kenya Architecture & System Design

System Overview

UniMatch Kenya is a containerized, event-driven dating platform. The backend is a single Django application split into modular apps, so it can scale horizontally today and be broken into microservices later.

┌─────────────────────────────────────────────────────────────┐
│                    Client Layer                             │
│  ┌──────────────┐         ┌──────────────┐                 │
│  │  Web Browser │         │   Mobile App │ (Future)        │
│  │  Next.js 14  │         │React Native  │                 │
│  └──────┬───────┘         └──────┬───────┘                 │
└─────────┼──────────────────────────┼───────────────────────┘
          │                          │
          │                          │
┌─────────▼──────────────────────────▼───────────────────────┐
│              API Gateway / Reverse Proxy                   │
│                  Nginx                                     │
│         (Rate Limiting, Compression, Routing)              │
└────────────────────┬──────────────────────────────────────┘
                     │
        ┌────────────┴──────────────┐
        │                           │
┌───────▼──────────────┐   ┌────────▼────────┐
│   HTTP Endpoints     │   │  WebSocket      │
│   REST API / DRF     │   │  Django Channels│
└───────┬──────────────┘   └────────┬────────┘
        │                           │
┌───────▼─────────────────────────▼┬───────────────────────┐
│          Application Layer (Django)                       │
│                                                           │
│  ┌─────────────┐ ┌──────────┐ ┌──────────────┐          │
│  │ Users App   │ │Profiles  │ │ Matching     │          │
│  │             │ │          │ │              │          │
│  │ • Auth      │ │ • Photos │ │ • Swipe Algo │          │
│  │ • OTP       │ │ • Bio    │ │ • Discover   │          │
│  │ • Verify    │ │ • Intrs  │ │ • Match Mgmt │          │
│  └─────────────┘ └──────────┘ └──────────────┘          │
│                                                           │
│  ┌─────────────┐ ┌──────────┐ ┌──────────────┐          │
│  │ Messaging   │ │Notif's   │ │ Reports      │          │
│  │             │ │          │ │              │          │
│  │ • Messages  │ │ • Push   │ │ • Report Mgmt│          │
│  │ • Chats     │ │ • In-app │ │ • Moderation │          │
│  │ • RTU       │ │ • Webhk  │ │ • Suspension │          │
│  └─────────────┘ └──────────┘ └──────────────┘          │
│                                                           │
│  ┌──────────────────────────────────────────┐            │
│  │  Celery Task Queue (Background Jobs)     │            │
│  │  • Email delivery  • Cache refresh       │            │
│  │  • Notifications   • Cleanup tasks       │            │
│  └──────────────────────────────────────────┘            │
└───────────┬─────────────────────────────────────────────┘
            │
    ┌───────┴─────────┬──────────┬────────────┐
    │                 │          │            │
┌───▼───────┐  ┌──────▼────┐ ┌───▼────┐ ┌─────▼──────┐
│ Database  │  │   Cache   │ │  Task  │ │  Uploads   │
│           │  │           │ │ Broker │ │            │
│PostgreSQL │  │  Redis 7  │ │ Redis  │ │ Cloudinary │
└───────────┘  └───────────┘ └────────┘ └────────────┘

Data Flow Patterns

1. User Registration Flow

User Register
  ↓
Validate email domain (.ac.ke, .edu)
  ↓
Create User + VerificationOTP
  ↓
[Celery Task] Send OTP Email
  ↓
User Receives Email
  ↓
User Verifies OTP
  ↓
is_verified = True
  ↓
Navigate to Profile Setup Onboarding
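
The first steps of the registration flow above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names and the 6-digit OTP length are assumptions, and only the `.ac.ke` / `.edu` suffixes come from the flow.

```python
import secrets

# Suffixes taken from the flow above.
ALLOWED_SUFFIXES = (".ac.ke", ".edu")


def is_campus_email(email: str) -> bool:
    """Return True when the address has one '@' and an allowed domain suffix."""
    local, sep, domain = email.strip().lower().partition("@")
    if not sep or not local or "@" in domain:
        return False
    return domain.endswith(ALLOWED_SUFFIXES)


def generate_otp(length: int = 6) -> str:
    """Return a numeric one-time code from a cryptographically secure source.

    Assumption: the OTP is numeric and 6 digits long.
    """
    return "".join(secrets.choice("0123456789") for _ in range(length))
```

Using `secrets` rather than `random` keeps the code unpredictable, which matters because the OTP is the only proof of email ownership.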

2. Discover Feed Flow

User Requests Feed
  ↓
Check Redis Cache (discover_feed:{user_id})
  ├─ HIT → Return Cached Results
  └─ MISS → Calculate Fresh Results
       ↓
Fetch Candidates (filtered query)
  ↓
Score & Sort Algorithm
  ├─ Same campus (+100)
  ├─ Same year (+50)
  ├─ Shared interests (+25)
  ├─ Recently active (+20)
  └─ Profile complete (+10)
  ↓
Serialize 20 Profiles
  ↓
Cache for 10 minutes
  ↓
Return to Client
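
The scoring step above could look roughly like this. The weights and the limit of 20 come from the flow; the `Candidate` fields are hypothetical stand-ins for the real profile model, and treating "shared interests" as a flat bonus for any overlap is an assumption.

```python
from dataclasses import dataclass, field

# Weights taken from the flow above.
SAME_CAMPUS = 100
SAME_YEAR = 50
SHARED_INTERESTS = 25
RECENTLY_ACTIVE = 20
PROFILE_COMPLETE = 10


@dataclass
class Candidate:
    # Hypothetical fields; the real profile model may differ.
    user_id: str
    campus: str
    year: int
    interests: set = field(default_factory=set)
    recently_active: bool = False
    profile_complete: bool = False


def score(viewer: Candidate, other: Candidate) -> int:
    """Sum the bonuses that apply to `other` from `viewer`'s point of view."""
    total = 0
    if other.campus == viewer.campus:
        total += SAME_CAMPUS
    if other.year == viewer.year:
        total += SAME_YEAR
    # Assumption: a flat bonus for any overlap, not a bonus per shared interest.
    if viewer.interests & other.interests:
        total += SHARED_INTERESTS
    if other.recently_active:
        total += RECENTLY_ACTIVE
    if other.profile_complete:
        total += PROFILE_COMPLETE
    return total


def rank(viewer: Candidate, candidates: list, limit: int = 20) -> list:
    """Return the highest-scoring candidates first, capped at `limit`."""
    return sorted(candidates, key=lambda c: score(viewer, c), reverse=True)[:limit]
```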

3. Match Creation Flow

User A Swipes Right on User B
  ↓
Check if User B Liked User A
├─ NO → Create Swipe(User A → User B, LIKE)
│       Response: "Like sent"
└─ YES → Mutual Like Detected
         ↓
         Create Match(User A ↔ User B)
         ↓
         [Parallel Tasks]
         ├─ Send Notification(User A): "Match with User B"
         ├─ Send Notification(User B): "Match with User A"
         ├─ Clear Discovery Cache(User A & B)
         └─ Log Match Event
         ↓
         WebSocket: Emit Match Event
         ↓
         Client: Show Match Celebration
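
The mutual-like check above reduces to a lookup of the reverse swipe. The sketch below uses in-memory sets in place of the swipe and match tables; the function name and return values are hypothetical, and recording the like in both branches is an assumption.

```python
def swipe_right(likes: set, matches: set, swiper: str, swiped: str) -> str:
    """Record a like and report whether it completed a mutual match.

    `likes` holds (swiper, swiped) pairs; `matches` holds unordered pairs.
    """
    # Assumption: the like is stored whether or not it produces a match.
    likes.add((swiper, swiped))
    if (swiped, swiper) in likes:
        # frozenset makes A-B and B-A the same match.
        matches.add(frozenset((swiper, swiped)))
        return "match"
    return "like_sent"
```

In the real system this check and the match insert would need to run in one database transaction, otherwise two simultaneous right swipes could each miss the other's like.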

4. Real-time Chat Flow

User Opens Chat
  ↓
WebSocket Connection Established
  ws://localhost/ws/chat/{match_id}/
  ↓
Load Previous Messages (Paginated)
  ↓
Subscribe to Channel Layer
  ↓
User Sends Message
  ├─ Optimistic Update (Client-side)
  ├─ HTTP POST or WS Send
  ├─ Save to Database
  ├─ Broadcast via Channels
  └─ Update Other User (WebSocket)
     ↓
Other User Receives
  ├─ Display in Chat
  ├─ Mark as delivered
  └─ Auto: Mark as read if visible

Database Normalization & Indexing Strategy

Normalization (3NF)

All tables follow Third Normal Form:

  • Each table has primary key (UUID)
  • No partial dependencies
  • No transitive dependencies
  • Foreign keys properly established

Critical Indexes

-- Prevent duplicate swipes (Unique Index)
CREATE UNIQUE INDEX idx_swipes_unique 
  ON swipes(swiper_id, swiped_id);

-- Fast discovery queries
CREATE INDEX idx_users_campus_active 
  ON users(campus, last_active DESC) 
  WHERE is_verified = true AND is_visible = true;

-- Chat history retrieval
CREATE INDEX idx_messages_match_time 
  ON messages(match_id, created_at DESC);

-- Quick notification fetching
CREATE INDEX idx_notifications_user_read 
  ON notifications(user_id, is_read, created_at DESC);

-- Report tracking
CREATE INDEX idx_reports_user_status 
  ON reports(reported_user_id, status);

Caching Strategy

Cache Layers

┌────────────────────────────────────────┐
│      Browser Cache (HTTP Headers)      │  1 year for assets
├────────────────────────────────────────┤
│      CDN Cache (Cloudflare)            │  1 day for images
├────────────────────────────────────────┤
│      Backend HTTP Cache (Nginx)        │  5 min for API
├────────────────────────────────────────┤
│      Application Cache (Redis)         │  10 min discover feed
│      ├─ discover_feed:{user_id}        │  Cache user candidates
│      ├─ user_profile:{user_id}         │  5 min user data
│      ├─ match_list:{user_id}           │  2 min active matches
│      └─ notifications:{user_id}        │  5 min notifications
└────────────────────────────────────────┘
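
The application cache layer follows a cache-aside pattern: read from Redis, and on a miss compute the value and store it with a TTL. The sketch below replaces Redis with a dictionary so it runs standalone. The TTLs come from the diagram above; the class and method names are hypothetical.

```python
import time

# TTLs in seconds, taken from the diagram above.
TTL_SECONDS = {
    "discover_feed": 600,
    "user_profile": 300,
    "match_list": 120,
    "notifications": 300,
}


class TTLCache:
    """In-memory stand-in for the Redis application cache."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get_or_compute(self, prefix: str, user_id: str, compute):
        """Return the cached value, or call `compute` and cache its result."""
        key = f"{prefix}:{user_id}"
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # HIT
        value = compute()  # MISS
        self._store[key] = (now + TTL_SECONDS[prefix], value)
        return value
```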

Cache Invalidation

# When does the cache get cleared?

1. User swipes → Clear discover_feed
2. Profile updated → Clear user_profile & discover_feed
3. New match → Clear discover_feed for both users
4. Message sent → Update match_list
5. New notification → Invalidate notifications
6. Scheduled refresh → Every 10 minutes (Celery Beat)
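
One way to keep these rules in a single place is a table that maps each event to the key patterns it clears. The event names below are hypothetical; the key patterns come from the cache layers above. Treating "message sent" as an invalidation of `match_list` is an assumption, since the rule could also be implemented as an in-place update.

```python
# Hypothetical event names mapped to the cache key patterns they clear.
INVALIDATION_RULES = {
    "swipe": ["discover_feed:{user_id}"],
    "profile_updated": ["user_profile:{user_id}", "discover_feed:{user_id}"],
    "new_match": ["discover_feed:{user_id}"],
    "message_sent": ["match_list:{user_id}"],
    "new_notification": ["notifications:{user_id}"],
}


def keys_to_invalidate(event: str, user_ids: list[str]) -> list[str]:
    """Return the concrete cache keys to delete for `event`, per affected user."""
    patterns = INVALIDATION_RULES.get(event, [])
    return [p.format(user_id=uid) for uid in user_ids for p in patterns]
```

For a new match, both users are passed in, so both discover feeds are cleared.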

Authentication & Security Architecture

┌─────────────────────────────────────────┐
│         Frontend (Next.js)              │
│  localStorage: access_token, refresh    │
│                                         │
│  axios.interceptors.response:           │
│  ├─ 401 Unauthorized?                   │
│  │  └─ Call /auth/refresh/              │
│  └─ Update header with new token        │
└────────────────┬────────────────────────┘
                 │
        Authorization: Bearer {token}
                 │
         ┌───────▼──────────┐
         │  Django Channels │
         │  (JWT Middleware)│   Validates WebSocket auth
         └────────┬─────────┘
                  │
        ┌─────────▼────────────┐
        │  DRF Authentication  │
        │  - TokenAuthenticate │
        │  (JWT validation)    │
        └─────────┬────────────┘
                  │
        ┌─────────▼────────────┐
        │ Custom Permissions   │
        ├─────────────────────┤
        │ • IsAuthenticated   │
        │ • IsVerified        │
        │ • IsProfileComplete │
        └─────────────────────┘
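
The custom permissions at the bottom of the diagram form an ordered chain in which each check assumes the previous one passed. The sketch below shows that ordering in plain Python rather than as DRF permission classes. The `User` fields are hypothetical stand-ins, apart from `is_verified`, which appears in the registration flow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    # Hypothetical stand-in for the real user model.
    is_authenticated: bool = False
    is_verified: bool = False
    is_profile_complete: bool = False


# Order matters: each check assumes the previous one passed.
CHECKS = (
    ("IsAuthenticated", lambda u: u.is_authenticated),
    ("IsVerified", lambda u: u.is_verified),
    ("IsProfileComplete", lambda u: u.is_profile_complete),
)


def first_failed_permission(user: User) -> Optional[str]:
    """Return the name of the first failing check, or None if all pass."""
    for name, check in CHECKS:
        if not check(user):
            return name
    return None
```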

Scalability Considerations

Horizontal Scaling

Load Balancer (Nginx)
    ↓
    ├─→ Django Instance 1 (Containers ×2)
    ├─→ Django Instance 2 (Containers ×2)
    ├─→ Django Instance 3 (Containers ×2)
    └─→ Django Instance N (Containers ×2)

    All sharing:
    ├─ PostgreSQL Connection Pool
    ├─ Redis Broker
    └─ Celery Workers (Separate Containers)

Database Scaling Path

Stage 1: Single PostgreSQL
  └─ Connection Pool: 20+10

Stage 2: Read Replicas (>1000 users)
  ├─ Primary (Writes)
  └─ Read Replica (Reads)
     └─ Requires ORM query routing

Stage 3: Sharding (>10k users)
  ├─ Shard by user_id
  ├─ Shard by campus (geographical)
  └─ Cross-shard queries become complex
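
The "ORM query routing" required in Stage 2 is typically done with a Django database router, which is a plain class exposing four methods. A minimal sketch, assuming the aliases `default` and `replica1` exist in `settings.DATABASES` (both names are assumptions):

```python
import random


class PrimaryReplicaRouter:
    """Send reads to a replica and writes to the primary."""

    # Assumption: these aliases match keys in settings.DATABASES.
    primary = "default"
    replicas = ["replica1"]

    def db_for_read(self, model, **hints):
        # Pick any replica; with one replica this is always the same alias.
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        return self.primary

    def allow_relation(self, obj1, obj2, **hints):
        # Primary and replicas hold the same data, so relations are allowed.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations on the primary only; replication copies the schema.
        return db == self.primary
```

The router is enabled by listing its dotted path in the `DATABASE_ROUTERS` setting. Reads that must see a write made moments earlier should still target the primary, because replicas can lag.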

Celery Task Processing

Tasks arrive in Redis Queue
    ↓
Multiple workers pull tasks from the shared queue
    ├─ Email tasks (Low priority, async)
    ├─ Notification tasks (Medium priority)
    └─ Cleanup tasks (Scheduled, batch)

Celery Beat (Scheduler)
    ├─ Every hour: Delete unverified users
    ├─ Every 10 min: Refresh caches
    └─ Every 24 hours: Backup reports
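
The Celery Beat schedule above could be declared as a dictionary like the one below. The intervals come from the list; the entry names and task paths are hypothetical.

```python
from datetime import timedelta

# Task paths are hypothetical; intervals come from the schedule above.
BEAT_SCHEDULE = {
    "delete-unverified-users": {
        "task": "users.tasks.delete_unverified_users",
        "schedule": timedelta(hours=1),
    },
    "refresh-caches": {
        "task": "matching.tasks.refresh_caches",
        "schedule": timedelta(minutes=10),
    },
    "backup-reports": {
        "task": "reports.tasks.backup_reports",
        "schedule": timedelta(hours=24),
    },
}
```

In a Celery project this dictionary would be assigned to `app.conf.beat_schedule`, which accepts a `timedelta` as the `schedule` value.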

Error Handling & Recovery

Request Error Handling

Client Request
    ↓
Django Middleware
    ├─ 500: Server error (log to Sentry)
    ├─ 429: Rate limited (return Retry-After)
    ├─ 401: Unauthorized (client refreshes token via /auth/refresh/)
    └─ 400: Bad request (return validation errors)
    ↓
Custom Exception Handler
    └─ Consistent JSON response format
         {
           "error": "...",
           "status_code": 400,
           "detail": {...}
         }
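
A small helper can guarantee that every error uses the JSON shape shown above. The function name is hypothetical; the three keys come from the format above.

```python
from typing import Optional


def error_response(status_code: int, error: str, detail: Optional[dict] = None) -> dict:
    """Build the error body in the consistent format shown above."""
    return {
        "error": error,
        "status_code": status_code,
        # Always return an object so clients can read `detail` without a null check.
        "detail": detail if detail is not None else {},
    }
```

For example, a validation failure would produce `error_response(400, "Bad request", {"email": ["Invalid domain"]})`.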

Database Failure Handling

DB Connection Failed
    ↓
Connection Pool Retry (3 attempts)
    ├─ Success → Continue
    └─ Failure → Return 503 Service Unavailable
    
Slow Database Query (>1000ms)
    ├─ Log to monitoring
    ├─ Identify bottleneck
    └─ Add index OR optimize query
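
The connection retry path above can be sketched as a generic retry helper plus a caller that maps exhaustion to a 503. The function names are hypothetical, and `ConnectionError` stands in for whatever exception the database driver raises.

```python
import time


def with_retries(operation, attempts: int = 3, delay: float = 0.0,
                 retry_on=(ConnectionError,)):
    """Call `operation` up to `attempts` times, re-raising the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on as exc:
            last_error = exc
            # Sleep between attempts, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


def handle_request(query):
    """Return (status, body); a failed retry loop becomes 503."""
    try:
        return 200, with_retries(query)
    except ConnectionError:
        return 503, "Service Unavailable"
```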

WebSocket Disconnection Handling

Client Disconnects
    ├─ Normal:
    │  └─ Graceful close, cleanup resources
    └─ Abnormal (network timeout):
       └─ Server waits 30 seconds
          └─ Reconnect auto-attempts (client-side)
             └─ On reconnect, sync messages
                └─ Load missed messages from DB
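
The abnormal-disconnect path above has two parts: the client retries with growing delays, and on reconnect it fetches what it missed. The sketch below assumes exponential backoff with a hypothetical 30-second cap and a `created_at` timestamp on each message. The source specifies neither.

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Return the wait in seconds before reconnect attempt `attempt` (0-based)."""
    return min(cap, base * 2 ** attempt)


def messages_since(history: list[dict], last_seen_at: float) -> list[dict]:
    """Return messages newer than `last_seen_at`, oldest first.

    Assumption: each message is a dict with a numeric `created_at` field.
    """
    newer = [m for m in history if m["created_at"] > last_seen_at]
    return sorted(newer, key=lambda m: m["created_at"])
```

Filtering by timestamp rather than by ID matters here because the primary keys are UUIDs, which carry no ordering.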

Monitoring & Observability

Metrics to Track

Real-time Dashboard:
├─ Active WebSocket connections
├─ API requests/sec (by endpoint)
├─ Response time percentiles (p50, p95, p99)
├─ Error rate (4xx, 5xx)
└─ Database query time

Database:
├─ Connection pool usage
├─ Slow query log (>100ms)
├─ Row count trends
└─ Cache hit rate

Celery:
├─ Queue depth
├─ Task success/failure rate
├─ Average task duration
└─ Worker availability

Alerts:
├─ High error rate (>1%)
├─ Response time spike
├─ DB connection exhaustion
├─ Redis memory high (>80%)
└─ Worker crash
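
Two of the items above are simple to compute from raw samples: response time percentiles and threshold alerts. The sketch below uses the nearest-rank percentile method and the thresholds from the alert list (error rate above 1%, Redis memory above 80%). The function and alert names are hypothetical.

```python
import math


def percentile(samples: list, p: int):
    """Return the nearest-rank p-th percentile of `samples`."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def triggered_alerts(error_count: int, request_count: int,
                     redis_used_bytes: int, redis_max_bytes: int) -> list:
    """Return the names of the alerts whose thresholds are exceeded."""
    alerts = []
    if request_count and error_count / request_count > 0.01:
        alerts.append("high_error_rate")
    if redis_max_bytes and redis_used_bytes / redis_max_bytes > 0.80:
        alerts.append("redis_memory_high")
    return alerts
```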

Summary

This architecture is designed to:

✅ Scale horizontally - Add more containers/workers easily
✅ Handle 600+ concurrent users - Through caching, pooling, optimization
✅ Maintain low latency - <200ms p95 response time
✅ Ensure reliability - 99.9% uptime with proper monitoring
✅ Be maintainable - Clear separation of concerns, modular design