🐕 Woofalytics v2.5.0

AI-powered dog bark detection and cataloging

A complete modernization of the original woofalytics project, built for cataloging and fingerprinting barking dogs within earshot. Uses zero-shot audio classification (CLAP) to detect barks without training data, with automatic recording for documentation purposes.

Project Goals

This project was created with specific intentions:

Learning - Push modern Python patterns to the limits (deliberately over-engineered)
Dog Cataloging - Document and fingerprint all barking dogs within earshot
Best Practices - Latest patterns, proper architecture, comprehensive documentation

Key Features

Zero-Shot Bark Detection - CLAP-powered classification without training data (~500ms inference)
Multi-Layer Veto System - Rejects speech, percussion, and bird sounds to reduce false positives
Direction of Arrival (DOA) - Know which direction barks come from using stereo microphones
Evidence Recording - Automatic 30-second clips with JSON metadata sidecars
Dog Fingerprinting - Identify and track individual dogs by bark signature
Bark Management - Reassign barks to different dogs, untag, or delete directly from dog profiles
Last Heard Tracking - See when each dog was last detected with accurate timestamps
Webhook Notifications - Configurable webhooks for bark alerts with customizable payloads
Quiet Hours - Schedule reduced sensitivity periods (e.g., nighttime) via Settings UI
Clustering Analysis - Visual interface for analyzing untagged barks and creating dog profiles
Modern Web UI - Real-time dashboard with WebSocket updates and persistent statistics
Accessible by Design - Aims for WCAG AA compliance, with screen reader support and respect for reduced-motion preferences
REST API - Full OpenAPI documentation at /api/docs
Docker Support - Easy deployment with Docker Compose
Flexible Configuration - YAML config with environment variable overrides
AI Summaries - LLM-generated weekly/custom-range bark reports via Ollama (optional)
Legacy MLP Support - Optional TorchScript models for faster inference

Architecture Overview

┌───────────────────────────────────────────────────────────────┐
│                      FastAPI Application                      │
│  ┌────────────┐  ┌────────────┐  ┌────────────────────────┐   │
│  │  REST API  │  │  WebSocket │  │     Static Files       │   │
│  │  /api/*    │  │  /ws/bark  │  │     /static/*          │   │
│  │            │  │/ws/pipeline│  │                        │   │
│  └─────┬──────┘  └─────┬──────┘  └────────────────────────┘   │
│        │               │                                      │
│        └───────┬───────┘                                      │
│                ▼                                              │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │                     BarkDetector                        │  │
│  │  - Coordinates audio capture, inference, callbacks      │  │
│  │  - Runs inference loop every 500ms (CLAP) or 80ms (MLP) │  │
│  │  - Produces BarkEvent objects                           │  │
│  └─────────────────────────────────────────────────────────┘  │
│        │                │                   │                 │
│        ▼                ▼                   ▼                 │
│  ┌───────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │ Audio     │  │  VAD Gate   │  │     DOA Estimator       │  │
│  │ Capture   │  │ (fast skip) │  │  (Bartlett/Capon/MEM)   │  │
│  └───────────┘  └──────┬──────┘  └─────────────────────────┘  │
│                        ▼                                      │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │                    CLAP Detector                        │  │
│  │  - Zero-shot audio classification (laion/clap-htsat)    │  │
│  │  - Multi-label veto (speech, percussion, birds)         │  │
│  │  - Rolling window + high-confidence bypass              │  │
│  └─────────────────────────────────────────────────────────┘  │
│        │                                                      │
│        ▼                                                      │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │                   EvidenceStorage                       │  │
│  │  - Records WAV clips on bark detection                  │  │
│  │  - Creates JSON metadata sidecars                       │  │
│  │  - Maintains evidence index                             │  │
│  └─────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────┘

Data Flow

Audio Capture (audio/capture.py) runs in a background thread, filling a ring buffer
BarkDetector (detection/model.py) reads ~100 frames (1 second) from buffer every 500ms
VAD Gate (detection/vad.py) fast-rejects silent audio before expensive CLAP inference
YAMNet Gate (detection/yamnet.py) pre-filters VAD-passed audio and skips CLAP for non-dog sounds
CLAP Detector (detection/clap.py) runs zero-shot classification with multi-label veto:
- Compares "dog barking" against speech, percussion, bird, and other sound labels
- Uses a rolling window (2 of the last 3 results must be positive) to smooth detections
- High-confidence barks (≥80%) bypass rolling window for instant detection
- Detection cooldown prevents rapid-fire triggers from the same sound
DOA Estimator (detection/doa.py) calculates direction using pyargus algorithms
BarkEvent is created and broadcast to all registered callbacks
EvidenceStorage (evidence/storage.py) records clips when barks are detected
WebSocket broadcasts events to connected web clients in real-time

Note: Legacy MLP mode uses 80ms inference with TorchScript for faster but less accurate detection.

Detection Pipeline

Woofalytics uses a multi-stage filtering approach to balance accuracy with performance:

Audio Input → VAD Gate → YAMNet Gate → CLAP Detector → Bark Event
                ↓            ↓              ↓
             (skip)       (skip)        (detect)

1. VAD Gate (Voice Activity Detection)

Purpose: Fast energy-based rejection of silent audio
Method: RMS energy threshold in dB
Skip Rate: ~60-80% of frames (environment dependent)
Latency: <1ms
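
A minimal sketch of an energy gate like the one described above, assuming 16-bit PCM samples and the -40 dBFS default from config.yaml. The names rms_dbfs and passes_vad are illustrative and are not taken from detection/vad.py.

```python
import math

FULL_SCALE = 32768.0  # magnitude of the most negative 16-bit PCM sample


def rms_dbfs(samples):
    """Return the RMS level of 16-bit PCM samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no finite dB level
    return 20.0 * math.log10(math.sqrt(mean_square) / FULL_SCALE)


def passes_vad(samples, threshold_db=-40.0):
    """True when the chunk is loud enough to be worth passing downstream."""
    return rms_dbfs(samples) >= threshold_db
```

A chunk at roughly 10% of full scale sits near -20 dBFS and passes the gate, while digital silence is always rejected.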

2. YAMNet Gate (Pre-filter)

Purpose: Skip CLAP inference for non-dog sounds
Model: Google's YAMNet (TensorFlow, ~3.7M params)
Classes: AudioSet class 69 (Dog) and 70 (Bark)
Threshold: 0.05 (kept low to avoid missing barks)
Skip Rate: 30-40% of VAD-passed frames
Latency: ~50ms
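
The gate decision itself can be sketched in a few lines. This assumes the caller already has one score per YAMNet class for the current window (for example, per-frame scores averaged over time); the function name yamnet_gate_passes is hypothetical and the model call is left out.

```python
DOG_CLASS = 69   # class index for "Dog", as listed above
BARK_CLASS = 70  # class index for "Bark", as listed above


def yamnet_gate_passes(class_scores, threshold=0.05):
    """Return True if the dog or bark score reaches the threshold.

    class_scores is assumed to hold one score per YAMNet class
    for the audio window being checked.
    """
    dog_score = max(class_scores[DOG_CLASS], class_scores[BARK_CLASS])
    return dog_score >= threshold
```

Keeping the threshold low means the gate only drops audio that YAMNet is fairly sure contains no dog, which leaves the final decision to CLAP.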

3. CLAP Detector (Primary)

Purpose: Zero-shot audio classification with multi-label veto
Model: LAION CLAP (laion/clap-htsat-unfused)
Features:
- Compares bark labels against speech, percussion, birds
- Rolling window (2 of the last 3 results must be positive)
- High-confidence bypass (≥80%)
- Detection cooldown prevents rapid-fire
Latency: ~500ms
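
The interaction between the rolling window, the high-confidence bypass, and the cooldown can be sketched as follows. This is an illustration under assumptions, not the code in detection/clap.py: the class name BarkSmoother, the cooldown of 2 frames, and the choice to clear the window after a detection are all hypothetical.

```python
from collections import deque


class BarkSmoother:
    """Rolling-window vote with high-confidence bypass and cooldown."""

    def __init__(self, threshold=0.6, window_size=3, min_positives=2,
                 bypass_threshold=0.8, cooldown_frames=2):
        self.threshold = threshold
        self.min_positives = min_positives
        self.bypass_threshold = bypass_threshold
        self.cooldown_frames = cooldown_frames  # assumed default
        self._history = deque(maxlen=window_size)
        self._cooldown_left = 0

    def update(self, bark_probability, vetoed=False):
        """Feed one inference result; return True when a bark should fire."""
        positive = bark_probability >= self.threshold and not vetoed
        self._history.append(positive)

        if self._cooldown_left > 0:
            self._cooldown_left -= 1  # still suppressing the previous bark
            return False

        bypass = positive and bark_probability >= self.bypass_threshold
        voted = sum(self._history) >= self.min_positives
        if bypass or voted:
            self._cooldown_left = self.cooldown_frames
            self._history.clear()
            return True
        return False
```

With these defaults, two consecutive results at 0.7 fire on the second one, a single unvetoed result at 0.85 fires immediately, and the next two results after any detection are suppressed.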

Monitor pipeline status in real-time via the Dashboard's Detection Pipeline card.

File Structure

woofalytics-v2/
├── src/woofalytics/             # Python backend
│   ├── __init__.py              # Package version and exports
│   ├── __main__.py              # CLI entry point (python -m woofalytics)
│   ├── app.py                   # FastAPI application with lifespan
│   ├── config.py                # Pydantic v2 settings system
│   │
│   ├── audio/
│   │   ├── __init__.py          # Module exports
│   │   ├── devices.py           # Microphone discovery (PyAudio wrapper)
│   │   └── capture.py           # Async audio capture with ring buffer
│   │
│   ├── detection/
│   │   ├── __init__.py          # Module exports
│   │   ├── model.py             # BarkDetector orchestrator + BarkEvent
│   │   ├── clap.py              # CLAP zero-shot classifier (primary)
│   │   ├── yamnet.py            # YAMNet pre-filter gate (TensorFlow)
│   │   ├── vad.py               # Voice activity detection gate
│   │   ├── features.py          # Mel filterbank feature extraction (legacy)
│   │   ├── doa.py               # Direction of arrival estimation
│   │   └── resample_cache.py    # Cached audio resampling
│   │
│   ├── events/
│   │   ├── __init__.py          # Module exports
│   │   ├── manager.py           # Notification manager orchestrator
│   │   ├── debouncer.py         # Per-dog notification debouncing
│   │   ├── models.py            # Event data models
│   │   └── webhook.py           # IFTTT and custom webhook delivery
│   │
│   ├── evidence/
│   │   ├── __init__.py          # Module exports
│   │   ├── storage.py           # Evidence recording and management
│   │   └── metadata.py          # JSON metadata models
│   │
│   ├── fingerprint/             # Dog identification system
│   │   ├── __init__.py
│   │   ├── storage.py           # SQLite fingerprint database
│   │   ├── matcher.py           # CLAP embedding matching
│   │   ├── extractor.py         # Feature extraction for fingerprints
│   │   ├── acoustic_features.py # Acoustic feature computation
│   │   ├── acoustic_matcher.py  # Acoustic similarity matching
│   │   ├── clustering.py        # HDBSCAN bark clustering
│   │   └── models.py            # Fingerprint data models
│   │
│   ├── observability/
│   │   ├── __init__.py          # Module exports
│   │   └── metrics.py           # Prometheus-format metrics
│   │
│   ├── prompts/
│   │   └── weekly_summary.prompty  # Jinja2 prompt template for AI summaries
│   │
│   └── api/
│       ├── __init__.py          # Module exports
│       ├── auth.py              # API key authentication
│       ├── ratelimit.py         # Rate limiting (slowapi)
│       ├── routes.py            # Core REST API endpoints
│       ├── routes_export.py     # CSV/JSON data export
│       ├── routes_fingerprint.py # Dog profiles and bark tagging
│       ├── routes_notification.py # Notification status
│       ├── routes_settings.py   # Runtime settings management
│       ├── routes_summary.py    # Daily/weekly/monthly summaries + AI
│       ├── schemas.py           # Core Pydantic response models
│       ├── schemas_export.py    # Export response models
│       ├── schemas_fingerprint.py # Fingerprint response models
│       ├── schemas_summary.py   # Summary response models
│       └── websocket.py         # WebSocket endpoints + ConnectionManager
│
├── frontend/                    # SvelteKit frontend (NASA Mission Control theme)
│   ├── src/
│   │   ├── routes/              # SvelteKit pages
│   │   │   ├── +page.svelte     # Dashboard with real-time monitoring
│   │   │   ├── dogs/            # Dog management page
│   │   │   ├── fingerprints/    # Fingerprints explorer
│   │   │   ├── reports/         # Bark activity reports
│   │   │   └── settings/        # Settings & maintenance
│   │   ├── lib/
│   │   │   ├── api/             # Type-safe API client (openapi-fetch)
│   │   │   ├── components/      # Reusable UI components
│   │   │   └── stores/          # Svelte stores for WebSocket state
│   │   └── app.css              # Global styles (glassmorphism theme)
│   ├── build/                   # Production build (gitignored)
│   ├── package.json
│   └── svelte.config.js
│
├── static/                      # Evidence audio files (served at /static)
│
├── models/
│   └── traced_model.pt          # TorchScript bark detection model
│
├── evidence/                    # Evidence recordings (created at runtime)
│
├── tests/
│   ├── __init__.py
│   ├── conftest.py              # Pytest fixtures
│   ├── test_api_routes.py       # API endpoint tests
│   ├── test_api_websocket.py    # WebSocket tests
│   ├── test_audio.py            # Audio module tests
│   ├── test_config.py           # Configuration tests
│   ├── test_detection.py        # Detection module tests
│   ├── test_evidence.py         # Evidence module tests
│   ├── test_export.py           # Data export tests
│   ├── test_fingerprint_clustering.py  # Clustering tests
│   ├── test_fingerprint_matching.py    # Fingerprint matching tests
│   ├── test_quiet_hours.py      # Quiet hours tests
│   ├── test_resample_cache.py   # Resample cache tests
│   ├── test_summary.py          # Summary endpoint tests
│   └── test_yamnet.py           # YAMNet gate tests
│
├── pyproject.toml               # Python packaging (PEP 517/518)
├── Dockerfile                   # Multi-stage Docker build
├── docker-compose.yml           # Docker Compose deployment
├── config.yaml                  # Default configuration
├── .env.example                 # Environment variable template
└── README.md                    # This file

Module Documentation

`config.py` - Configuration System

Pattern: Pydantic v2 with proper nesting (BaseModel for nested, BaseSettings for root only)

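from pydantic import BaseModel, Field
from pydantic_settings import BaseSettings, SettingsConfigDict

# Nested configs use BaseModel (NOT BaseSettings)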
class AudioConfig(BaseModel):
    device_name: str | None = None
    sample_rate: int = 44100
    channels: int = 2
    # ...

# Only root uses BaseSettings
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_prefix="WOOFALYTICS__",
        env_nested_delimiter="__",
    )
    audio: AudioConfig = Field(default_factory=AudioConfig)
    # ...

Environment Variables:

Prefix: WOOFALYTICS__
Nested delimiter: __
Example: WOOFALYTICS__AUDIO__SAMPLE_RATE=48000
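
To make the naming rule concrete, the sketch below shows how prefixed variable names map onto nested config keys. It is a standalone illustration of the convention, not how pydantic-settings is implemented internally, and env_to_nested is a hypothetical helper.

```python
def env_to_nested(environ, prefix="WOOFALYTICS__", delimiter="__"):
    """Turn prefixed environment variables into a nested dict of overrides."""
    result = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable such as PATH
        keys = name[len(prefix):].lower().split(delimiter)
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # descend, creating levels as needed
        node[keys[-1]] = value  # values stay strings; validation converts them later
    return result


overrides = env_to_nested({
    "WOOFALYTICS__AUDIO__SAMPLE_RATE": "48000",
    "WOOFALYTICS__LOG_LEVEL": "DEBUG",
    "PATH": "/usr/bin",
})
```

Here overrides ends up as a dict with "sample_rate" nested under "audio" and "log_level" at the top level, mirroring the layout of config.yaml.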

`audio/devices.py` - Microphone Discovery

MicrophoneInfo - Dataclass for device info
list_microphones(min_channels) - List all input devices
find_microphone(device_name, min_channels) - Auto-detect or filter by name
set_microphone_volume(percent) - ALSA amixer wrapper (Linux only)

`audio/capture.py` - Async Audio Capture

AudioFrame - Single frame with timestamp, raw bytes, metadata
AsyncAudioCapture - Runs PyAudio in background thread, async interface
- Ring buffer (default 30 seconds)
- get_recent_frames(count) - Get N most recent frames
- get_buffer_as_array(seconds) - Get audio as numpy array

`detection/features.py` - Feature Extraction

FeatureExtractor - Converts audio to Mel filterbank features
- Resamples from source rate (44.1kHz) to model rate (16kHz)
- 80 Mel bins, 25ms frame, 10ms hop
- Uses torchaudio.compliance.kaldi.fbank for Kaldi compatibility
- Output: (1, 480) tensor (6 frames × 80 mels)
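
The (1, 480) shape follows from the framing arithmetic. The check below assumes the model sees an 80 ms window of 16 kHz audio and that partial frames at the edges are dropped rather than padded; neither assumption is stated explicitly above.

```python
def num_frames(num_samples, sample_rate=16000, frame_ms=25, hop_ms=10):
    """Count full analysis frames that fit in a window (no edge padding)."""
    frame_len = sample_rate * frame_ms // 1000   # 400 samples at 16 kHz
    hop_len = sample_rate * hop_ms // 1000       # 160 samples at 16 kHz
    if num_samples < frame_len:
        return 0
    return 1 + (num_samples - frame_len) // hop_len


window_samples = 16000 * 80 // 1000              # 80 ms of 16 kHz audio
frames = num_frames(window_samples)
feature_size = frames * 80                       # 80 Mel bins per frame
print(frames, feature_size)
```

This prints 6 480: six 25 ms frames at a 10 ms hop fit in 80 ms, and flattening 6 frames of 80 Mel bins gives the 480-element feature vector.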

`detection/yamnet.py` - YAMNet Pre-filter Gate

YAMNetGate - TensorFlow-based pre-filter (~3.7M params)
- Uses Google's YAMNet to detect dog/bark audio classes
- Skips expensive CLAP inference for non-dog sounds
- Falls back to CLAP-only if TensorFlow fails to load

`detection/resample_cache.py` - Cached Resampling

Caches resampled audio to avoid redundant computation across pipeline stages

`detection/doa.py` - Direction of Arrival

DirectionEstimator - Estimates sound direction using ULA or UCA geometry
- Bartlett - Simple beamforming (default)
- Capon (MVDR) - Higher resolution
- MEM - Maximum entropy method, best for closely spaced sources
angle_to_direction(angle) - Converts degrees to compass directions
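
For intuition, Bartlett beamforming on a uniform linear array can be written out directly. The sketch below is pure Python and does not use the pyargus API; the function names are hypothetical, and the covariance matrix is synthetic and noise-free, built for a single source at 60 degrees.

```python
import cmath
import math


def steering_vector(angle_deg, num_elements=2, spacing=0.1):
    """Phase pattern across a ULA for a plane wave from angle_deg.

    spacing is the inter-element distance in wavelengths.
    """
    phase_step = 2.0 * math.pi * spacing * math.cos(math.radians(angle_deg))
    return [cmath.exp(1j * phase_step * m) for m in range(num_elements)]


def bartlett_spectrum(covariance, angles, num_elements=2, spacing=0.1):
    """Bartlett power a^H R a for each candidate angle."""
    powers = []
    for angle in angles:
        a = steering_vector(angle, num_elements, spacing)
        power = sum(
            a[i].conjugate() * covariance[i][j] * a[j]
            for i in range(num_elements)
            for j in range(num_elements)
        )
        powers.append(power.real)
    return powers


# Synthetic, noise-free covariance R = s s^H for a source at 60 degrees.
source = steering_vector(60.0)
covariance = [[si * sj.conjugate() for sj in source] for si in source]

angles = list(range(0, 181))
spectrum = bartlett_spectrum(covariance, angles)
estimated_angle = angles[spectrum.index(max(spectrum))]
```

The spectrum peaks at the true angle, but with only two closely spaced elements the peak is very broad, which is one reason higher-resolution methods such as Capon and MEM are offered as alternatives.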

`detection/clap.py` - CLAP Zero-Shot Classifier (Primary)

CLAPConfig - Configuration for CLAP detection
- bark_labels - Positive bark sound labels
- speech_labels - Human speech for veto
- percussive_labels - Claps, knocks for veto
- bird_labels - Bird sounds for veto
- threshold, speech_veto_threshold, bird_veto_threshold
- rolling_window_size, detection_cooldown_frames
CLAPDetector - Zero-shot audio classifier using LAION CLAP
- Uses laion/clap-htsat-unfused model by default
- Caches text embeddings for efficiency
- Multi-label detection with veto system
- Rolling window smoothing with high-confidence bypass
- Detection cooldown to prevent rapid-fire triggers

`detection/vad.py` - Voice Activity Detection Gate

VADConfig - Configuration for VAD gate
VADGate - Fast energy-based rejection of silent audio
- Skips expensive CLAP inference on silent frames
- Configurable energy threshold in dB

`detection/model.py` - Bark Detector Orchestrator

BarkEvent - Detection event with timestamp, probability, DOA
BarkDetector - Main orchestrator
- Supports both CLAP (default) and legacy MLP modes
- CLAP mode: 500ms inference interval with 1s audio windows
- Legacy mode: 80ms inference interval with TorchScript
- Manages callbacks for event notification
- Tracks statistics (uptime, total barks, VAD skips)

`evidence/metadata.py` - Metadata Models

DetectionInfo - Probability, bark count, DOA values
DeviceInfo - Hostname, microphone name
EvidenceMetadata - Complete metadata for a recording
EvidenceIndex - Index of all evidence files

`evidence/storage.py` - Evidence Storage

EvidenceStorage - Records bark clips
- Triggers on bark detection
- Records past context (15s) + future context (15s)
- Saves WAV + JSON sidecar
- Maintains searchable index
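
A WAV clip with a JSON sidecar can be produced with the standard library alone. The sketch below assumes 16-bit PCM input; the function name save_evidence, the file naming scheme, and the metadata field names are hypothetical and may differ from what evidence/storage.py and evidence/metadata.py actually use.

```python
import json
import wave
from datetime import datetime, timezone
from pathlib import Path


def save_evidence(directory, pcm_bytes, sample_rate, channels, probability):
    """Write 16-bit PCM audio as a WAV file plus a JSON sidecar."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    now = datetime.now(timezone.utc)
    stem = now.strftime("bark_%Y%m%dT%H%M%S")  # hypothetical naming scheme
    wav_path = directory / f"{stem}.wav"
    json_path = directory / f"{stem}.json"

    with wave.open(str(wav_path), "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)

    bytes_per_second = sample_rate * channels * 2
    metadata = {  # hypothetical field names
        "timestamp": now.isoformat(),
        "probability": probability,
        "duration_seconds": len(pcm_bytes) / bytes_per_second,
        "audio_file": wav_path.name,
    }
    json_path.write_text(json.dumps(metadata, indent=2))
    return wav_path, json_path
```

Under the 16-bit assumption, a 30-second stereo clip at 44.1 kHz (15 s of past context plus 15 s of future context) is about 5.3 MB of PCM data.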

`events/manager.py` - Notification Manager

NotificationManager - Orchestrates bark alert notifications
- Integrates quiet hours, debouncing, and webhook delivery
- Runs webhook calls in a thread pool to avoid blocking

`events/debouncer.py` - Notification Debouncing

Per-dog rate limiting to prevent notification spam
Configurable debounce window (default 5 minutes)
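
Per-dog debouncing can be sketched as a dictionary of last-sent times. The class name DogDebouncer and the injectable clock are assumptions made for illustration and testing; the 300-second window matches the default above.

```python
import time


class DogDebouncer:
    """Allow at most one notification per dog per debounce window."""

    def __init__(self, window_seconds=300.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock
        self._last_sent = {}  # dog_id -> time of the last allowed notification

    def should_notify(self, dog_id):
        now = self._clock()
        last = self._last_sent.get(dog_id)
        if last is not None and now - last < self.window_seconds:
            return False  # still inside this dog's debounce window
        self._last_sent[dog_id] = now
        return True
```

Because the state is keyed by dog, one dog barking continuously does not suppress the first alert for a different dog.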

`events/webhook.py` - Webhook Delivery

IFTTT Maker Webhooks and custom HTTPS webhook support
SSRF protection (blocks private IPs and internal hostnames)
Retries with a configurable timeout
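
The kind of URL check that SSRF protection involves can be sketched with the standard library. This is a simplified illustration, not the logic in events/webhook.py: the blocked hostname lists are assumed examples, and DNS names are accepted without resolving them, which a complete check would not do.

```python
import ipaddress
from urllib.parse import urlparse

BLOCKED_HOSTNAMES = {"localhost"}           # assumed examples
BLOCKED_SUFFIXES = (".local", ".internal")  # assumed examples


def is_safe_webhook_url(url):
    """Reject non-HTTPS URLs and hosts that are obviously internal."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host:
        return False
    if host in BLOCKED_HOSTNAMES or host.endswith(BLOCKED_SUFFIXES):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; a full check would also vet resolved IPs
    return not (address.is_private or address.is_loopback
                or address.is_link_local or address.is_reserved)
```

Literal addresses such as 192.168.1.10 or ::1 are rejected, while a public HTTPS host is allowed through.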

`fingerprint/extractor.py` - Feature Extraction

Extracts CLAP embeddings and acoustic features from bark audio

`fingerprint/acoustic_features.py` - Acoustic Feature Computation

Computes spectral centroid, bandwidth, rolloff, and other acoustic features for bark characterization

`fingerprint/acoustic_matcher.py` - Acoustic Similarity

Weighted acoustic feature similarity for dog matching

`fingerprint/clustering.py` - Bark Clustering

HDBSCAN-based clustering of untagged bark fingerprints for discovering new dogs

`observability/metrics.py` - Prometheus Metrics

Prometheus-compatible metrics endpoint (/api/metrics)
Tracks bark counts, inference latency, VAD/YAMNet skip rates, evidence storage

`api/auth.py` - Authentication

Optional API key authentication via X-API-Key header
Configurable via server.api_key or WOOFALYTICS__SERVER__API_KEY
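
From the client side, authentication only requires adding the header. The sketch below builds a request with the standard library without sending it; build_api_request is a hypothetical helper, "example-key" is a placeholder, and the base URL assumes the default host and port from config.yaml.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_api_request(path, api_key=None, params=None,
                      base_url="http://127.0.0.1:8000"):
    """Build (but do not send) a GET request for a Woofalytics endpoint."""
    url = base_url + path
    if params:
        url += "?" + urlencode(params)
    headers = {"Accept": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key  # header name documented above
    return Request(url, headers=headers, method="GET")


request = build_api_request("/api/bark/recent", api_key="example-key",
                            params={"count": 10})
# Send it with urllib.request.urlopen(request) once the server is running.
```

When no API key is configured on the server, the same helper can be called without api_key and the header is simply omitted.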

`api/ratelimit.py` - Rate Limiting

Per-endpoint rate limiting using slowapi
Configurable limits for read, write, download, and WebSocket operations

`api/routes.py` - Core REST Endpoints

See API Reference below.

`api/websocket.py` - WebSocket Streaming

ConnectionManager - Manages active WebSocket connections
/ws/bark - Real-time bark events
/ws/pipeline - Detection pipeline state at 10Hz (VAD/YAMNet/CLAP stages, stats)

`app.py` - FastAPI Application

Uses lifespan context manager for startup/shutdown
Dependency injection via app.state
Mounts static files, includes routers

Configuration System

config.yaml

audio:
  device_name: null        # null = auto-detect, or specific name e.g. "pulse"
  sample_rate: 44100       # Hz
  channels: 2              # Minimum 2 for DOA (use 4 for circular arrays)
  chunk_size: 441          # Samples per chunk (~10ms at 44.1kHz)
  volume_percent: 75       # Microphone gain (0-100)

model:
  use_clap: true           # Use CLAP zero-shot (recommended)
  clap_model: laion/clap-htsat-unfused
  clap_threshold: 0.6      # Bark confidence threshold (0.0-1.0)
  clap_bird_veto_threshold: 0.15  # Bird veto threshold (lower = more aggressive)
  clap_min_harmonic_ratio: 0.1    # Minimum harmonic ratio (0 to disable)
  clap_device: cpu         # or cuda
  vad_enabled: true        # Fast rejection of silent audio
  vad_threshold_db: -40    # Energy threshold for VAD (dBFS)
  yamnet_enabled: true     # YAMNet pre-filter (skips CLAP on non-dog audio)
  yamnet_threshold: 0.05   # YAMNet dog probability threshold (kept low)
  # Legacy MLP settings (when use_clap: false)
  path: ./models/traced_model.pt
  target_sample_rate: 16000
  threshold: 0.88

doa:
  enabled: true
  array_type: ula          # 'ula' (linear) or 'uca' (circular)
  element_spacing: 0.1     # Inter-element spacing in wavelengths (ULA)
  radius: 0.1              # Array radius in wavelengths (UCA, ~0.093 for ReSpeaker 4-Mic)
  num_elements: 2          # Number of microphone elements
  angle_min: 0
  angle_max: 180           # Use 360 for UCA
  method: bartlett         # 'bartlett', 'capon', or 'mem'

evidence:
  directory: ./evidence
  past_context_seconds: 15
  future_context_seconds: 15

notification:
  enabled: false           # Enable notification system

webhook:
  enabled: false
  ifttt_event: woof
  # ifttt_key: set via environment
  debounce_seconds: 300    # Min seconds between notifications per dog

quiet_hours:
  enabled: false
  start: "22:00"           # Quiet period start (HH:MM)
  end: "06:00"             # Quiet period end (HH:MM)
  threshold: 0.9           # Higher threshold during quiet hours
  notifications: false     # Suppress notifications during quiet hours
  timezone: UTC            # IANA timezone (e.g. 'Australia/Sydney')

server:
  host: 127.0.0.1          # Localhost only by default (use 0.0.0.0 for network access)
  port: 8000
  api_key: null            # Set for API authentication (generate with: python -c 'import secrets; print(secrets.token_hex(16))')
  rate_limit:
    enabled: true
    read_limit: "120/minute"
    write_limit: "30/minute"

log_level: INFO            # DEBUG, INFO, WARNING, ERROR
log_format: console        # console or json
debug: false               # Enable debug diagnostics
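
The quiet_hours window in the default config (22:00 to 06:00) crosses midnight, so a plain start <= now < end comparison is not enough. The sketch below shows one way to handle the wrap, assuming now is already a local time in the configured timezone; the function names are hypothetical, and 0.6 and 0.9 are the clap_threshold and quiet-hours threshold from the config above.

```python
from datetime import time


def in_quiet_hours(now, start=time(22, 0), end=time(6, 0)):
    """Return True if `now` falls inside the quiet window.

    A window whose start is later than its end (22:00 to 06:00)
    is treated as wrapping past midnight.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def effective_threshold(now, normal=0.6, quiet=0.9):
    """Pick the detection threshold that applies at local time `now`."""
    return quiet if in_quiet_hours(now) else normal
```

At 23:30 and at 02:00 the higher threshold applies; at 06:00 exactly, the quiet period has ended and the normal threshold is back in effect.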

Environment Variables

# Override any config value
WOOFALYTICS__LOG_LEVEL=DEBUG
WOOFALYTICS__MODEL__THRESHOLD=0.90
WOOFALYTICS__AUDIO__DEVICE_NAME=ReSpeaker
WOOFALYTICS__WEBHOOK__IFTTT_KEY=your_secret_key

AI Summaries (Ollama)

The /api/summary/weekly/ai and /api/summary/ai endpoints generate natural-language bark reports using a local LLM via Ollama. This is entirely optional -- all other summary endpoints work without it.

Setup:

# Install Ollama (https://ollama.com/download)
curl -fsSL https://ollama.com/install.sh | sh

# Pull the default model
ollama pull qwen2.5:3b

Environment variables:

Variable	Default	Description
`OLLAMA_URL`	`http://localhost:11434`	Ollama API base URL
`OLLAMA_MODEL`	`qwen2.5:3b`	Model to use for generation

Hardware note: The default qwen2.5:3b model requires ~2GB RAM (Q4 quantized). Since woofalytics itself already uses significant RAM for CLAP + YAMNet, you'll want at least 8GB total if running Ollama on the same machine. Summaries are generated on-demand so generation speed isn't critical -- a few seconds on a modern x86 CPU is typical. You can also point OLLAMA_URL at a remote Ollama instance to offload generation entirely.

If Ollama is not running, the AI summary endpoints return a 503 error; all other functionality is unaffected.

API Reference

Health & Status

Endpoint	Method	Description
`/api/health`	GET	Health check with uptime, bark count, evidence count
`/api/status`	GET	Detector status (running, uptime, last event, gate stats)
`/api/config`	GET	Current configuration (sanitized, no secrets)
`/api/metrics`	GET	Prometheus-format metrics

Bark Detection

Endpoint	Method	Description
`/api/bark`	GET	Latest bark event
`/api/bark/probability`	GET	Just the probability value
`/api/bark/recent?count=10`	GET	Recent events (1-100)
`/api/direction`	GET	Current DOA with all methods

Evidence

Endpoint	Method	Description
`/api/evidence?count=20`	GET	List recent evidence
`/api/evidence/stats`	GET	Storage statistics
`/api/evidence/{filename}`	GET	Download WAV or JSON file
`/api/evidence/date/{YYYY-MM-DD}`	GET	Evidence by date
`/api/evidence/purge`	POST	Purge evidence older than N days

Dog Profiles & Fingerprints

Endpoint	Method	Description
`/api/dogs`	GET	List all dog profiles
`/api/dogs`	POST	Create a new dog profile
`/api/dogs/{id}`	GET	Get dog profile
`/api/dogs/{id}`	PUT	Update dog profile
`/api/dogs/{id}`	DELETE	Delete dog profile
`/api/dogs/{id}/barks`	GET	Get barks for a specific dog
`/api/dogs/{id}/confirm`	POST	Confirm a dog profile
`/api/dogs/{id}/unconfirm`	POST	Unconfirm a dog profile
`/api/dogs/{id}/reset-embedding`	POST	Reset dog's embedding
`/api/dogs/merge`	POST	Merge two dog profiles
`/api/fingerprints`	GET	List fingerprints (with filtering)
`/api/fingerprints/aggregates`	GET	Fingerprint aggregate stats
`/api/fingerprints/stats`	GET	Fingerprint system statistics
`/api/fingerprints/{id}`	DELETE	Delete a fingerprint
`/api/fingerprints/purge`	POST	Purge fingerprints older than N days
`/api/fingerprints/purge-without-evidence`	POST	Remove orphaned fingerprints
`/api/fingerprints/recalculate-bark-counts`	POST	Recalculate bark counts

Bark Tagging

Endpoint	Method	Description
`/api/barks/untagged`	GET	List untagged barks
`/api/barks/{id}/tag`	POST	Tag a bark to a dog
`/api/barks/bulk-tag`	POST	Bulk tag multiple barks
`/api/barks/{id}/correct`	POST	Correct a bark's dog assignment
`/api/barks/{id}/untag`	POST	Remove a bark's tag
`/api/barks/{id}/reject`	POST	Mark a bark as false positive
`/api/barks/{id}/unreject`	POST	Un-reject a bark
`/api/barks/{id}/confirm`	POST	Confirm a bark detection
`/api/barks/{id}/unconfirm`	POST	Unconfirm a bark detection
`/api/barks/cluster`	POST	Cluster untagged barks (HDBSCAN)
`/api/barks/cluster/{id}/create-dog`	POST	Create dog from cluster

Summaries & Export

Endpoint	Method	Description
`/api/summary/daily`	GET	Daily bark summary
`/api/summary/weekly`	GET	Weekly bark summary
`/api/summary/monthly`	GET	Monthly bark summary
`/api/summary/range`	GET	Custom date range summary
`/api/summary/weekly/ai`	GET	AI-generated weekly summary (Ollama)
`/api/summary/ai`	GET	AI-generated range summary (Ollama)
`/api/export/json`	GET	Export bark data as JSON
`/api/export/csv`	GET	Export bark data as CSV
`/api/export/stats`	GET	Export statistics

Settings & Notifications

Endpoint	Method	Description
`/api/settings`	GET	Get all runtime settings
`/api/settings`	PUT	Update runtime settings (persisted to config.yaml)
`/api/notifications/status`	GET	Notification system status

WebSocket

Endpoint	Description
`/ws/bark`	Real-time bark events (JSON)
`/ws/pipeline`	Detection pipeline state at 10Hz (VAD/YAMNet/CLAP stages)

OpenAPI Documentation

Swagger UI: /api/docs
ReDoc: /api/redoc
OpenAPI JSON: /api/openapi.json

Web UI

The frontend is a SvelteKit SPA with a NASA Mission Control-inspired theme (glassmorphism, dark UI, cyan/amber accents).

Pages

Route	Description
`/`	Dashboard - Real-time bark probability, detection pipeline monitor, dog overview with last heard timestamps, persistent statistics
`/dogs`	Dog Management - View registered dogs, bark counts, last heard indicators, bark modal with reassign/untag/delete actions
`/fingerprints`	Fingerprints Explorer - Browse bark fingerprints with filtering, playback, and clustering analysis
`/reports`	Reports - Bark activity reports and trend analysis
`/settings`	Settings & Maintenance - Detection parameters, quiet hours, webhooks, fingerprint purge

Features

Real-time Updates - WebSocket streams for live bark events and audio levels
Type-safe API Client - Generated from OpenAPI schema using openapi-fetch
Svelte 5 Runes - Modern reactive state with $state, $derived, $effect
Responsive Design - Works on desktop and tablet
Evidence Playback - Listen to recorded bark clips directly in the browser
Bark Management Modal - View dog's barks with reassign, untag, and delete controls
Last Heard Indicators - Teal audio icon showing when each dog was last detected
Clustering UI - Visual bark clustering for pattern analysis and dog profile creation
Persistent Dashboard Stats - Bark counts survive page refreshes via API persistence
Toast Notifications - Non-blocking feedback replacing browser alerts
Active Navigation - Clear indication of current page with amber highlight
Accessibility - Targets WCAG AA text contrast, labeled form inputs, prefers-reduced-motion support

Production Serving

The SvelteKit frontend is built to static files and served directly by FastAPI. No separate Node.js server required in production.

Hardware Requirements

Minimum

Python 3.11+ with a working PyAudio/PortAudio installation
2GB+ RAM (CLAP + YAMNet models need memory)
Any microphone (1+ channels; 2+ for DOA)

Recommended for DOA

ReSpeaker 2-Mic HAT (~$12) - HAT form factor, 2 mics
ReSpeaker 4-Mic Array (~$35) - 360° coverage, use array_type: uca

ReSpeaker HAT Setup

# Install seeed-voicecard driver
git clone https://github.com/respeaker/seeed-voicecard
cd seeed-voicecard
sudo ./install.sh
sudo reboot

Installation

Quick Start (Docker)

git clone https://github.com/machug/woofalytics-v2.git
cd woofalytics-v2
cp .env.example .env
docker-compose up -d

Manual Installation

# System dependencies (Debian/Ubuntu)
sudo apt-get update
sudo apt-get install -y \
    python3.11 python3.11-venv \
    portaudio19-dev libasound2-dev \
    alsa-utils nodejs npm

# Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# Install Python package
pip install -e .

# Build frontend
cd frontend
npm install
npm run build
cd ..

# Verify audio devices
woofalytics --list-devices

# Run
woofalytics

CLI Options

woofalytics [OPTIONS]

Options:
  -c, --config PATH       Config file (default: config.yaml)
  --host TEXT             Override host
  -p, --port INTEGER      Override port
  --reload                Enable hot reload (dev)
  --log-level LEVEL       Override log level
  --list-devices          List audio devices and exit
  --version               Show version

Docker Deployment

Dockerfile Features

Multi-stage build (builder + runtime)
Non-root user (woofalytics)
Audio libraries pre-installed
Health check included
Evidence volume for persistence

docker-compose.yml

services:
  woofalytics:
    build: .
    container_name: woofalytics
    ports:
      - "8000:8000"
    devices:
      - /dev/snd:/dev/snd    # Audio device access
    group_add:
      - audio                 # Audio group membership
    volumes:
      - ./config.yaml:/home/woofalytics/app/config.yaml:ro
      - ./evidence:/home/woofalytics/app/evidence
      - ./models:/home/woofalytics/app/models:ro
    environment:
      - TZ=Europe/London
      - WOOFALYTICS__WEBHOOK__IFTTT_KEY=${IFTTT_KEY:-}
      - WOOFALYTICS__LOG_LEVEL=${LOG_LEVEL:-INFO}
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 1G
        reservations:
          memory: 512M
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s

Commands

# Build and start
docker-compose up -d --build

# View logs
docker-compose logs -f

# Stop
docker-compose down

# Rebuild after code changes
docker-compose up -d --build --force-recreate

Development

Setup

# Install with dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks (optional)
pre-commit install

# Install frontend dependencies
cd frontend && npm install && cd ..

Running (Backend)

# With hot reload
woofalytics --reload --log-level DEBUG

# Or directly with uvicorn
uvicorn woofalytics.app:app --reload --host 0.0.0.0 --port 8000

Running (Frontend Development)

# Start the SvelteKit dev server (auto-proxies API calls to backend)
cd frontend
npm run dev

# Frontend available at http://localhost:5173
# Backend must be running on port 8000

Building Frontend for Production

cd frontend
npm run build    # Outputs to frontend/build/
npm run preview  # Preview production build locally

Code Quality

# Python linting
ruff check src/woofalytics

# Python type checking
mypy src/woofalytics

# Python format
ruff format src/woofalytics

# Frontend type checking
cd frontend && npm run check

Testing

Run Tests

# All tests
pytest

# With coverage
pytest --cov=woofalytics --cov-report=html

# Specific module
pytest tests/test_config.py -v

# With output
pytest -s

Test Structure

conftest.py - Shared fixtures (mock PyAudio, test settings, etc.)
test_api_routes.py - API endpoint tests
test_api_websocket.py - WebSocket tests
test_audio.py - Audio frame and device tests
test_config.py - Configuration validation
test_detection.py - DOA and bark event tests
test_evidence.py - Metadata and storage tests
test_export.py - Data export tests
test_fingerprint_clustering.py - Bark clustering tests
test_fingerprint_matching.py - Fingerprint matching tests
test_quiet_hours.py - Quiet hours scheduling tests
test_resample_cache.py - Resample cache tests
test_summary.py - Summary endpoint tests
test_yamnet.py - YAMNet gate tests

Mocking

Tests mock PyAudio to run without audio hardware:

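from unittest.mock import patch
import pytest

@pytest.fixture
def mock_pyaudio():
    # Patch the PyAudio class so tests never touch real audio hardware
    with patch("pyaudio.PyAudio") as mock:
        # Configure mock device list
        yield mock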

Design Decisions

Why Pydantic v2 with BaseModel for Nested Configs?

Using BaseSettings for nested configs causes environment variable conflicts. The correct pattern:

BaseModel for nested configs (AudioConfig, ModelConfig, etc.)
BaseSettings only for root Settings class
Environment variables use the __ delimiter to address nested fields: WOOFALYTICS__AUDIO__SAMPLE_RATE
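
As a rough illustration of how the __ delimiter maps a flat variable name onto nested config fields, here is a standard-library sketch. It is not Pydantic's implementation and performs no validation. The helper name nest_env, the LOG_LEVEL variable, and the assumption that WOOFALYTICS__ is the prefix (inferred from the example variable above) are illustrative only.

```python
from typing import Mapping


def nest_env(environ: Mapping[str, str],
             prefix: str = "WOOFALYTICS__",
             delimiter: str = "__") -> dict:
    """Group prefixed environment variables into a nested dict.

    Hypothetical helper: it mimics the name-to-field mapping only.
    Values stay as strings; a real settings class would validate them.
    """
    nested: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable, e.g. PATH
        *parents, leaf = name[len(prefix):].lower().split(delimiter)
        node = nested
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


example = nest_env({
    "WOOFALYTICS__AUDIO__SAMPLE_RATE": "44100",
    "WOOFALYTICS__LOG_LEVEL": "DEBUG",  # hypothetical top-level field
    "PATH": "/usr/bin",
})
print(example)
```

Each __ after the prefix descends one level, so AUDIO__SAMPLE_RATE lands under the audio section as sample_rate, while a name with no further delimiter stays at the root.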

Why Async Audio Capture?

PyAudio's read calls block, while FastAPI runs on an async event loop. Solution:

Run PyAudio in a background daemon thread
Use thread-safe ring buffer (deque with lock)
Async methods for control (start(), stop())
Sync methods for buffer access (called from any context)
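
The four points above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's capture class: ThreadedCapture, snapshot(), and fake_read are hypothetical names, and fake_read stands in for a blocking PyAudio read so the sketch runs without audio hardware.

```python
import asyncio
import threading
import time
from collections import deque
from typing import Callable, List, Optional


class ThreadedCapture:
    """Runs a blocking read function in a daemon thread (hypothetical class)."""

    def __init__(self, read_frame: Callable[[], bytes], max_frames: int = 64):
        self._read_frame = read_frame            # stand-in for a blocking device read
        self._buffer = deque(maxlen=max_frames)  # ring buffer: oldest frames drop off
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def _run(self) -> None:
        while not self._stop.is_set():
            frame = self._read_frame()           # blocks, but only this thread waits
            with self._lock:
                self._buffer.append(frame)

    async def start(self) -> None:
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    async def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            # Join in a worker thread so the event loop stays responsive
            await asyncio.to_thread(self._thread.join)

    def snapshot(self) -> List[bytes]:
        """Sync accessor, safe to call from any thread or coroutine."""
        with self._lock:
            return list(self._buffer)


def fake_read() -> bytes:
    time.sleep(0.001)  # simulate a blocking device read
    return b"\x00" * 4


async def demo() -> int:
    capture = ThreadedCapture(fake_read, max_frames=8)
    await capture.start()
    await asyncio.sleep(0.1)   # the event loop is free while the thread reads
    await capture.stop()
    return len(capture.snapshot())


frames_held = asyncio.run(demo())
print(frames_held)
```

The deque's maxlen gives ring-buffer behavior for free: once eight frames are held, each append discards the oldest one, so memory stays bounded no matter how long capture runs.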

Why Three DOA Algorithms?

Each has trade-offs:

Bartlett - Robust, works well with noise
Capon - Better resolution, more sensitive to calibration
MEM - Best for multiple sources, computationally heavier
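
To make the Bartlett entry concrete, here is a pure-Python sketch for a two-microphone array. It is a simplification under stated assumptions, not the project's implementation: the function names are hypothetical, the microphones are assumed to sit half a wavelength apart, and a single noiseless snapshot replaces the averaged covariance matrix a practical estimator would use. Capon and MEM are not shown.

```python
import cmath
import math
from typing import List, Sequence


def steering_vector(theta_deg: float,
                    spacing_wavelengths: float = 0.5,
                    n_mics: int = 2) -> List[complex]:
    """Far-field phase pattern of a uniform linear array for one angle."""
    phase = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * m) for m in range(n_mics)]


def bartlett_power(snapshot: Sequence[complex], theta_deg: float) -> float:
    """Bartlett power |a(theta)^H x|^2 / M for a single snapshot x."""
    a = steering_vector(theta_deg, n_mics=len(snapshot))
    projection = sum(ai.conjugate() * xi for ai, xi in zip(a, snapshot))
    return abs(projection) ** 2 / len(snapshot)


def estimate_doa(snapshot: Sequence[complex], step_deg: int = 1) -> int:
    """Scan -90..90 degrees and return the angle with the highest power."""
    angles = range(-90, 91, step_deg)
    return max(angles, key=lambda theta: bartlett_power(snapshot, theta))


# Simulate a source at 30 degrees, then recover its direction
simulated = steering_vector(30)
estimated_angle = estimate_doa(simulated)
print(estimated_angle)
```

The scan simply asks which candidate direction best matches the observed phase difference between the two channels, which is why Bartlett tolerates noise well but cannot separate closely spaced sources.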

Why CLAP Instead of Custom Models?

CLAP (Contrastive Language-Audio Pretraining) offers key advantages:

Zero-shot - No training data required, works immediately
Multi-label - Can detect bark AND check for speech/birds simultaneously
Veto system - Reduces false positives by rejecting similar sounds
Generalizes - Works across dog breeds without fine-tuning

The downside is slower inference (~500ms for CLAP vs 80ms for the legacy MLP), which is why:

VAD gate fast-rejects silent audio before CLAP
High-confidence bypass (≥80%) enables instant detection
Detection cooldown prevents rapid-fire from same sound
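
A sketch of how these three mitigations might fit together is shown here. Only the 80% bypass figure comes from the text above; the class name DetectionGate, the energy floor, the cooldown length, and the idea that lower-confidence results need two consecutive hits are all assumptions made for illustration, since the text does not say what the bypass skips. The classify callable stands in for a CLAP inference.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square energy, used here as a crude voice-activity measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0


@dataclass
class DetectionGate:
    rms_floor: float = 0.01     # assumed energy threshold for the VAD gate
    bypass_at: float = 0.80     # from the text: >= 80% fires immediately
    confirm_at: float = 0.50    # hypothetical lower threshold
    confirm_hits: int = 2       # hypothetical consecutive hits required
    cooldown_s: float = 2.0     # hypothetical cooldown length
    _streak: int = 0
    _last_fire: float = -math.inf

    def process(self, samples: Sequence[float], now: float,
                classify: Callable[[Sequence[float]], float]) -> bool:
        if rms(samples) < self.rms_floor:
            self._streak = 0
            return False        # silence: the slow classifier is never called
        if now - self._last_fire < self.cooldown_s:
            return False        # still cooling down from the last detection
        probability = classify(samples)
        if probability >= self.bypass_at:
            return self._fire(now)
        if probability >= self.confirm_at:
            self._streak += 1
            if self._streak >= self.confirm_hits:
                return self._fire(now)
            return False
        self._streak = 0
        return False

    def _fire(self, now: float) -> bool:
        self._streak = 0
        self._last_fire = now
        return True


calls = []

def fake_classify(samples: Sequence[float]) -> float:
    calls.append(1)             # count how often the expensive model runs
    return 0.9

gate = DetectionGate()
results = [
    gate.process([0.0] * 100, now=0.0, classify=fake_classify),  # silent frame
    gate.process([0.5] * 100, now=0.1, classify=fake_classify),  # loud, confident
    gate.process([0.5] * 100, now=1.0, classify=fake_classify),  # inside cooldown
]
print(results, len(calls))
```

Ordering the checks from cheapest to most expensive means the classifier runs only once in this example, even though three frames were processed.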

Why Legacy MLP Mode?

For constrained hardware or faster inference, the legacy MLP model offers:

80ms inference interval (12.5 inferences/second)
Smaller memory footprint
Less accurate but faster

Why JSON Sidecars for Evidence?

For documentation purposes, metadata must be:

Human-readable (JSON, not binary)
Separate from audio (metadata cannot easily be embedded in WAV files)
Precise, including exact timestamps, probabilities, and device info
Machine-parseable for cataloging and fingerprinting
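
A sidecar that meets these requirements could be written as sketched here. The field names, the write_sidecar helper, and the example file name, device, and timestamp are hypothetical; the project's actual metadata schema may differ.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def write_sidecar(wav_path: Path, probability: float, device: str,
                  detected_at: Optional[datetime] = None) -> Path:
    """Write <clip>.json next to <clip>.wav and return the sidecar path."""
    detected_at = detected_at or datetime.now(timezone.utc)
    metadata = {
        "audio_file": wav_path.name,
        "detected_at": detected_at.isoformat(),  # ISO 8601 with UTC offset
        "bark_probability": round(probability, 4),
        "device": device,
    }
    sidecar = wav_path.with_suffix(".json")
    # indent=2 keeps the file readable in any text editor
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar


with tempfile.TemporaryDirectory() as tmp:
    wav = Path(tmp) / "bark_0001.wav"
    wav.write_bytes(b"")  # placeholder; a real clip would hold audio data
    sidecar_path = write_sidecar(
        wav, 0.9312, "USB Microphone",
        datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),  # example timestamp
    )
    loaded = json.loads(sidecar_path.read_text(encoding="utf-8"))
    print(loaded)
```

Sharing the clip's base name keeps each audio file and its metadata paired without a database lookup, and the round trip through json.loads shows the file stays machine-parseable.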

Known Issues & TODOs

Not Yet Implemented

Evidence Cleanup - Automatic old file removal (manual purge available via API)
Audio Spectrogram - Visual display in web UI

Potential Improvements

Home Assistant Integration - MQTT or REST
SMS/Push Notifications - Via Pushover/Twilio

Recently Implemented (v2.5.0)

Webhook Notifications - Configurable webhooks for bark alerts
Multi-Dog Fingerprinting - Identify individual dogs by bark signature
Bark Pattern Analysis - Clustering UI for analyzing bark patterns
Quiet Hours - Scheduled reduced sensitivity periods
Fingerprint Purge - Remove orphaned fingerprints without audio evidence
Notification Debouncing - Per-dog rate limiting via events/debouncer.py
Prometheus Metrics - Prometheus-format metrics at /api/metrics
API Authentication - Optional API key authentication
Rate Limiting - Per-endpoint rate limiting
Runtime Settings - Update settings via UI, persisted to config.yaml

Known Limitations

ALSA Volume Control - Microphone volume adjustment (volume_percent) uses ALSA and is Linux-specific; detection works on any OS with PyAudio
CPU Only - Inference runs on the CPU only; no GPU is required

Original Project

This is a fork/rewrite of the original woofalytics project. Key changes:

Aspect	Original	v2.5
Python	3.9+	3.11+
Detection	Custom MLP	CLAP zero-shot (+ legacy MLP)
False Positives	High	Multi-layer veto system
Web Framework	Basic HTTP	FastAPI
Config	Hardcoded	Pydantic v2
Microphone	Andrea only	Any USB mic
Real-time	Polling	WebSocket
Evidence	WAV only	WAV + JSON metadata
Deployment	Manual	Docker
Tests	None	pytest suite

Versioning

Version is tracked in the VERSION file at the repository root. See CHANGELOG.md for release history.

License

MIT License - See original project for attribution.

Contributing

Fork the repository
Create a feature branch
Run tests: pytest
Run linting: ruff check src/
Submit a pull request

Quick Reference

# Start the server
woofalytics

# List audio devices
woofalytics --list-devices

# Run with debug logging
woofalytics --log-level DEBUG

# Docker
docker-compose up -d

# Run tests
pytest

# Check API docs (open is the macOS command; use xdg-open on Linux)
open http://localhost:8000/api/docs
