Real-time anomaly detection platform for professional football. Transforms high-frequency telemetry (GPS, HR, accelerometry) into actionable coaching decisions using a shared-backbone LSTM autoencoder, regime-aware per-player calibration, and a multi-layer explainability suite, all engineered to maintain a < 200 ms inference SLA under distributed failure conditions.
- System Architecture
- Event Lifecycle
- Tech Stack
- File Map
- Quick Start
- Installation
- Configuration
- CLI Reference
- Data Schema
- ML Pipeline
- Explainability (XAI)
- Temporal State Compression
- Cache-Augmented Generation (Redis CAG)
- Reliability & Hardening
- Replay Consistency Guarantees
- Fairness & Recalibration
- Logging & Observability
- Exit Codes
- Known Limitations & Roadmap
- References
    Telemetry Stream (GPS/REST/WS/MQTT)
                 │
                 ▼
    ┌─────────────────────────┐
    │  Ingestion Layer        │  GPS NMEA · SportRadar REST · WebSocket · MQTT (QoS 1)
    └────────────┬────────────┘
                 │  RawPlayerObservation
                 ▼
    ┌─────────────────────────┐
    │  Pre-Accumulation       │◄── [Reject timestamp reversals]
    │  Temporal Guard         │◄── [Detect epoch discontinuities]
    └────────────┬────────────┘
                 │
                 ▼
    ┌─────────────────────────┐
    │  LiveWindowAccumulator  │  Per-player ring buffer; stride = window_size
    │  (24-event windows)     │  Emits one non-overlapping window per 24 events,
    └────────────┬────────────┘  reducing overlap-induced persistence amplification
                 │
                 ▼
    ┌─────────────────────────┐
    │  Post-Window TVL        │◄── [Physical plausibility validation]
    │  Semantic Validation    │  VALID · DEGRADED · INVALID
    └────────────┬────────────┘
                 │  List[dict] window
                 ▼
    ┌──────────────────────────────────────────────────────┐
    │  Pattern Analysis Engine                             │
    │  ┌────────────────────────────────────────────────┐  │
    │  │ SharedBackboneAutoencoder (LSTM + FiLM)        │  │
    │  │ · Shared encoder across all players            │  │
    │  │ · Per-player FiLM conditioning embeddings      │  │
    │  │ · Per-player normaliser (µ/σ per feature)      │  │
    │  └────────────────────────────────────────────────┘  │
    │  ┌────────────────────────────────────────────────┐  │
    │  │ RegimeAwareThresholdStore                      │  │
    │  │ 9 regimes: Territory(3) × Intensity(3)         │  │
    │  │ · Per-regime DynamicThresholdTracker           │  │
    │  │ · Fallback to global tracker if under-cal      │  │
    │  └────────────────────────────────────────────────┘  │
    │  ┌────────────────────────────────────────────────┐  │
    │  │ Auxiliary Detectors                            │  │
    │  │ · FatigueCurveAnalyzer (speed decay fit)       │  │
    │  │ · PositionalDriftAnalyzer (GPS centroid)       │  │
    │  │ · WorkloadTrendTracker (ACWR 0.8–1.5)          │  │
    │  └────────────────────────────────────────────────┘  │
    └────────────┬─────────────────────────────────────────┘
                 │  AnomalyResult
                 ▼
    ┌──────────────────────────────────────────────────────┐
    │  Explainability Suite (XAI)                          │
    │  · Temporal Feature Ablation (F+2 model calls)       │
    │  · SHAP KernelExplainer (if shap installed)          │
    │  · SemanticInterpreter (symbolic reasoning)          │
    │  · LLMNLGEngine (Qwen2.5:14b) → TemplateNLGEngine    │
    └────────────┬─────────────────────────────────────────┘
                 │  SemanticFindings + SHAP attributions
                 ▼
    ┌──────────────────────────────────────────────────────┐
    │  Redis CAG Layer                                     │ ◄── Cache-Augmented Generation
    │  · Per-player SHAP attribution cache                 │
    │  · SemanticFinding history (sorted sets, TTL-gated)  │
    │  · Augments SemanticInterpreter with cached context  │
    │    without re-running SHAP over past windows         │
    └────────────┬─────────────────────────────────────────┘
                 │  Augmented findings
                 ▼
    ┌──────────────────────────────────────────────────────┐
    │  Temporal State Compression                          │
    │  · Trajectory narrative builder                      │
    │  · Escalation summary encoder                        │
    │  · Episodic abstraction (episode_id-scoped)          │
    │  Compresses finding stream → structured LLM prompt   │
    └────────────┬─────────────────────────────────────────┘
                 │  Compressed state + SHAPExplanation
                 ▼
    ┌──────────────────────────────────────────────────────┐
    │  Alert FSM (AlertManager)                            │
    │  NONE → WARNING → SUSTAINED → CRITICAL               │
    │  HOLD (telemetry blackout)                           │
    │  SAFE_MODE (system-wide scientific invalidation)     │
    └────────────┬─────────────────────────────────────────┘
                 │  Recommendation + NDJSON alert
                 ▼
      Coach Dashboard / stdout
                 │
                 ▼
    ┌──────────────────────────────────────────────────────┐
    │  Feedback & Recalibration Loop                       │
    │  · Coach override logging (OverrideRecord)           │
    │  · FairnessMonitor (position · age_group · nation.)  │
    │  · RecalibrationPipeline (7-day cadence)             │
    │  · MutationJournal (versioned threshold audit)       │
    └──────────────────────────────────────────────────────┘
A single telemetry event passes through ten distinct processing stages before reaching the coach. This diagram provides the mental model for navigating the codebase.
    Raw Telemetry Event (GPS · HR · accelerometry)
                │
                │  player_external_id, ts, speed_ms, heart_rate_bpm, …
                ▼
    ┌───────────────────────┐
    │ 1. Validity Gate      │  Pre-accumulation timestamp guard
    │    (TVL)              │  Epoch discontinuity → buffer reset
    └───────────┬───────────┘  INVALID → dropped | DEGRADED → flagged
                │
                ▼
    ┌───────────────────────┐
    │ 2. Sequence Window    │  LiveWindowAccumulator ring buffer
    │    (24-event stride)  │  Emits one window per 24 raw packets
    └───────────┬───────────┘  Post-window plausibility re-check (TVL)
                │
                ▼
    ┌───────────────────────┐
    │ 3. Shared Model       │  SharedBackboneAutoencoder
    │    (LSTM + FiLM)      │  Regime-routed threshold comparison
    └───────────┬───────────┘  EMA-smoothed anomaly score
                │
                ▼
    ┌───────────────────────┐
    │ 4. Attribution        │  Temporal Feature Ablation (F+2 calls)
    │    (SHAP / Ablation)  │  SHAP KernelExplainer when available
    └───────────┬───────────┘  Magnitude-proxy fallback (shap_compat)
                │
                ▼
    ┌───────────────────────┐
    │ 5. Semantic Findings  │  SemanticInterpreter
    │                       │  SHAP weights → typed SemanticFinding
    └───────────┬───────────┘  Domains: cardiovascular · locomotor ·
                │              workload · tactical · persistence
                ▼
    ┌───────────────────────┐
    │ 6. Redis CAG          │  Augment current findings with
    │    (Context Cache)    │  cached SHAP history + prior findings
    └───────────┬───────────┘  Deterministic, zero-retrieval-latency
                │              per-player longitudinal context
                ▼
    ┌───────────────────────┐
    │ 7. State Compression  │  MatchStateManager
    │                       │  Finding stream → trajectory narrative
    └───────────┬───────────┘  Motif detection · escalation summary ·
                │              episodic abstraction (episode_id-scoped)
                ▼
    ┌───────────────────────┐
    │ 8. Policy Engine      │  AlertManager FSM
    │    (Alert FSM)        │  Hysteresis · cooldown · Safe Mode
    └───────────┬───────────┘  Recommendation priority ladder
                │
                ▼
    ┌───────────────────────┐
    │ 9. NLG Layer          │  LLMNLGEngine (Qwen2.5:14b, async)
    │                       │  Receives compressed state only,
    └───────────┬───────────┘  not raw telemetry or full history
                │              TemplateNLGEngine fallback (<1 ms)
                ▼
    ┌───────────────────────┐
    │ 10. Coach Dashboard   │  NDJSON alert → stdout
    │                       │  nlg_summary · top_features ·
    └───────────────────────┘  counterfactual · latency_ms
SLA boundary: The 200 ms clock runs from stage 2 (window emission) through stage 8 (alert FSM output). Stages 9–10 are asynchronous and off the SLA clock.
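Stage 2 can be pictured as a per-player buffer that emits a window once 24 events have arrived and then starts over, so consecutive windows never share events. The following is a minimal sketch under that reading, assuming events are dicts keyed by `player_external_id`; the class name `WindowAccumulatorSketch` and its `push` / `reset` methods are hypothetical and are not the `LiveWindowAccumulator` API.

```python
from collections import defaultdict, deque
from typing import Optional


class WindowAccumulatorSketch:
    """Hypothetical stand-in for LiveWindowAccumulator (illustration only)."""

    def __init__(self, window_size: int = 24) -> None:
        self.window_size = window_size
        # One bounded buffer per player, created lazily on first event.
        self._buffers = defaultdict(lambda: deque(maxlen=window_size))

    def push(self, event: dict) -> Optional[list]:
        """Add one event; return a full window, or None if still filling."""
        buf = self._buffers[event["player_external_id"]]
        buf.append(event)
        if len(buf) < self.window_size:
            return None
        window = list(buf)
        # stride == window_size: clear the buffer so windows do not overlap.
        buf.clear()
        return window

    def reset(self, player_external_id: str) -> None:
        """Drop buffered events, e.g. after an epoch discontinuity."""
        self._buffers[player_external_id].clear()
```

Because the buffer is cleared on emission, each raw event contributes to exactly one inference window, which is what keeps a single bad reading from persisting across several overlapping windows.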
| Library / Model | Version | Role |
|---|---|---|
| PyTorch (`torch`, `torch.nn`, `torch.optim`, `DataLoader`) | ≥ 2.0 | Shared LSTM backbone, Transformer AE, batch training, checkpoint serialisation |
| scikit-learn | ≥ 1.3 | ROC-AUC / PR-AUC / precision@k evaluation; KMeans background summarisation for SHAP |
| SciPy (`stats.zscore`, `optimize.curve_fit`, `integrate.trapezoid`) | ≥ 1.11 | Z-score baselines, exponential fatigue curve fitting, trapezoid distance integration |
| SHAP (`KernelExplainer`, `shap.kmeans`) | ≥ 0.42 | Feature-level attribution (graceful magnitude-proxy fallback when unavailable) |
| Qwen2.5:14b via Ollama | Local HTTP | LLM NLG coaching summaries; configurable timeout, deterministic template fallback |
| Library | Role |
|---|---|
| NumPy | Sequence windowing, proxy SHAP computation, batch array ops |
| Pandas | CSV I/O, timestamp parsing/coercion, rolling baseline aggregation |
| Library | Protocol | Role |
|---|---|---|
| aiohttp | HTTP / REST | SportRadar / Opta API polling adapter; exponential-backoff retry |
| websockets | WebSocket | Live match event stream adapter |
| asyncio-mqtt (`aiomqtt`) | MQTT (QoS 1) | Wearable sensor bridge (HR, accelerometry) |
| pynmea2 | NMEA 0183 | GPS sentence parsing from serial port or TCP/gpsd |
| asyncio | n/a | Single-event-loop async I/O for all ingestion adapters |
| Component | Notes |
|---|---|
| PostgreSQL | Primary store; psycopg2 for sync ORM, asyncpg for async paths |
| SQLAlchemy | ORM models β Player, Session, PlayerEvent, audit logs |
| Redis | CAG backing store; per-player SHAP attribution cache and SemanticFinding history (sorted sets, TTL-gated); deterministic context augmentation without retrieval latency |
| Module | Usage |
|---|---|
| `argparse` | Five-subcommand CLI with typed arguments and defaults |
| `logging` | Structured logging; JSON formatter enabled by `JSON_LOGS=1` |
| `hashlib` | Event fingerprinting for exactly-once semantics |
| `threading` | Lock for `HardenedRollingThresholdStore` thread safety |
| `collections.deque` | `LiveWindowAccumulator` per-player ring buffers |
| `dataclasses` | All domain objects (`AnomalyResult`, `SHAPExplanation`, `WindowRegime`, etc.) |
| `time.monotonic` | Alert cooldown gate; SLA latency measurement |
| Library | Behaviour when absent |
|---|---|
| `shap` | Falls back to `shap_compat.py` magnitude-proxy attribution |
| `torch` | Stub mode: no inference, pipeline still importable |
| `sklearn` | ROC-AUC / PR-AUC disabled; `evaluate` exits 3 |
| `tqdm` | Progress bars replaced with `logger.info()` calls |
| `pynmea2` | GPS serial/TCP adapter disabled; REST + WS still work |
| `aiohttp` | REST polling adapter disabled |
| `redis` | CAG disabled; `SemanticInterpreter` operates without cached history |
| File | Class / Entry Point | Responsibility |
|---|---|---|
| `main.py` | `main()` | Production CLI entrypoint (generate · train · evaluate · serve · audit) |
| `analysis/orchestrator.py` | `PlayersDataAnalysisPipeline` | Wires ingestion → TVL → ML → XAI → CAG → compression → FSM → feedback; match lifecycle |
| `analysis/anomaly_detection.py` | `SharedBackboneAutoencoder`, `PatternAnalysisEngine` | LSTM AE training + inference, threshold calibration, positional drift |
| `analysis/baseline.py` | `BaselineBuilder`, `PlayerBaselineProfile` | 28-day rolling baselines, fatigue curve fitting, ACWR tracking |
| `analysis/regime.py` | `SessionRegimeClassifier`, `RegimeAwareThresholdStore` | 9-regime (Territory × Intensity) window classification and threshold routing |
| `analysis/match_state.py` | `MatchStateManager`, `SemanticMatchState` | Longitudinal match memory, motif detection, trend reasoning, state compression |
| `analysis/live_window_accumulator.py` | `LiveWindowAccumulator` | Per-player ring buffer; emits fixed-stride inference windows |
| `analysis/telemetry_validity.py` | `TelemetryValidityLayer` | Physical plausibility gate (VALID / DEGRADED / INVALID); replay-aware timestamp validation |
| File | Class | Responsibility |
|---|---|---|
| `explainability/xai_layer.py` | `XAILayer`, `LLMNLGEngine`, `TemplateNLGEngine` | Temporal feature ablation, SHAP routing, Qwen2.5:14b NLG |
| `explainability/semantics_layer.py` | `SemanticInterpreter` | Symbolic physiological reasoning: cardiovascular, locomotor, workload, tactical |
| `explainability/shap_compat.py` | `compute_shap_values`, `build_kmeans_background` | SHAP with magnitude-proxy fallback; background deduplication guard |
| `explainability/episodic_context.py` | `TemporalContextCompressor`, `CompressedTemporalContext`, `PlayerEpisode`, `TacticalEpisode` | Compresses SemanticFinding streams into trajectory narratives, escalation summaries, and episodic abstractions before LLM conditioning |
| File | Class | Responsibility |
|---|---|---|
| `cag/redis_client.py` | `RedisCheckpointStore`, `EpisodeStore` | Per-player SHAP attribution cache and SemanticFinding history; sorted-set TTL management |
| `cag/redis_client.py` | `RedisPubSubClient`, `RedisConnectionPool` | Pub/sub event streaming; connection pool management |
| File | Class | Responsibility |
|---|---|---|
| `utils/reliability/invariants.py` | `SystemInvariantGuard` | Machine-enforced system invariants; triggers graded Safe Mode |
| `utils/reliability/safe_mode.py` | `SafeModeController` | Four-level degradation: NORMAL → LEVEL_1 → LEVEL_2 → LEVEL_3 |
| `utils/reliability/determinism.py` | `MutationJournal`, `TemporalCausalityGuard` | Versioned calibration log, strict event-time monotonicity |
| `utils/reliability/calibration_store.py` | `HardenedRollingThresholdStore` | Quarantine buffers, drift monitoring, thread-safe threshold store |
| `utils/reliability/adaptation_engine.py` | `DeterministicCalibrationManager` | Crash-safe, versioned calibration updates |
| `utils/reliability/queue_manager.py` | `BoundedPriorityQueue` | Priority-aware backpressure; sheds LLM tasks before SHAP before inference |
| File | Class | Responsibility |
|---|---|---|
| `ingestion/pipeline.py` | `GPSIngestionAdapter`, `SportRadarAPIAdapter`, `IngestionPipeline` | NMEA/TCP GPS, REST polling, WebSocket events, MQTT sensor bridge |
| `config/settings.py` | `PlayersDataConfig` | All configuration via environment variables and typed dataclasses |
| `config/ollama_client.py` | `OllamaClient` | Async HTTP wrapper for Qwen2.5:14b; response caching, timeout guard |
| `utils/schema.py` | ORM models | SQLAlchemy models for Player, Session, PlayerEvent, audit log |
| `utils/ema.py` | `EMASmoother` | Exponential moving average for anomaly score smoothing (α = 0.25) |
| `utils/alert_manager.py` | `AlertManager` | Deterministic FSM with hysteresis, cooldown gate, Safe Mode propagation |
| `utils/evaluation/episodes.py` | `extract_episodes`, `match_episodes` | Binary → episode conversion; TP/FP/FN at episode level |
| File | Responsibility |
|---|---|
| `data/data_generator.py` | v4 Decision-Agent synthetic data simulator; realistic anomaly seeding |
# 1. Generate 2 seasons of synthetic training data
python main.py generate --seasons 2 --matchdays 38
# 2. Train shared backbone + calibrate per-player thresholds
python main.py train --sessions-per-player 60
# 3. Evaluate against ground truth labels (CI gate: AUC >= 0.70)
python main.py evaluate --out metrics/eval.json --min-auc 0.70
# 4. Stream live inference (NDJSON in -> NDJSON alerts out)
cat live_events.jsonl | python main.py serve
# 5. Replay historical data (interleaved multi-session streams)
cat historical_events.jsonl | python main.py serve --replay-mode
# 6. Run fairness audit + recalibration check
python main.py audit --log logs/inference_log.jsonl

# Core ML & data
pip install torch scikit-learn numpy pandas scipy shap
# Ingestion adapters
pip install aiohttp websockets asyncio-mqtt pynmea2
# Database drivers
pip install sqlalchemy psycopg2-binary asyncpg
# CAG backing store
pip install redis
# Optional: progress bars
pip install tqdm
# LLM backend: install Ollama separately, then pull the model
# https://ollama.com
ollama pull qwen2.5:14b

Python 3.10+ required. PyTorch CPU is sufficient for inference; GPU is recommended for training large squads.
All configuration is driven by environment variables and typed dataclasses in config/settings.py. The singleton CONFIG = PlayersDataConfig() is imported throughout the codebase.
| Variable | Default | Description |
|---|---|---|
| `DB_HOST` | `localhost` | PostgreSQL host |
| `DB_PORT` | `5432` | PostgreSQL port |
| `DB_NAME` | `players_data` | Database name |
| `DB_USER` | `postgres` | Database user |
| `DB_PASSWORD` | (empty) | Database password |
| `REDIS_HOST` | `localhost` | Redis host for CAG store |
| `REDIS_PORT` | `6379` | Redis port |
| `REDIS_CAG_TTL_S` | `3600` | TTL for cached SHAP and SemanticFinding entries (seconds) |
| `GPS_SERIAL_PORT` | `/dev/ttyUSB0` | Serial port for NMEA GPS |
| `GPS_TCP_HOST` | `None` | TCP host for gpsd / NMEA-over-TCP |
| `GPS_TCP_PORT` | `2947` | TCP port for gpsd |
| `SPORTRADAR_API_KEY` | (empty) | SportRadar API key |
| `LIVE_WS_URL` | `ws://localhost:8765` | Live match event WebSocket URL |
| `MQTT_BROKER` | `localhost` | MQTT broker host |
| `JSON_LOGS` | `0` | Set to `1` for structured JSON log output to stderr |
| `OLLAMA_NLG_TIMEOUT_S` | `30.0` | Timeout for Qwen2.5:14b async NLG calls (off SLA clock) |
| Dataclass | Field | Default | Notes |
|---|---|---|---|
| `SequenceWindowConfig` | `window_seconds` | `120` | Rolling window length |
| `SequenceWindowConfig` | `step_seconds` | `15` | Must match `DT_OUT` in data generator |
| `LSTMAutoencoderConfig` | `hidden_size` | `64` | LSTM hidden units |
| `LSTMAutoencoderConfig` | `latent_dim` | `16` | Bottleneck dimension |
| `LSTMAutoencoderConfig` | `max_epochs` | `250` | With `patience=20` early stopping |
| `AnomalyScoringConfig` | `mad_multiplier` | `5.0` | MAD multiplier for small calibration sets (<150 windows) |
| `AnomalyScoringConfig` | `threshold_quantile` | `0.995` | Quantile for large calibration sets (>=150 windows) |
| `AnomalyScoringConfig` | `score_ema_alpha` | `0.25` | EMA smoothing factor for anomaly scores |
| `SHAPConfig` | `n_background_samples` | `30` | Background samples for feature ablation |
| `CompressionConfig` | `max_findings_per_episode` | `12` | Finding cap before episodic abstraction triggers |
| `CompressionConfig` | `trajectory_window_steps` | `5` | Window count for trajectory narrative construction |
| `FeedbackConfig` | `recalibration_cadence_days` | `7` | Scheduled recalibration interval |
| `FairnessConfig` | `flag_rate_disparity_threshold` | `0.15` | Max allowed flag-rate gap between groups |
All commands log to stderr and output machine-readable JSON to stdout.
python main.py generate [OPTIONS]
Options:
--data-dir PATH Output directory for CSVs [default: data]
--seasons INT Number of seasons to simulate [default: 2]
--matchdays INT Matchdays per season [default: 38]
--anomaly-rate FLOAT Fraction of sessions with seeded anomalies [default: 0.05]
--no-corruption Skip sensor corruption layer (cleaner, faster)
--quiet Suppress per-position summary table
--log-level LEVEL DEBUG | INFO | WARNING | ERROR [default: INFO]

Output: Five CSVs written to --data-dir: players.csv, sessions.csv, events.csv, annotations.csv, ground_truth_labels.csv.
Exits 1 if validation fails (zero anomalies seeded, missing columns, empty events table).
python main.py train [OPTIONS]
Options:
--data-dir PATH CSV source directory [default: data]
--model-dir PATH Checkpoint output directory [default: models]
--sessions-per-player INT Most-recent N sessions per player [default: 60]
--log-level LEVEL [default: INFO]

Writes: models/shared_backbone.pt, models/train_summary.json, models/serve_state.json.
serve_state.json contains serialised per-player baselines and calibrated threshold distributions so serve can cold-start without retraining.
Exits 2 if training produces a degenerate model or the checkpoint is missing.
python main.py evaluate [OPTIONS]
Options:
--data-dir PATH CSV source directory [default: data]
--model-dir PATH Checkpoint directory [default: models]
--out PATH Metrics output (JSON) [default: metrics/eval.json]
--min-auc FLOAT CI gate: exit 3 if mean ROC-AUC below [default: 0.60]
--log-level LEVEL [default: INFO]

Metrics computed per player: ROC-AUC, PR-AUC, precision@k, FP-per-90-min, TP/FP/FN/TN. Aggregated as micro (global TP/FP sums) and macro (per-player mean).
Exits 3 if mean ROC-AUC < --min-auc or no players produced evaluable windows.
Reads newline-delimited JSON events from stdin. Emits NDJSON alerts to stdout. Writes a full inference log (including non-alert windows) to logs/inference_log.jsonl.
python main.py serve [OPTIONS]
Options:
--model-dir PATH Checkpoint directory [default: models]
--min-alert-windows INT Consecutive anomalous windows before alert [default: 3]
--max-latency-ms INT SLA threshold; violations logged as WARNING [default: 200]
--ignore-time-gaps Disable time-gap buffer resets (use for batch replay)
--ignore-session-boundaries Disable session-boundary resets (use for interleaved replay)
--replay-mode Replay-safe mode; implies --ignore-time-gaps and
--ignore-session-boundaries. Also relaxes TVL timestamp
validation: reversals and large gaps produce DEGRADED
(not INVALID) so inference is not silently dropped.
--log-level LEVEL [default: INFO]

SLA model: The 200 ms SLA covers inference only (LSTM forward pass + threshold comparison + state compression). LLM NLG generation runs asynchronously off the SLA clock via a thread pool with a 30 s timeout. Two latency figures are observable:
| Metric | What it covers | Where it appears |
|---|---|---|
| `latency_ms` in alert payload | Inference + compression (T1) | stdout NDJSON, inference log |
| Ollama call duration | Async NLG completion (T2) | Slow Ollama call WARNING in stderr |
Input event fields (NDJSON, one event per line):
| Field | Type | Required | Notes |
|---|---|---|---|
| `player_external_id` | str | Yes | Must match a registered player |
| `ts` | str (ISO 8601) | Yes | UTC timestamp |
| `match_id` / `session_id` | str | No | Used for session-boundary detection |
| `speed_ms` | float | Yes | Instantaneous speed in m/s |
| `heart_rate_bpm` | int | Yes | BPM |
| `x_pitch` | float | No | Normalised pitch X [0, 100] |
| `y_pitch` | float | No | Normalised pitch Y [0, 100] |
| `distance_delta_m` | float | No | Distance covered since last tick |
| `is_sprint` | bool | No | True if speed >= 7.0 m/s |
| `elapsed_seconds` | float | No | Seconds into session (used for fatigue enrichment) |
Alert output payload (NDJSON to stdout on alert):
{
"player_id": 7,
"external_id": "p007",
"recommendation_type": "substitute",
"confidence": 0.923,
"anomaly_score": 0.418,
"fatigue_flag": true,
"drift_flag": false,
"workload_flag": false,
"workload_status": "normal",
"nlg_summary": "Muller shows 28% speed drop and elevated HR non-recovery...",
"counterfactual": "Alert would clear if speed_ms increased by 1.2 m/s.",
"top_features": [
{"feature": "hr_recovery", "shap": 0.142, "value": -0.31, "label": "HR not recovering"},
{"feature": "speed_ms", "shap": 0.097, "value": 3.1, "label": "Below normal speed"}
],
"latency_ms": 47.3,
"ts": "2025-09-14T19:42:11Z",
"gate_windows": 4
}

Recommendation priority ladder (at most one per inference cycle):
| Priority | `recommendation_type` | Trigger condition |
|---|---|---|
| 1 | `substitute` | Recurrent cross-match pattern + sustained persistence (≥ 4 windows) + high/critical escalation |
| 2 | `recovery_intervention` | Cardiovascular or recovery degradation finding, sustained (≥ 3 windows), high/critical severity |
| 3 | `workload_restriction` | Fatigue accumulation finding OR ACWR ≥ 1.30, sustained ≥ 2 windows |
| 4 | `tactical_adjustment` | Tactical instability finding, any severity |
| 5 | `performance_monitor` | Locomotor overload finding with worsening trend |
| 6 | `anomaly_flag` | Default fallback; no specific rule matched |
python main.py audit [OPTIONS]
Options:
--log PATH Inference log path (NDJSON or JSON array) [default: logs/inference_log.jsonl]
--data-dir PATH CSV directory (for player metadata) [default: data]
--out PATH Audit report output (JSON) [default: metrics/audit.json]
--log-level LEVEL [default: INFO]

Checks for flag-rate disparity across three protected attributes: position, age_group, nationality. Triggers RecalibrationPipeline if >= 10 override records are present in the log.
Exits 5 if bias is detected in any protected group (flag-rate disparity > fairness.flag_rate_disparity_threshold).
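The disparity check can be sketched as a per-group flag rate followed by a max-minus-min gap compared against the threshold. This is an illustrative sketch only: the record field `flagged`, the function names, and the choice of max-minus-min as the gap are assumptions, not the `FairnessMonitor` implementation.

```python
from collections import defaultdict


def flag_rate_disparity(records, attribute):
    """Return (gap, rates): per-group flag rates and the max-min gap.

    Assumes each record is a dict with the protected attribute and a
    boolean 'flagged' field (hypothetical field name).
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for rec in records:
        group = rec[attribute]
        totals[group] += 1
        if rec["flagged"]:
            flagged[group] += 1
    rates = {g: flagged[g] / totals[g] for g in totals}
    gap = max(rates.values()) - min(rates.values()) if rates else 0.0
    return gap, rates


def audit_exit_code(records,
                    attributes=("position", "age_group", "nationality"),
                    threshold=0.15):
    """Return 5 if any attribute exceeds the disparity threshold, else 0."""
    for attribute in attributes:
        gap, _ = flag_rate_disparity(records, attribute)
        if gap > threshold:
            return 5
    return 0
```

With the default threshold of 0.15, a group flagged in 50% of its windows next to a group flagged in 30% would fail the audit, while 50% against 40% would pass.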
Five CSVs are produced by generate and consumed by train / evaluate:
| File | Key columns |
|---|---|
| `players.csv` | player_id, external_id, full_name, position, age, age_group, nationality |
| `sessions.csv` | session_id, player_id, started_at, ended_at |
| `events.csv` | session_id, ts, speed_ms, heart_rate_bpm, x_pitch, y_pitch, distance_delta_m, is_sprint, elapsed_seconds |
| `annotations.csv` | session_id, annotated_at, annotation_type, note |
| `ground_truth_labels.csv` | session_id, is_anomaly |
Eight features extracted per 15-second tick, forming 8-step (120 s) windows:
| Index | Name | Description |
|---|---|---|
| 0 | `speed_ms` | Instantaneous speed (m/s) |
| 1 | `accel` | Acceleration (m/s²), clamped ±10 |
| 2 | `heart_rate_bpm` | HR (BPM) |
| 3 | `sprint_flag` | Binary; 1 if speed >= 7.0 m/s |
| 4 | `x_pitch` | Normalised pitch X [0, 100] |
| 5 | `y_pitch` | Normalised pitch Y [0, 100] |
| 6 | `distance_delta` | Euclidean displacement since last tick (m) |
| 7 | `hr_recovery` | Fractional HR change per tick, clipped [-1, 1] |
- Architecture: Shared LSTM encoder → FiLM (Feature-wise Linear Modulation) per-player conditioning embedding → bottleneck (latent dim 16) → LSTM decoder.
- Training: All registered players are trained jointly. Per-player embeddings are learned alongside shared weights. Per-player µ/σ normalisers are applied before encoding.
- Calibration split: 80% training, 20% held-out calibration per player. For large calibration sets (>=150 windows): `quantile(losses, 0.995)`. For small sets (<150 windows): `median + 5.0 × MAD × 1.4826`.
- Threshold routing: At inference, `SessionRegimeClassifier` labels the window (Territory × Intensity → 9 possible keys). The corresponding regime tracker is used, falling back to the global tracker when a regime has <5 calibration samples.
- Score smoothing: EMA with α=0.25 applied to per-window reconstruction losses before threshold comparison.
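The calibration rule and the score smoothing can be written down in a few lines. The sketch below uses only the standard library and assumes linear interpolation for the quantile and first-value initialisation for the EMA; neither detail is stated here, and the function names `calibrate_threshold` and `ema_smooth` are hypothetical.

```python
import statistics


def calibrate_threshold(losses, quantile=0.995, mad_multiplier=5.0,
                        min_large=150):
    """Pick an anomaly threshold from held-out reconstruction losses."""
    if not losses:
        raise ValueError("need at least one calibration loss")
    if len(losses) >= min_large:
        # Large set: empirical quantile (linear interpolation, an assumption).
        ordered = sorted(losses)
        pos = quantile * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])
    # Small set: median + k * MAD * 1.4826 (1.4826 scales MAD to a
    # standard-deviation estimate under normality).
    med = statistics.median(losses)
    mad = statistics.median([abs(x - med) for x in losses])
    return med + mad_multiplier * mad * 1.4826


def ema_smooth(scores, alpha=0.25):
    """Exponentially smooth scores; the first value seeds the average."""
    smoothed = []
    prev = None
    for score in scores:
        prev = score if prev is None else alpha * score + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed
```

With α = 0.25, a single spike from 1.0 to 5.0 moves the smoothed score only to 2.0, so one noisy window is less likely to cross the threshold on its own.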
Every 120-second window is classified on two axes:
| Axis | Class | Criterion |
|---|---|---|
| Territory | defensive | mean x_pitch < 33 |
| Territory | midfield | 33 <= mean x_pitch <= 67 |
| Territory | attacking | mean x_pitch > 67 |
| Intensity | high | sprint fraction >= 15% of window steps |
| Intensity | medium | 4% <= sprint fraction < 15% |
| Intensity | low | sprint fraction < 4% |
Each regime maintains its own DynamicThresholdTracker. This distinguishes "normal high-intensity pressing" from "abnormal physiological distress" during the same match phase.
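As a rough illustration of the two-axis classification and the fallback to the global tracker, the sketch below applies the criteria from the table to plain lists. The function names, the tuple-shaped regime key, and the dictionary-based threshold lookup are assumptions for illustration, not the `SessionRegimeClassifier` or `RegimeAwareThresholdStore` interfaces.

```python
def classify_regime(x_pitch, sprint_flags):
    """Label a window as (territory, intensity) using the table's criteria."""
    if not x_pitch or not sprint_flags:
        raise ValueError("window must contain at least one step")
    mean_x = sum(x_pitch) / len(x_pitch)
    if mean_x < 33:
        territory = "defensive"
    elif mean_x > 67:
        territory = "attacking"
    else:
        territory = "midfield"
    sprint_fraction = sum(sprint_flags) / len(sprint_flags)
    if sprint_fraction >= 0.15:
        intensity = "high"
    elif sprint_fraction >= 0.04:
        intensity = "medium"
    else:
        intensity = "low"
    return territory, intensity


def route_threshold(regime, regime_thresholds, sample_counts,
                    global_threshold, min_samples=5):
    """Use the regime threshold unless it has too few calibration samples."""
    if sample_counts.get(regime, 0) < min_samples:
        return global_threshold
    return regime_thresholds[regime]
```

In an 8-step window a single sprint step is 12.5% of the window, which lands in the medium band; two sprint steps (25%) push it to high.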
Disabled in production (CONFIG.active_model = "lstm"). Pre-LN transformer encoder with sinusoidal positional encoding and validity-weighted pooling in the bottleneck. Requires >=30 sessions per player. Enable via CONFIG.active_model = "transformer".
Fatigue Curve Comparator: Fits speed(t) = β·exp(−α·t) to each player's historical speed-vs-elapsed-time data (via scipy.optimize.curve_fit). Flags when the live speed residual falls more than one personal standard deviation below the expected curve, coinciding with a model anomaly.
Positional Drift Analyzer: Computes historical GPS centroid (avg_x, avg_y) and spread (position_std_radius). Flags when the player's recent median position deviates beyond positional.zone_radius_meters (default 5.0 m) for more than positional.drift_fraction_threshold (30%) of window ticks.
Workload Trend Tracker (ACWR): Tracks the Acute-to-Chronic Workload Ratio (7-day / 28-day rolling distance). Flags when ACWR falls outside [0.8, 1.5], the established safe training load band.
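The ACWR check reduces to a ratio of two rolling averages. The sketch below assumes one distance total per day, ordered oldest to newest, and compares daily means over the last 7 and last 28 days; whether the tracker uses means or sums, and the names `acwr` and `workload_flag`, are assumptions.

```python
def acwr(daily_distance_m):
    """Acute-to-Chronic Workload Ratio from daily distances (oldest first)."""
    if len(daily_distance_m) < 28:
        raise ValueError("need at least 28 days of history")
    acute = sum(daily_distance_m[-7:]) / 7      # mean of the last 7 days
    chronic = sum(daily_distance_m[-28:]) / 28  # mean of the last 28 days
    if chronic == 0:
        raise ValueError("chronic load is zero; ratio undefined")
    return acute / chronic


def workload_flag(ratio, low=0.8, high=1.5):
    """True when the ratio falls outside the [low, high] band."""
    return ratio < low or ratio > high
```

For example, three weeks at 5,000 m per day followed by one week at 10,000 m per day gives an acute mean of 10,000 m against a chronic mean of 6,250 m, a ratio of 1.6, which falls above the band and is flagged.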
The XAI pipeline has four sequential layers with a strict separation of concerns:
Temporal Feature Ablation -> SemanticInterpreter -> MatchStateManager -> LLMNLGEngine
(attribution only) (symbolic findings) (longitudinal memory) (narration only)
Runs F + 2 = 10 model calls per inference window (one per feature zeroed out, plus baseline and full-feature). Provides channel-level attribution within the 200 ms SLA (~30–50 ms on CPU). Used as the primary attribution method in production.
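The call count follows directly from the procedure: one call on the full window, one on an all-ablated baseline, and one per feature. The sketch below assumes the model is exposed as a scoring function over a window of feature rows, that features are normalised so that 0.0 is a neutral fill value, and that attribution is the drop in score when a feature is removed; `ablation_attributions` and `score_fn` are hypothetical names.

```python
def ablation_attributions(score_fn, window, fill=0.0):
    """Attribute an anomaly score to features by zeroing one at a time.

    score_fn: callable taking a window (list of feature rows) and
              returning a float anomaly score.
    Returns (full_score, baseline_score, per_feature_attributions).
    """
    n_features = len(window[0])
    full = score_fn(window)                                   # call 1
    baseline = score_fn([[fill] * n_features for _ in window])  # call 2
    attributions = []
    for f in range(n_features):                               # calls 3..F+2
        ablated = [
            [fill if j == f else value for j, value in enumerate(row)]
            for row in window
        ]
        # Positive value: removing the feature lowers the score,
        # so the feature was pushing the window towards "anomalous".
        attributions.append(full - score_fn(ablated))
    return full, baseline, attributions
```

With eight features this makes exactly ten calls, which is why the cost stays fixed and predictable regardless of how anomalous the window is.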
shap.KernelExplainer is used when the shap library is installed and background matrix dimensions match the feature vector. The shap_compat.py magnitude-proxy fallback is used otherwise, preserving the explanation interface.
Converts raw SHAP attributions into typed SemanticFinding objects across five domains:
| Domain | Features monitored |
|---|---|
cardiovascular_load |
heart_rate_bpm, hr_recovery_time_s |
locomotor_load |
speed_ms, distance_delta, sprint_flag, z-scores |
workload_balance |
ACWR, fatigue accumulation metrics |
tactical |
x_pitch, y_pitch, positional drift |
persistence |
Longitudinal recurrence patterns |
The SemanticInterpreter is augmented by the Redis CAG layer (see Cache-Augmented Generation): before classifying current-window attributions, the interpreter retrieves cached SHAP results and prior SemanticFinding objects for the player, enabling trend-aware symbolic reasoning without recomputing past windows. The LLM receives SemanticFinding objects and acts as narrator only; physiological reasoning lives in this symbolic layer, not in the prompt.
Accumulates SemanticFinding objects over the full match timeline. Provides motif detection (repeated finding patterns within a session) and trend reasoning (increasing/decreasing severity over time). build_semantic_summary() feeds the state compression layer, which condenses the finding stream before LLM conditioning.
LLMNLGEngine calls qwen2.5:14b via Ollama asynchronously (off the SLA clock) with an OLLAMA_NLG_TIMEOUT_S timeout (default 30 s). The LLM receives a compressed state representation from the TemporalContextCompressor (not raw telemetry or the full finding stream), which keeps prompt entropy low and physiological reasoning in the symbolic layer. On timeout or connection failure, TemplateNLGEngine provides a deterministic, sub-millisecond fallback.
NLG summary guarantee: Every emitted alert carries a non-empty nlg_summary. Alerts where SHAP is on cooldown (60 s XAI cooldown between full SHAP runs per player) receive an immediate template summary backed by cached attribution context from Redis. Alerts where SHAP runs receive the richer LLM-backed summary via the async worker.
Naively feeding the LLM a full stream of SemanticFinding objects accumulates four compounding problems as a match progresses:
- Prompt entropy: unrelated findings from different match phases dilute the signal relevant to the current alert.
- Repeated findings: the same physiological pattern (e.g., `hr_recovery` below baseline) may appear in every window of a sustained episode, adding tokens without adding information.
- Temporal redundancy: findings from 70 minutes ago carry little diagnostic weight for a substitution decision at 85 minutes.
- Alert flooding: without compression, the LLM receives the same escalation narrative on every window of a sustained episode, producing near-identical summaries.
TemporalContextCompressor (in explainability/episodic_context.py) operates in three stages after MatchStateManager has accumulated findings for the current episode:
1. Trajectory Narrative
Constructs a structured summary of the player's physiological trajectory over the last compression.trajectory_window_steps (default 5) inference windows. Each named domain (cardiovascular_load, locomotor_load, etc.) is represented by its direction vector (stable / worsening / recovering) and peak severity, not by individual finding instances. This reduces a 5-window finding sequence to a single structured object per domain.
cardiovascular_load: worsening (peak severity: HIGH, onset: window -3)
locomotor_load: stable (severity: MEDIUM)
tactical: recovering (drift cleared at window -1)
2. Escalation Summary
Encodes the Alert FSM trajectory for the current episode as a compact descriptor:
`NONE → WARNING (w=2) → SUSTAINED (w=4) → gate_windows=6`. This gives the LLM the full escalation arc in a single token-efficient string, replacing per-window FSM state repetition.
3. Episodic Abstraction
When compression.max_findings_per_episode (default 12) is exceeded within a single episode_id, older findings are collapsed into a typed episode header: [EPISODE_START: cardiovascular+locomotor, onset 00:74:12, initial_confidence 0.81]. Only findings from the most recent 3 windows are passed verbatim. This preserves longitudinal behavioural structure (the LLM knows what kind of episode this is and when it started) while eliminating token-for-token repetition of resolved findings.
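The episodic abstraction stage can be sketched as a split between older findings, which collapse into a header, and findings from the last three windows, which pass through unchanged. The finding fields used here (`window`, `domain`, `onset`, `confidence`) and the function name `compress_episode` are assumptions for illustration; the real `TemporalContextCompressor` may structure findings differently.

```python
def compress_episode(findings, max_findings=12, verbatim_windows=3):
    """Collapse older findings into an episode header once the cap is hit.

    Each finding is assumed to be a dict with 'window' (int index),
    'domain', 'onset' and 'confidence' keys (hypothetical schema).
    Returns (header_or_None, findings_passed_verbatim).
    """
    if len(findings) <= max_findings:
        return None, list(findings)
    latest = max(f["window"] for f in findings)
    cutoff = latest - verbatim_windows  # windows above the cutoff stay verbatim
    recent = [f for f in findings if f["window"] > cutoff]
    older = [f for f in findings if f["window"] <= cutoff]
    if not older:
        return None, recent
    first = min(older, key=lambda f: f["window"])
    domains = "+".join(sorted({f["domain"] for f in older}))
    header = (
        f"[EPISODE_START: {domains}, onset {first['onset']}, "
        f"initial_confidence {first['confidence']:.2f}]"
    )
    return header, recent
```

The header carries the episode type, onset, and initial confidence, so the prompt size stops growing with episode length once the cap is reached.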
The compressed prompt contains:
- Trajectory narrative (domain → direction + severity): ~40–80 tokens
- Escalation summary (FSM arc): ~15 tokens
- Episodic header (if applicable): ~25 tokens
- Current-window top SHAP features (from ablation or cache): ~60 tokens
- Counterfactual (what would clear the alert): ~20 tokens
Total: ~160–200 tokens of structured context, regardless of match duration or episode length. Without compression, a 90-minute match with 5-window findings would accumulate ~2,700+ tokens of raw finding history.
The compression layer is cache-aware. Before building the trajectory narrative, TemporalContextCompressor queries RedisCheckpointStore / EpisodeStore for the player's cached SHAP attributions from the XAI cooldown period. This ensures that windows where full SHAP was not recomputed (due to the 60 s cooldown) still contribute their attribution signal to the trajectory narrative via the cached values, rather than appearing as gaps.
The system implements Cache-Augmented Generation (CAG), as opposed to Retrieval-Augmented Generation (RAG), using Redis as the backing store. The distinction is consequential for a real-time inference pipeline:

| | CAG (this system) | RAG (not used) |
|---|---|---|
| Retrieval | Deterministic key lookup (`player_id:shap`, `player_id:findings`) | Approximate nearest-neighbour search |
| Latency | O(1) Redis GET; O(log N + M) ZRANGE with small, bounded M | Vector store query latency (5–50 ms typical) |
| Correctness | Exact cached artefacts; no retrieval error | Relevant documents may not be returned |
| Domain | Closed, structured (per-player physiological history) | Open, unstructured (general knowledge) |
For a closed, structured domain like per-player physiological findings, RAG's retrieval flexibility is unnecessary and its latency and retrieval error are unacceptable within the 200 ms SLA. CAG provides the right tradeoff.
SHAP attribution cache (player_id:shap:window_ts)
After each SHAP run, the 8-feature attribution vector is written to Redis with a REDIS_CAG_TTL_S TTL (default 3600 s). During the 60-second XAI cooldown between full SHAP runs per player, the SemanticInterpreter and TemporalContextCompressor read the most recent cached attribution rather than falling back to zero-weight attribution. This means the trajectory narrative always reflects real attribution signal, not silence.
SemanticFinding history (player_id:findings sorted set)
Each SemanticFinding emitted by the SemanticInterpreter is appended to a per-player Redis sorted set, scored by Unix timestamp. EpisodeStore retrieves the N most recent findings (default N=10) before the interpreter runs, enabling trend-aware symbolic reasoning:

    # Without CAG: interpreter sees only current window
    findings = interpreter.classify(current_shap, current_window)

    # With CAG: interpreter sees current window + longitudinal context
    cached_context = cag_store.get_recent_findings(player_id, n=10)
    cached_shap = cag_store.get_latest_shap(player_id)
    findings = interpreter.classify(current_shap, current_window,
                                    context=cached_context,
                                    prior_shap=cached_shap)

This is the critical enabler for multi-window trend detection: the interpreter can classify a finding as persistence (recurrent pattern) rather than first_occurrence only because the cached history is available without reprocessing the MatchStateManager trajectory.
When Redis is unavailable, RedisCheckpointStore / EpisodeStore returns empty context objects. The SemanticInterpreter falls back to single-window classification, and the TemporalContextCompressor builds trajectory narratives from in-memory MatchStateManager state only. No alerts are suppressed; the finding quality degrades gracefully from trend-aware to window-local.
REDIS_CAG_AVAILABLE=True → trend-aware findings, full trajectory narrative
REDIS_CAG_AVAILABLE=False → window-local findings, in-memory trajectory only
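The degradation path can be sketched as a thin wrapper around a Redis client. `CagStore` below is a hypothetical stand-in for the project's store classes; it assumes a client that exposes redis-py style `zadd(name, mapping)` and `zrevrange(name, start, end)`, and it takes the "Redis is down" exception types as a parameter so the block runs without the `redis` package installed.

```python
import json
import time


class CagStore:
    """Hypothetical wrapper; `client` is any object with redis-py style
    zadd / zrevrange methods (for example a redis.Redis instance)."""

    def __init__(self, client, unavailable_errors=(ConnectionError, TimeoutError)):
        self.client = client
        # With redis-py, pass redis.exceptions.ConnectionError and
        # redis.exceptions.TimeoutError here instead of the builtins.
        self.unavailable_errors = tuple(unavailable_errors)

    def add_finding(self, player_id, finding, ts=None):
        """Append one finding to the per-player sorted set, scored by Unix time."""
        ts = time.time() if ts is None else ts
        member = json.dumps(finding, sort_keys=True)
        try:
            self.client.zadd(f"{player_id}:findings", {member: ts})
            return True
        except self.unavailable_errors:
            return False

    def get_recent_findings(self, player_id, n=10):
        """Newest-first list of up to n findings; empty when Redis is down."""
        try:
            raw = self.client.zrevrange(f"{player_id}:findings", 0, n - 1)
        except self.unavailable_errors:
            return []  # caller falls back to window-local classification
        return [json.loads(item) for item in raw]
```

Returning an empty list rather than raising keeps the caller's code path identical in both modes: the interpreter simply receives no longitudinal context.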
Telemetry validation operates in two stages:
- Pre-accumulation temporal validation (event-level): detects timestamp reversals and epoch discontinuities before the event enters the accumulator, triggering epoch-scoped runtime resets when continuity cannot be preserved.
- Post-window semantic validation (window-level): physical plausibility checks run after a complete window is emitted.
Pre-accumulation checks:
| Check | Live behaviour | Replay behaviour (`--replay-mode`) |
|---|---|---|
| Timestamp reversal | `non_monotonic_timestamp` → INVALID, buffer reset | `replay_non_monotonic_timestamp` → DEGRADED (confidence 0.7, floored to 0.8) |
| Timestamp gap > 60 s | buffer reset | no reset (gaps expected between seasons) |
Post-window checks:
| Check | Live behaviour | Replay behaviour |
|---|---|---|
| Mask completeness | <75% of required fields → INVALID | same |
| Physical plausibility | speed >13.5 m/s (+20% margin), HR outside [30, 220] bpm, accel >12 m/s² → INVALID | same |
| Timestamp gap > 5 s | `timestamp_gap_*` → confidence -0.3 | `replay_timestamp_gap_*` → no penalty |
Replay-specific issue strings (replay_*) are distinct from live equivalents so audit queries on non_monotonic_timestamp or timestamp_gap_* continue to find only genuine sensor failures, not expected replay stream disorder.
Status values: VALID (confidence=1.0), DEGRADED (0.0–0.8), INVALID (0.0). Inference is blocked for INVALID events. DEGRADED events are inferred but flagged. The Alert FSM shifts to HOLD when event confidence <0.4.
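The status rules can be condensed into a small classifier, shown here as a sketch under several assumptions. Only the 0.3 gap penalty, the 0.8 replay floor, and the 0.4 HOLD threshold come from the text; the replay reversal penalty is inferred from the "confidence 0.7" entry, the issue names `mask_incomplete` and `implausible_*` are hypothetical, and treating penalty-free issues as VALID is a simplification.

```python
HOLD_BELOW = 0.4    # Alert FSM shifts to HOLD below this confidence
DEGRADED_CAP = 0.8  # upper bound of the DEGRADED range
REPLAY_FLOOR = 0.8  # replay-mode floor applied to DEGRADED events

# Issue prefix -> confidence penalty (replay value inferred, see lead-in).
PENALTIES = {"timestamp_gap_": 0.3, "replay_non_monotonic_timestamp": 0.3}
# Issue prefixes that block inference; the last two names are hypothetical.
INVALID_PREFIXES = ("non_monotonic_timestamp", "mask_incomplete", "implausible_")


def classify(issues, replay_mode=False):
    """Map a list of TVL issue strings to (status, confidence)."""
    if any(issue.startswith(INVALID_PREFIXES) for issue in issues):
        return "INVALID", 0.0
    penalty = sum(
        value
        for issue in issues
        for prefix, value in PENALTIES.items()
        if issue.startswith(prefix)
    )
    if penalty == 0:
        # Includes replay_timestamp_gap_* issues, which carry no penalty.
        return "VALID", 1.0
    confidence = min(DEGRADED_CAP, max(0.0, 1.0 - penalty))
    if replay_mode:
        confidence = max(confidence, REPLAY_FLOOR)
    return "DEGRADED", round(confidence, 2)


def should_hold(confidence):
    return confidence < HOLD_BELOW
```

Because replay issue strings start with `replay_`, they never match the live prefixes, which mirrors the separation between live and replay labels described above.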

              signal_active (>= min_persistence windows)
    NONE ──────────────────────────────────────────────▶ WARNING
     ▲                                                      │
     │ recovery (>= recovery_threshold clear windows)       │ signal_active (>= escalation_threshold)
     │                                                      ▼
     └────────────────────────────────────────────────── CRITICAL
     ◀──────── HOLD (confidence < 0.4) ───────────
     ◀──────── SAFE_MODE (system-wide) ────────────

- Hysteresis: Transitions only escalate within an episode; de-escalation requires `recovery_threshold` (default 3) consecutive clear windows.
- Alert family cooldown: 20 s cooldown per player. Switching alert type resets the cooldown immediately, ensuring the first instance of a new type is never suppressed.
- Episode tracking: Each `NONE → WARNING` transition increments `episode_id`, enabling episode-level TP/FP/FN evaluation via `utils/evaluation/episodes.py`.
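The escalation and recovery rules can be sketched as a small state machine. This is an illustrative reduction, not the project's `AlertManager`: the defaults for `min_persistence` and `escalation_threshold` are assumed, a clear window is assumed to reset the active streak, and HOLD is modelled as a per-window output that leaves the counters untouched. Cooldowns and Safe Mode are omitted.

```python
from dataclasses import dataclass


@dataclass
class AlertFsm:
    min_persistence: int = 2       # assumed default
    escalation_threshold: int = 4  # assumed default
    recovery_threshold: int = 3    # default stated in the text
    hold_below: float = 0.4
    state: str = "NONE"
    episode_id: int = 0
    active_streak: int = 0
    clear_streak: int = 0

    def step(self, signal_active: bool, confidence: float = 1.0) -> str:
        if confidence < self.hold_below:
            return "HOLD"  # low-confidence window: do not advance any counter
        if signal_active:
            self.active_streak += 1
            self.clear_streak = 0
            if self.state == "NONE" and self.active_streak >= self.min_persistence:
                self.state = "WARNING"
                self.episode_id += 1  # a new episode starts on NONE -> WARNING
            elif self.state == "WARNING" and self.active_streak >= self.escalation_threshold:
                self.state = "CRITICAL"
        else:
            self.clear_streak += 1
            self.active_streak = 0
            # Hysteresis: de-escalate only after enough consecutive clear windows.
            if self.state != "NONE" and self.clear_streak >= self.recovery_threshold:
                self.state = "NONE"
        return self.state
```

One or two clear windows inside an episode leave the state unchanged, which is the hysteresis behaviour the first bullet describes.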
| Level | Trigger | Features disabled |
|---|---|---|
| `NORMAL` | - | None |
| `LEVEL_1` | SHAP/LLM violation or TVL DEGRADED | SHAP explanation; LLM NLG |
| `LEVEL_2` | Invariant violation (e.g. model-threshold mismatch) | Above + adaptive calibration frozen |
| `LEVEL_3` | Critical invariant failure | Above + inference suspended; all alerts suppressed |

Safe Mode propagates from SystemInvariantGuard → AlertManager.set_safe_mode() → all downstream consumers.
Buffers 24 raw telemetry packets per player before emitting one inference window (non-overlapping, stride = window_size):
- 1,092 telemetry packets → ~45 inference cycles instead of 1,092
- Reduces: alert duplication from near-identical overlapping buffers, fake persistence increments on every packet, exploding trajectory lengths, motif reinforcement without new information
- Resets automatically on confirmed continuity breaks (session boundary transitions, timestamp discontinuities, epoch-scale temporal gaps)
- Buffer resets propagate through a unified epoch-reset path that atomically clears EMA state, positional trajectory buffers, alert FSM persistence state, rolling match-state trajectories, TVL per-player timestamp history, and output cooldown gates
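A minimal sketch of non-overlapping accumulation with stride equal to the window size. `WindowAccumulator` is a hypothetical simplification of `LiveWindowAccumulator`: it keeps only the per-player buffer and a reset hook, and leaves out the unified epoch-reset path.

```python
from collections import defaultdict


class WindowAccumulator:
    def __init__(self, window_size: int = 24):
        self.window_size = window_size
        self._buffers = defaultdict(list)

    def push(self, player_id, event):
        """Buffer one event; return a full window every `window_size` events."""
        buf = self._buffers[player_id]
        buf.append(event)
        if len(buf) < self.window_size:
            return None
        # Start a fresh buffer so consecutive windows share no events.
        self._buffers[player_id] = []
        return buf

    def pending(self, player_id) -> int:
        return len(self._buffers.get(player_id, []))

    def reset(self, player_id) -> None:
        """Drop a partial window, e.g. on a confirmed continuity break."""
        self._buffers.pop(player_id, None)
```

Feeding 1,092 events for one player through `push` yields 45 complete windows and leaves 12 events pending, which matches the figure in the first bullet.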
The SLA timer (t_start) is set immediately after the accumulator emits a complete window. The 200 ms budget measures inference time (LSTM forward pass, threshold comparison, result assembly, and state compression) and is not inflated by accumulation time or asynchronous LLM NLG.
Event fingerprinting (MutationJournal): Each calibration update is content-hashed. Idempotent replay: duplicate updates are silently dropped.
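Content-hashed idempotency can be illustrated with a few lines of standard-library code. The class below is a hypothetical stand-in for `MutationJournal`; the choice of SHA-256 over canonical JSON is an assumption, since the text does not specify the hash or the serialisation.

```python
import hashlib
import json


class Journal:
    def __init__(self):
        self._seen = set()
        self.entries = []

    @staticmethod
    def fingerprint(update: dict) -> str:
        # Canonical JSON (sorted keys, fixed separators) so that key order
        # does not change the hash.
        payload = json.dumps(update, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def record(self, update: dict) -> bool:
        """Return True if the update was new, False if it was a duplicate."""
        digest = self.fingerprint(update)
        if digest in self._seen:
            return False  # duplicate update: silently dropped
        self._seen.add(digest)
        self.entries.append((digest, update))
        return True
```

Replaying the same stream of calibration updates twice leaves `entries` unchanged after the first pass, which is what makes the replay idempotent.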
Temporal Causality Guard: Detects timestamp reversals and epoch discontinuities before accumulation, triggering epoch-scoped runtime resets. Configurable strict/warn mode.
Priority-aware backpressure (BoundedPriorityQueue): Under load, tasks are shed in reverse priority (LLM summaries dropped first, then SHAP, then inference), ensuring the 200 ms SLA is preserved even when the LLM is slow or unavailable.
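The shedding order can be sketched with `heapq`. This is an illustrative version only: the priority numbers, the task kinds as strings, and the policy of returning the shed task to the caller are assumptions, and the real `BoundedPriorityQueue` may differ.

```python
import heapq
import itertools

# Lower number = higher priority; shed from the high-number end first.
PRIORITY = {"inference": 0, "shap": 1, "llm": 2}


class BoundedPriorityQueue:
    def __init__(self, maxsize: int):
        self.maxsize = maxsize
        self._heap = []
        self._counter = itertools.count()  # FIFO tie-break within a priority

    def put(self, kind, payload):
        """Enqueue a task; return the (kind, payload) that was shed, or None."""
        entry = (PRIORITY[kind], next(self._counter), kind, payload)
        if len(self._heap) < self.maxsize:
            heapq.heappush(self._heap, entry)
            return None
        worst = max(self._heap)  # lowest priority, newest among ties
        if entry[0] >= worst[0]:
            return (kind, payload)  # incoming task is the least important
        self._heap.remove(worst)
        heapq.heapify(self._heap)
        heapq.heappush(self._heap, entry)
        return worst[2:]

    def get(self):
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload
```

With a capacity of two, queuing an LLM task, a SHAP task, and then an inference task sheds the LLM task, so inference work is never displaced by explanation work.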
Replay consistency is a first-class design concern. Most sports AI systems process historical data without guaranteeing that the inference, alert, and explanation outputs produced during replay are bitwise-reproducible and semantically equivalent to what would have been produced in live operation. This system provides explicit guarantees across four layers.
The TemporalCausalityGuard enforces strict event-time monotonicity across all ingestion paths. In replay mode, timestamp reversals that are expected artefacts of interleaved multi-session streams are classified as replay_non_monotonic_timestamp (DEGRADED, confidence floored at 0.8) rather than triggering buffer resets. This preserves inference continuity through interleaved streams while keeping the live non_monotonic_timestamp marker clean for genuine sensor failure auditing.
The replay-specific issue taxonomy (replay_non_monotonic_timestamp, replay_timestamp_gap_*) is distinct from live equivalents at every layer (TVL classification, log emission, and audit query), so post-match analysis of replay logs cannot be contaminated by expected stream disorder.
In live mode, the LiveWindowAccumulator resets on session boundary transitions and timestamp gaps > 60 s. In --replay-mode, these resets are suppressed because historical streams routinely interleave events from unrelated source sessions, and session-boundary transitions in the stream do not represent genuine continuity breaks. The accumulator instead relies solely on the TemporalCausalityGuard for epoch-scoped resets, preserving the same accumulation semantics that governed alert persistence during live operation.
The unified epoch-reset path ensures that when a continuity break does occur in replay, the full runtime state is cleared atomically: EMA smoothing state, positional trajectory buffers, Alert FSM persistence counters, rolling match-state trajectories, TVL timestamp history, Redis CAG context (player findings and SHAP cache), and output cooldown gates are all reset together. Partial state resets (where the FSM clears but the EMA does not, for example) are architecturally prevented by routing all resets through a single reset_player() call chain.
The 0.8 confidence floor applied to DEGRADED replay events is propagated consistently through the full pipeline: from TelemetryValidityLayer._effective_confidence() through AlertManager (which gates on confidence < 0.4 for HOLD) and through the inference log (confidence field in every NDJSON entry). This means that post-match confidence distributions computed from the inference log accurately reflect the replay-time confidence behaviour, enabling reproducible threshold sensitivity analysis.
The --replay-mode flag is threaded from cmd_serve → _build_pipeline(replay_mode) → PlayersDataAnalysisPipeline(replay_mode) → TelemetryValidityLayer(replay_mode) → process_window_direct(replay_mode) → _effective_confidence() without duplicating policy logic at any layer.

| Behaviour | Live mode | Replay mode (`--replay-mode`) |
|---|---|---|
| Timestamp reversal | INVALID → buffer reset | DEGRADED (conf 0.8) → inference proceeds |
| Timestamp gap > 60 s | Buffer reset | No reset |
| Session boundary transition | Buffer reset | No reset |
| TVL issue label | `non_monotonic_timestamp` | `replay_non_monotonic_timestamp` |
| Audit query contamination | Genuine sensor failures only | Replay disorder isolated to `replay_*` labels |
| Confidence floor on DEGRADED | 0.0–0.8 (unclamped) | 0.8 (floored) |
These differences are intentional and documented. They ensure replay outputs are maximally useful for post-match analysis and debugging while preserving the integrity of live sensor-failure auditing.
FairnessMonitor computes flag-rate disparity across three protected attributes:
| Attribute | Groups examined |
|---|---|
| `position` | GK, CB, LB, RB, CM, AM, LW, RW, ST |
| `age_group` | U21, Senior, Veteran |
| `nationality` | All unique nationalities in the squad |
A group whose flag rate deviates more than fairness.flag_rate_disparity_threshold (default 15%) from the squad mean is flagged as biased. The audit command exits with code 5 and identifies the biased groups in the output JSON.
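A sketch of the disparity check, under two stated assumptions: the 15% threshold is read as an absolute difference in flag rate (percentage points), and the squad mean is the pooled flag rate over all windows. The record layout and function name are hypothetical, and `FairnessMonitor` may define both differently.

```python
def biased_groups(records, attribute, threshold=0.15):
    """Return {group: flag_rate - squad_rate} for groups beyond the threshold.

    `records` is an iterable of dicts, one per inference window, each with a
    boolean "flagged" key and the protected attribute (e.g. "position").
    """
    totals = {}
    for record in records:
        flagged, seen = totals.get(record[attribute], (0, 0))
        totals[record[attribute]] = (flagged + int(record["flagged"]), seen + 1)
    windows = sum(seen for _, seen in totals.values())
    if windows == 0:
        return {}
    squad_rate = sum(flagged for flagged, _ in totals.values()) / windows
    deviations = {g: flagged / seen - squad_rate for g, (flagged, seen) in totals.items()}
    return {g: d for g, d in deviations.items() if abs(d) > threshold}
```

A non-empty result corresponds to the condition under which the audit command would report bias.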
RecalibrationPipeline runs when >=10 coach override records (OverrideRecord) have been logged for a player within the recalibration window. It adjusts per-player thresholds by feedback.threshold_adjustment_step (default ±5%) and applies a feedback.per_player_sensitivity_decay (default 10%) to prevent runaway threshold drift. Default cadence: every 7 days.
All threshold adjustments are recorded in MutationJournal for full auditability and replay-safe reconstruction.
Human-readable (default, to stderr):
2025-09-14T19:42:11Z INFO players_data.main Serve complete | events=1092 alerts=23 sla_violations=0
Structured JSON (set JSON_LOGS=1):
{"ts": "2025-09-14T19:42:11Z", "level": "INFO", "logger": "players_data.main", "message": "ALERT player=p007 type=substitution conf=0.92 latency=47.1 ms"}

The inference log is written by `serve` for every processed window (not just alerts), one NDJSON entry per window. Fields: inference_id, player_id, external_id, session_id, recommendation_type, is_anomaly, anomaly_score, confidence, fatigue_flag, drift_flag, workload_flag, nlg_summary, compression_tokens, cag_hit, ts.
cag_hit: true indicates the window used Redis-cached SHAP attributions (XAI cooldown was active). compression_tokens records the token count of the compressed state passed to the LLM, enabling prompt efficiency monitoring.
When async LLM NLG completes, an enriched entry is appended with "_nlg_enrichment": true, nlg_summary_llm, and full shap_values.
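For prompt-efficiency monitoring, the inference log can be summarised with a short script. The sketch below assumes one JSON object per line with the `cag_hit`, `compression_tokens`, and `_nlg_enrichment` fields described above; the function name and the returned keys are made up for this example.

```python
import json


def summarise_inference_log(lines):
    """Compute the CAG hit rate and mean compressed-prompt size per window."""
    windows = cag_hits = tokens = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("_nlg_enrichment"):
            continue  # enrichment entries repeat a window; count it once
        windows += 1
        cag_hits += bool(entry.get("cag_hit"))
        tokens += entry.get("compression_tokens") or 0
    if windows == 0:
        return {"windows": 0, "cag_hit_rate": 0.0, "mean_compression_tokens": 0.0}
    return {
        "windows": windows,
        "cag_hit_rate": cag_hits / windows,
        "mean_compression_tokens": tokens / windows,
    }
```

Passing an open file handle works as well as a list of strings, since the function only iterates over lines.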
| Log pattern | Meaning |
|---|---|
| `ALERT player=… type=… conf=… latency=… ms` | Alert emitted to stdout |
| `SLA breach: player=… latency=…ms > 200ms` | Inference exceeded SLA; investigate model load |
| `CAG hit: player=… shap_cached=True findings_cached=N` | SemanticInterpreter augmented from Redis |
| `CAG miss: player=… redis_unavailable` | Redis down; falling back to single-window classification |
| `STATE COMPRESSED: player=… tokens=… findings_collapsed=N` | Episodic abstraction triggered; N findings collapsed to header |
| `BUFFER RESET reason=session_change` | LiveWindowAccumulator cleared on new session |
| `EPOCH RESET \| player=… reason=… cleared=[…]` | Unified runtime state reset triggered by continuity break |
| `Telemetry degraded player=… status=INVALID issues=[…]` | TVL rejected event; only live sensor issues appear at WARNING |
| `AlertManager: ENTERING GLOBAL SAFE MODE` | System-wide alert suppression active |
| `SHAP computation failed, using fallback` | SHAP library error; magnitude-proxy used |
| `Slow Ollama call: model=… ms` | LLM NLG took longer than expected; alert already emitted |
| `circuit breaker tripped → switching to template NLG` | Ollama unavailable; template fallback active for 30 s |
Replay-specific TVL issues (replay_non_monotonic_timestamp, replay_timestamp_gap_*) are logged at DEBUG level only and do not appear in WARNING output during normal replay operation.
| Code | Command(s) | Condition |
|---|---|---|
| `0` | all | Success |
| `1` | `generate`, `train`, `audit` | Data or validation error (missing files, empty tables, parse failure) |
| `2` | `train`, `evaluate`, `serve` | Model error (not trained, corrupt checkpoint, zero windows) |
| `3` | `evaluate` | ROC-AUC below `--min-auc`, or no players produced evaluable windows |
| `4` | `serve` | Unhandled stream error |
| `5` | `audit` | Bias detected in a protected attribute group |
Current limitations:
- The 200 ms SLA covers inference and state compression (T1). LLM NLG generation (T2) is asynchronous and decoupled via a 30 s timeout with deterministic template fallback. Both latencies are observable separately in logs and the inference log.
- Temporal feature ablation explains the derived feature vector, not raw LSTM hidden states. True SHAP over the full sequence space would require ~2,000 model calls per window (~2–15 s), violating the SLA.
- `SessionRegimeClassifier` uses rule-based Territory × Intensity bins. Match phase (first/second half) is not included because elapsed-time context is not threaded through the calibration interface at training time.
- `PatternAnalysisEngine` is not thread-safe. One engine per asyncio event loop or per process is the supported deployment model.
- `TransformerAutoencoder` is experimental and disabled in production.
- Redis CAG TTL is uniform across all artefact types. SHAP attributions and `SemanticFinding` objects have different useful lifetimes (SHAP: ~1 match; findings: ~1 session) that a tiered TTL policy would address.
- Historical replay streams may interleave telemetry from unrelated source sessions. Anomaly scores in replay mode will vary across gate windows as the stream cycles through different historical sessions.
Roadmap:
- Learned GMM regime detector to replace rule-based Territory × Intensity bins, enabling data-driven regime discovery.
- Async `PatternAnalysisEngine` with per-player actor isolation for horizontal scaling.
- SHAP over LSTM hidden states via integrated gradients (`GradientExplainer`), which eliminates the sequence-space dimensionality problem.
- Kafka consumer integration for multi-worker `serve` deployments.
- FastAPI wrapper exposing `process_window_direct()` as a REST endpoint for integration with external dashboards.
- Elapsed-time axis in regime classification (match phase as a third regime dimension).
- Tiered Redis TTL policy: short TTL for SHAP attributions (match-scoped), longer TTL for compressed episodic abstractions (season-scoped post-match analysis).
- Redis Streams integration for distributed exactly-once event fingerprinting across multi-worker `serve` deployments.
- Rein & Memmert (2016). Big data and tactical analysis in elite soccer; DOI: https://doi.org/10.1186/s40064-016-3108-2
- Foteinakis et al. (2025). Explainable ML for Basketball; DOI: https://doi.org/10.3390/app152312401
- Odet et al. (2024). ML and Explainability for Sports Outcome Prediction
- Pietraszewski et al. (2025). AI in Sports Analytics systematic review; DOI: https://doi.org/10.3390/app15137254
- Kranzinger et al. (2025). Explainable AI in Sports Science
- Lundberg & Lee (2017). SHAP: A Unified Approach to Interpreting Model Predictions; DOI: https://doi.org/10.48550/arXiv.1705.07874
- Hochreiter & Schmidhuber (1997). Long Short-Term Memory
- Bai et al. (2018). An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling; DOI: https://doi.org/10.48550/arXiv.1803.01271
- Caron & Müller (2023). TacticalGPT: Uncovering the Potential of LLMs for Predicting Tactical Decisions in Professional Football
- Ferrara (2024). Large Language Models for Wearable Sensor-Based Human Activity Recognition; DOI: https://doi.org/10.3390/s24155045
- Yang (2024). ChatPPG: Multi-Modal Alignment of Large Language Models for Time-Series Forecasting in Table Tennis
- Tian et al. (2025). SportsGPT: An LLM-driven Framework for Interpretable Sports Motion Assessment and Training Guidance; DOI: https://doi.org/10.48550/arXiv.2512.14121
- Liu et al. (2024). Smartboard: Visual Exploration of Team Tactics with LLM Agent; DOI: https://doi.org/10.1109/TVCG.2024.3456200
- Feli et al. (2025). An LLM-Powered Agent for Physiological Data Analysis; DOI: https://doi.org/10.1109/EMBC58623.2025.11254428
- Xia et al. (2024). SportQA: A Benchmark for Sports Understanding in Large Language Models; DOI: https://doi.org/10.18653/v1/2024.naacl-long.283
- Apostolou & Tjortjis (2019). Sports Analytics algorithms for performance prediction; DOI: https://doi.org/10.1109/IISA.2019.8900754
- Sarlis & Tjortjis (2020). Sports analytics - Evaluation of basketball players and team performance; DOI: https://doi.org/10.1016/j.is.2020.101562
- Ghosh et al. (2023). Sports analytics review: AI applications, emerging technologies, and algorithmic perspective; DOI: https://doi.org/10.1002/widm.1496
- Chan et al. (2025). Don't Do RAG: When Cache-Augmented Generation is Better than Retrieval Augmented Generation; DOI: https://doi.org/10.48550/arXiv.2412.15605
- Lewis et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks; DOI: https://doi.org/10.48550/arXiv.2005.11401
- Perez et al. (2018). FiLM: Visual Reasoning with a General Conditioning Layer; AAAI 2018
- Gabbett (2016). The training-injury prevention paradox; DOI: https://doi.org/10.1136/bjsports-2015-095788