diff --git a/.kiro/specs/proxy-system-hardening/design.md b/.kiro/specs/proxy-system-hardening/design.md
new file mode 100644
index 000000000..1bbe2553d
--- /dev/null
+++ b/.kiro/specs/proxy-system-hardening/design.md
@@ -0,0 +1,36 @@
+# Design: Proxy System Hardening
+
+## Approach (Start Small)
+- Keep existing ProxyInjector/ProxyPool; add minimal extensions for health, retry, auth, metrics.
+- Single retry on failure to avoid complexity; no circuit breaker.
+- Health checker: periodic async task owned by ProxyInjector; uses TCP/HTTP ping via TCPHealthChecker; marks (un)healthy in ProxyPool.
+- Config loading: implement env/config/explicit precedence in `load_proxy_settings` + FeedHandler normalization; validate non-empty proxies when enabled.
+
+## Components & Changes
+1) Config Loader & Validation
+- `load_proxy_settings`: build ProxySettings from env; keep existing pydantic env support; add fallback to YAML/default; merge precedence.
+- Validation: raise if enabled with zero proxies; validate URLs and strategy values; support username/password on ProxyUrlConfig and propagate.
+
+2) Health Checker
+- New `ProxyHealthService` in `proxy.py` started when settings.health.enabled.
+- Runs every `interval_seconds` (default 30s, min 5s, max 300s): for each ProxyPool proxy, run TCPHealthChecker (timeout/retry from HealthCheckConfig); mark unhealthy/healthy accordingly.
+- Expose hooks for tests to inject a stub health checker and force single-pass execution.
+
+3) Retry/Fallback on Failure
+- HTTP: on session creation failure, mark proxy unhealthy, release, retry once with another proxy (if available); structured log + metric; fallback preserves legacy proxy kwarg when proxy system disabled.
+- WS: on connect failure after leasing, mark unhealthy, release, retry once with another proxy; ensure release in finally; error with guidance if no alternative proxy.
+
+4) Observability
+- Structured logs: lease/select/release, retry, health result (transport, exchange, proxy_url, status, reason).
+- Metrics (minimal counters/gauges): `proxy_leases_total`, `proxy_lease_failures_total`, `proxy_retries_total`, `proxy_unhealthy_gauge`, `proxy_health_success_total`, `proxy_health_failure_total`.
+
+5) Auth & Scheme Validation
+- ProxyUrlConfig: add username/password optional; support http/https/socks4/socks4a/socks5/socks5h; error on unknown scheme with guidance.
+
+## Testing Strategy
+- Unit: precedence merge; validation errors; auth propagation; pool unhealthy fallback selection; retry-on-failure picks different proxy.
+- Integration (mocked sockets): HTTP session with proxy + retry; WS connect with python-socks mocked; health checker marks/unmarks.
+- Metrics/log assertions: verify counters/log entries emitted on lease/retry/health.
+
+## Out of Scope
+- External proxy manager, advanced circuit breaker, multi-region HA, GUI/CLI.
diff --git a/.kiro/specs/proxy-system-hardening/requirements.md b/.kiro/specs/proxy-system-hardening/requirements.md
new file mode 100644
index 000000000..a20191c7e
--- /dev/null
+++ b/.kiro/specs/proxy-system-hardening/requirements.md
@@ -0,0 +1,36 @@
+# Requirements: Proxy System Hardening (Spec: proxy-system-hardening)
+
+## Goals
+- Reliable proxy selection with clear precedence (env > explicit argument > config YAML) and validation.
+- Resilient leasing with health tracking, retry/fallback, and safe release for HTTP/WS.
+- Observability: structured logs and minimal metrics for selection, failures, health, and retries.
+- Keep KISS/Start-Small: single retry, simple health checker, no external services.
+
+## Scope
+- IN: proxy settings loading/validation; ProxyInjector/ProxyPool behavior; HTTP/WebSocket connection usage; health checking; logging/metrics.
+- OUT: external proxy service, HA control planes, advanced circuit breakers, GUI/CLI tools.
+
+## Functional Requirements
+1. **Config Precedence**: proxy settings load from env (`CRYPTOFEED_PROXY_*`), then config YAML (`proxy` key), then explicit `proxy_settings` passed to FeedHandler; later sources override earlier.
+2. **Validation**: when `proxy.enabled` is true, at least one enabled proxy (default or exchange-specific) must exist; pool strategies must be supported; invalid URLs raise clear errors.
+3. **Auth Support**: `ProxyUrlConfig` supports optional username/password and propagates to HTTP (aiohttp) and WS (python-socks) connectors.
+4. **Health Checks**: optional periodic health checker marks proxies unhealthy/healthy using `HealthCheckConfig`; integrates with pools.
+5. **Retry/Fallback**: on connection failure, mark proxy unhealthy and retry once with a different proxy (if available) for HTTP and WS flows; always release leases on failure.
+6. **Observability**: structured logs for lease/select/release, failures, retries, health results; metrics counters for leases, lease_failures, retries, unhealthy_count, health_success/fail.
+7. **WS Compatibility**: HTTP/SOCKS proxies validated; unknown schemes error with guidance; python-socks missing → actionable ImportError.
+
+## Non-Functional Requirements
+- Minimal overhead: health interval configurable; default healthy path unchanged.
+- Backward compatible: legacy `proxy` kwarg on connections continues to work when proxy system disabled.
+- Testable: unit/integration coverage for precedence, retry, health, and metrics/logs.
+
+## Success Criteria
+- Env > config > explicit precedence proven by tests.
+- At least one failing-proxy retry succeeds or surfaces clear error; unhealthy list updated.
+- Metrics/log lines emitted for lease/retry/health paths.
+- Health checker can mark unhealthy and recover to healthy after passing checks.
+
+## Dependencies
+- Existing proxy system modules (`proxy.py`, `proxy_config.py`, `proxy_pool.py`), feedhandler initialization, connection HTTP/WS paths.
+- python-socks / aiohttp dependencies already present.
+
diff --git a/.kiro/specs/proxy-system-hardening/spec.json b/.kiro/specs/proxy-system-hardening/spec.json
new file mode 100644
index 000000000..0461dc6f8
--- /dev/null
+++ b/.kiro/specs/proxy-system-hardening/spec.json
@@ -0,0 +1,14 @@
+{
+  "name": "proxy-system-hardening",
+  "version": "0.1.0",
+  "status": "init",
+  "created": "2025-11-29",
+  "updated": "2025-11-29",
+  "description": "Hardening of proxy system: config precedence, health checks, retry/failover, observability",
+  "language": "en",
+  "approvals": {
+    "requirements": {"generated": false, "approved": false},
+    "design": {"generated": false, "approved": false},
+    "tasks": {"generated": false, "approved": false}
+  }
+}
diff --git a/.kiro/specs/proxy-system-hardening/tasks.md b/.kiro/specs/proxy-system-hardening/tasks.md
new file mode 100644
index 000000000..b7bbada89
--- /dev/null
+++ b/.kiro/specs/proxy-system-hardening/tasks.md
@@ -0,0 +1,17 @@
+# Tasks: Proxy System Hardening
+
+## Phase 1: Config & Validation
+1. Implement env/config/explicit precedence in `load_proxy_settings` and FeedHandler init; add tests for precedence and non-empty proxies when enabled.
+2. Extend `ProxyUrlConfig` with username/password and scheme validation; update HTTP/WS paths to use credentials.
+
+## Phase 2: Health Checking
+3. Add `ProxyHealthService` periodic checker using `HealthCheckConfig`; mark unhealthy/healthy in ProxyPool; tests with stub checker.
+
+## Phase 3: Retry & Observability
+4. Add single-retry-on-failure for HTTP session creation with unhealthy marking/release; structured logs + metrics.
+5. Add single-retry-on-failure for WS connect with unhealthy marking/release; structured logs + metrics.
+6. Add metrics counters/gauges (leases, lease_failures, retries, unhealthy, health successes/failures) and minimal log formats; tests asserting emissions.
+
+## Phase 4: Documentation
+7. Update proxy docs with config examples (env/YAML/explicit), auth, health, retry behavior, metrics names.
+
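
Review note: the single-retry behavior described in tasks 4 and 5 (mark the failed proxy unhealthy, release its lease, retry once with a different proxy) could look roughly like the sketch below. This is an illustration under stated assumptions, not the planned implementation: `SimplePool`, `NoProxyAvailable`, and `connect_with_single_retry` are hypothetical names, and the real `ProxyPool` API may differ. The `connect` callable stands in for either HTTP session creation or a WS connect.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Optional, Set, Tuple


class NoProxyAvailable(RuntimeError):
    """Raised when no healthy, untried proxy is left (hypothetical name)."""


class SimplePool:
    """Hypothetical stand-in for ProxyPool; tracks leases and health only."""

    def __init__(self, proxies: Iterable[str]) -> None:
        self._proxies = list(proxies)
        self.unhealthy: Set[str] = set()
        self.leased: Set[str] = set()

    def lease(self, exclude: Set[str]) -> Optional[str]:
        # Return the first healthy proxy that has not been tried yet.
        for proxy in self._proxies:
            if proxy not in self.unhealthy and proxy not in exclude:
                self.leased.add(proxy)
                return proxy
        return None

    def release(self, proxy: str) -> None:
        self.leased.discard(proxy)

    def mark_unhealthy(self, proxy: str) -> None:
        self.unhealthy.add(proxy)


async def connect_with_single_retry(
    pool: SimplePool,
    connect: Callable[[str], Awaitable[object]],
) -> Tuple[object, str]:
    """Try one proxy, then at most one alternative.

    On success the lease stays held and the caller releases it when the
    connection closes. On failure the proxy is marked unhealthy and its
    lease is released before the retry.
    """
    tried: Set[str] = set()
    last_error: Optional[BaseException] = None
    for _attempt in range(2):  # first attempt plus a single retry
        proxy = pool.lease(exclude=tried)
        if proxy is None:
            break
        tried.add(proxy)
        try:
            return await connect(proxy), proxy
        except OSError as exc:  # connection-level failures only
            last_error = exc
            pool.mark_unhealthy(proxy)
            pool.release(proxy)
    raise NoProxyAvailable(
        "no healthy alternative proxy; check proxy configuration"
    ) from last_error


if __name__ == "__main__":
    async def demo_connect(proxy: str) -> str:
        # Simulate the first proxy being down.
        if proxy.endswith("a:1080"):
            raise ConnectionRefusedError("proxy down")
        return f"connected via {proxy}"

    demo_pool = SimplePool(["socks5://a:1080", "socks5://b:1080"])
    print(asyncio.run(connect_with_single_retry(demo_pool, demo_connect)))
```

Bounding the loop at two attempts keeps the "single retry, no circuit breaker" constraint explicit, and the `tried` set ensures the retry never picks the proxy that just failed, even before the health checker has run.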