Commit a573ad0

feat(auth): replace static hotkey/API-key auth with Bittensor validator whitelisting and 50% consensus (#5)
* feat(auth): replace static hotkey/API-key auth with Bittensor validator whitelisting and 50% consensus

  Integrate dynamic validator whitelisting from Bittensor netuid 100 and consensus-based evaluation triggering, replacing the previous single AUTHORIZED_HOTKEY + WORKER_API_KEY authentication system.

  Authentication now uses a dynamic whitelist of validators fetched every 5 minutes from the Bittensor blockchain via bittensor-rs. Validators must have validator_permit, be active, and have >=10,000 TAO stake. POST /submit requests only trigger evaluations when >=50% of whitelisted validators submit the same archive payload (identified by SHA-256 hash).

  New modules:
  - src/validator_whitelist.rs: ValidatorWhitelist with parking_lot::RwLock, background refresh loop with 3-retry exponential backoff, connection resilience (keeps cached whitelist on failure), starts empty and rejects requests with 503 until first successful sync
  - src/consensus.rs: ConsensusManager using DashMap for lock-free vote tracking, PendingConsensus entries with TTL (default 60s), reaper loop every 30s, max 100 pending entries cap, duplicate vote detection

  Modified modules:
  - src/auth.rs: Removed AUTHORIZED_HOTKEY import, api_key field from AuthHeaders, X-Api-Key header extraction, InvalidApiKey error variant. verify_request() now takes &ValidatorWhitelist instead of API key string. Updated all tests accordingly.
  - src/config.rs: Removed AUTHORIZED_HOTKEY constant and worker_api_key field. Added bittensor_netuid, min_validator_stake_tao, validator_refresh_secs, consensus_threshold, consensus_ttl_secs with env var support and sensible defaults. Updated banner output.
  - src/handlers.rs: Added ValidatorWhitelist and ConsensusManager to AppState. submit_batch now: checks whitelist non-empty (503), validates against whitelist, computes SHA-256 of archive, records consensus vote, returns 202 with pending status or triggers evaluation on consensus. Moved active batch check to consensus-reached branch only.
  - src/main.rs: Added module declarations, creates ValidatorWhitelist and ConsensusManager, spawns background refresh and reaper tasks.
  - Cargo.toml: Added bittensor-rs git dependency and mandatory [patch.crates-io] for w3f-bls.
  - Dockerfile: Added protobuf-compiler, cmake, clang, mold build deps for bittensor-rs substrate dependencies. Copies .cargo config.
  - AGENTS.md and src/AGENTS.md: Updated data flow, module map, env vars, authentication docs to reflect new architecture.

  BREAKING CHANGE: WORKER_API_KEY env var and X-Api-Key header no longer required. All validators on Bittensor netuid 100 with sufficient stake are auto-whitelisted.

* ci: trigger CI run

* fix(security): address auth bypass, input validation, and config issues
  - Move nonce consumption AFTER signature verification in verify_request() to prevent attackers from burning legitimate nonces via invalid signatures
  - Fix TOCTOU race in NonceStore::check_and_insert() using atomic DashMap entry API instead of separate contains_key + insert
  - Add input length limits for auth headers (hotkey 128B, nonce 256B, signature 256B) to prevent memory exhaustion via oversized values
  - Add consensus_threshold validation in Config::from_env() — must be in range (0.0, 1.0], panics at startup if invalid
  - Add saturating conversion for consensus required calculation to prevent integer overflow on f64→usize cast
  - Add tests for all security fixes

* fix(dead-code): remove orphaned default_concurrent fn and unnecessary allow(dead_code)

* fix: code quality issues in bittensor validator consensus
  - Extract magic number 100 to configurable MAX_PENDING_CONSENSUS
  - Restore #[allow(dead_code)] on DEFAULT_MAX_OUTPUT_BYTES constant
  - Use anyhow::Context instead of map_err(anyhow::anyhow!) in validator_whitelist

* fix(security): address race condition, config panic, SS58 checksum, and container security
  - consensus.rs: Fix TOCTOU race condition in record_vote by using DashMap entry API (remove_entry) to atomically check votes and remove entry while holding the shard lock, preventing concurrent threads from inserting votes between drop and remove
  - config.rs: Replace assert! with proper Result<Self, String> return from Config::from_env() to avoid panicking in production on invalid CONSENSUS_THRESHOLD values
  - main.rs: Update Config::from_env() call to handle Result with expect
  - auth.rs: Add SS58 checksum verification using Blake2b-512 (correct Substrate algorithm) in ss58_to_public_key_bytes to reject addresses with corrupted checksums; previously only decoded base58 without validating the 2-byte checksum suffix
  - Dockerfile: Add non-root executor user for container runtime security

* fix(dead-code): remove unused max_output_bytes config field and constant

  Remove DEFAULT_MAX_OUTPUT_BYTES constant and max_output_bytes Config field that were defined and populated from env but never read anywhere outside config.rs. Both had #[allow(dead_code)] annotations suppressing warnings.

* fix(quality): replace expect/unwrap with proper error handling, extract magic numbers to constants
  - main.rs: Replace .expect() on Config::from_env() with match + tracing::error! + process::exit(1)
  - validator_whitelist.rs: Extract retry count (3) and backoff base (2) to named constants
  - validator_whitelist.rs: Replace unwrap_or_else on Option with if-let pattern
  - consensus.rs: Extract reaper interval (30s) to REAPER_INTERVAL_SECS constant

* fix(security): address multiple security vulnerabilities in PR files
  - consensus.rs: Remove archive_data storage from PendingConsensus to prevent memory exhaustion (up to 50GB with 100 pending × 500MB each). Callers now use their own archive bytes since all votes for the same hash have identical data.
  - handlers.rs: Stream multipart upload with per-chunk size enforcement instead of buffering entire archive before checking size limit. Sanitize error messages to not leak internal details (file paths, extraction errors) to clients; log details server-side instead.
  - auth.rs: Add nonce format validation requiring non-empty printable ASCII characters (defense-in-depth against log injection and empty nonce edge cases).
  - main.rs: Replace .unwrap() on TcpListener::bind and axum::serve with proper error logging and process::exit per AGENTS.md rules.
  - ws.rs: Replace .unwrap() on serde_json::to_string with unwrap_or_default() to comply with AGENTS.md no-unwrap rule.

* fix(dead-code): rename misleading underscore-prefixed variable in consensus

* fix(quality): replace unwrap/expect with proper error handling in production code
  - main.rs:21: Replace .parse().unwrap() on tracing directive with unwrap_or_else fallback to INFO level directive
  - main.rs:36: Replace .expect() on workspace dir creation with error log + process::exit(1) pattern
  - main.rs:110: Replace .expect() on ctrl_c handler with if-let-Err that logs and returns gracefully
  - executor.rs:189: Replace semaphore.acquire().unwrap() with match that handles closed semaphore by creating a failed TaskResult

  All changes follow AGENTS.md rule: no .unwrap()/.expect() in production code paths. Test code is unchanged.

* docs: refresh AGENTS.md
1 parent 873cfbf commit a573ad0
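
The consensus rule in the commit message (duplicate-vote detection, a required-vote count derived from the threshold, and removal of the entry once consensus is reached) can be illustrated with a minimal single-threaded sketch. It is not the code in `src/consensus.rs`: it assumes a plain `HashMap` in place of `DashMap`, omits TTLs and the pending-entry cap, and the names `required_votes`, `ConsensusSketch`, and `VoteOutcome` are hypothetical. Rounding the required count up and clamping it to at least 1 are also assumptions.

```rust
use std::collections::{HashMap, HashSet};

/// Votes needed for consensus, for a threshold in (0.0, 1.0].
/// Assumption: the product is rounded up and clamped to at least 1.
/// Rust's float-to-integer `as` cast saturates instead of overflowing.
fn required_votes(whitelist_size: usize, threshold: f64) -> usize {
    let raw = (whitelist_size as f64 * threshold).ceil();
    (raw as usize).max(1)
}

#[derive(Debug, PartialEq)]
enum VoteOutcome {
    /// Vote recorded, but fewer than `required` validators agree so far.
    Pending { votes: usize, required: usize },
    /// Enough validators submitted the same archive hash.
    Reached,
    /// This hotkey already voted for this archive hash.
    Duplicate,
}

/// Hypothetical stand-in for the pending-vote table:
/// archive SHA-256 hex digest -> set of hotkeys that voted for it.
#[derive(Default)]
struct ConsensusSketch {
    pending: HashMap<String, HashSet<String>>,
}

impl ConsensusSketch {
    fn record_vote(&mut self, archive_hash: &str, hotkey: &str, required: usize) -> VoteOutcome {
        let voters = self.pending.entry(archive_hash.to_string()).or_default();
        // `insert` returns false when the hotkey is already in the set.
        if !voters.insert(hotkey.to_string()) {
            return VoteOutcome::Duplicate;
        }
        let votes = voters.len();
        if votes >= required {
            // Drop the entry so the same hash cannot trigger a second evaluation.
            self.pending.remove(archive_hash);
            VoteOutcome::Reached
        } else {
            VoteOutcome::Pending { votes, required }
        }
    }
}
```

With four whitelisted validators and a threshold of 0.5, `required_votes` gives 2, so the second distinct hotkey voting for a hash yields `VoteOutcome::Reached`. The concurrent version described in the commit performs the check and the removal under one `DashMap` entry lock. This sketch sidesteps that by taking `&mut self`.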

13 files changed: +6334 -1293 lines changed

AGENTS.md

Lines changed: 52 additions & 23 deletions
@@ -11,29 +11,51 @@ This is a **single-crate Rust binary** (`term-executor`) built with Axum. There
 ### Data Flow
 
 ```
-Client → POST /submit (multipart archive) → term-executor
-1. Authenticate via X-Hotkey, X-Nonce, X-Signature, X-Api-Key headers
-2. Extract uploaded archive (zip/tar.gz) containing tasks/ and agent_code/
-3. Parse each task: workspace.yaml, prompt.md, tests/
-4. For each task (concurrently, up to limit):
-a. git clone the target repository at base_commit
-b. Run install commands (pip install, etc.)
-c. Write & execute agent code in the repo
-d. Write test source files into the repo
-e. Run test scripts (bash), collect exit codes
-5. Aggregate results (reward per task, aggregate reward)
-6. Stream progress via WebSocket (GET /ws?batch_id=...)
-7. Return results via GET /batch/{id}
+Validator → POST /submit (multipart archive) → term-executor
+1. Authenticate via X-Hotkey, X-Nonce, X-Signature headers
+2. Verify hotkey is in the dynamic validator whitelist (Bittensor netuid 100, >10k TAO stake)
+3. Compute SHA-256 hash of archive bytes
+4. Record vote in ConsensusManager
+5. If <50% of whitelisted validators have voted for this hash:
+→ Return 202 Accepted with pending_consensus status
+6. If ≥50% consensus reached:
+a. Extract uploaded archive (zip/tar.gz) containing tasks/ and agent_code/
+b. Parse each task: workspace.yaml, prompt.md, tests/
+c. For each task (concurrently, up to limit):
+i. git clone the target repository at base_commit
+ii. Run install commands (pip install, etc.)
+iii. Write & execute agent code in the repo
+iv. Write test source files into the repo
+v. Run test scripts (bash), collect exit codes
+d. Aggregate results (reward per task, aggregate reward)
+e. Stream progress via WebSocket (GET /ws?batch_id=...)
+f. Return results via GET /batch/{id}
+```
+
+### Background Tasks
+
+```
+ValidatorWhitelist refresh loop (every 5 minutes):
+1. Connect to Bittensor subtensor via BittensorClient::with_failover()
+2. Sync metagraph for netuid 100
+3. Filter validators: validator_permit && active && stake >= 10,000 TAO
+4. Atomically replace whitelist with new set of SS58 hotkeys
+5. On failure: retry up to 3 times with exponential backoff, keep cached whitelist
+
+ConsensusManager reaper loop (every 30 seconds):
+1. Remove pending consensus entries older than TTL (default 60s)
 ```
 
 ### Module Map
 
 | File | Responsibility |
 |---|---|
-| `src/main.rs` | Entry point — bootstraps config, session manager, executor, Axum server, reaper tasks |
-| `src/config.rs` | `Config` struct loaded from environment variables with defaults; `AUTHORIZED_HOTKEY` constant |
+| `src/main.rs` | Entry point — bootstraps config, session manager, executor, validator whitelist, consensus manager, Axum server, background tasks |
+| `src/config.rs` | `Config` struct loaded from environment variables with defaults; Bittensor and consensus configuration |
 | `src/handlers.rs` | Axum route handlers: `/health`, `/status`, `/metrics`, `/submit`, `/batch/{id}`, `/batch/{id}/tasks`, `/batch/{id}/task/{task_id}`, `/batches` |
-| `src/auth.rs` | Authentication: `extract_auth_headers()`, `verify_request()`, `validate_ss58()`, sr25519 signature verification via `verify_sr25519_signature()`, `NonceStore` for replay protection, `AuthHeaders`/`AuthError` types |
+| `src/auth.rs` | Authentication: `extract_auth_headers()`, `verify_request()` (whitelist-based), `validate_ss58()`, sr25519 signature verification via `verify_sr25519_signature()`, SS58 checksum via `blake2`, `NonceStore` for replay protection, `AuthHeaders`/`AuthError` types |
+| `src/validator_whitelist.rs` | Dynamic validator whitelist — fetches validators from Bittensor netuid 100 every 5 minutes, filters by stake ≥10k TAO, stores SS58 hotkeys in `parking_lot::RwLock<HashSet>` |
+| `src/consensus.rs` | 50% consensus manager — tracks pending votes per archive hash in `DashMap`, triggers evaluation when ≥50% of whitelisted validators submit same payload, TTL reaper for expired entries |
 | `src/executor.rs` | Core evaluation engine — spawns batch tasks that clone repos, run agents, run tests concurrently |
 | `src/session.rs` | `SessionManager` with `DashMap`, `Batch`, `BatchResult`, `TaskResult`, `BatchStatus`, `TaskStatus`, `WsEvent` types |
 | `src/task.rs` | Archive extraction (zip/tar.gz), task directory parsing, agent code loading, language detection |
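
The whitelist refresh behavior documented in the hunk above (up to 3 attempts with exponential backoff, replace the set on success, keep the cached set on failure) can be sketched as follows. This is an illustration under stated assumptions, not the contents of `src/validator_whitelist.rs`: it uses `std::sync::RwLock` rather than `parking_lot::RwLock`, runs synchronously, and takes the metagraph fetch as a caller-supplied closure so that no `bittensor-rs` API is assumed. `WhitelistSketch`, `refresh`, and the constant names are hypothetical.

```rust
use std::collections::HashSet;
use std::sync::RwLock;
use std::thread::sleep;
use std::time::Duration;

/// Values taken from the commit message (3 retries, backoff base 2).
const MAX_RETRIES: u32 = 3;
const BACKOFF_BASE: u32 = 2;

/// Hypothetical stand-in for the validator whitelist.
struct WhitelistSketch {
    hotkeys: RwLock<HashSet<String>>,
}

impl WhitelistSketch {
    /// Starts empty, matching the "rejects until first successful sync" behavior.
    fn new() -> Self {
        Self { hotkeys: RwLock::new(HashSet::new()) }
    }

    fn is_whitelisted(&self, hotkey: &str) -> bool {
        match self.hotkeys.read() {
            Ok(set) => set.contains(hotkey),
            Err(_) => false, // a poisoned lock is treated as "not whitelisted"
        }
    }

    /// Calls `fetch` up to MAX_RETRIES times, sleeping
    /// `backoff_unit * BACKOFF_BASE^attempt` between failed attempts.
    /// Returns true if the whitelist was replaced; on total failure the
    /// previously cached set is left untouched.
    fn refresh<F>(&self, mut fetch: F, backoff_unit: Duration) -> bool
    where
        F: FnMut() -> Result<HashSet<String>, String>,
    {
        for attempt in 0..MAX_RETRIES {
            match fetch() {
                Ok(new_set) => {
                    return match self.hotkeys.write() {
                        Ok(mut guard) => {
                            // Swap the whole set in one assignment under the write lock.
                            *guard = new_set;
                            true
                        }
                        Err(_) => false,
                    };
                }
                Err(_) if attempt + 1 < MAX_RETRIES => {
                    sleep(backoff_unit * BACKOFF_BASE.pow(attempt));
                }
                Err(_) => {}
            }
        }
        false
    }
}
```

Passing the fetch as a closure keeps the retry policy testable without a network connection. A real refresh loop would call this on a timer and filter the metagraph by permit, activity, and stake before handing the set over.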
@@ -43,8 +65,10 @@ Client → POST /submit (multipart archive) → term-executor
 
 ### Key Shared State (via `Arc`)
 
-- `AppState` (in `handlers.rs`) holds `Config`, `SessionManager`, `Metrics`, `Executor`, `NonceStore`, `started_at`
+- `AppState` (in `handlers.rs`) holds `Config`, `SessionManager`, `Metrics`, `Executor`, `NonceStore`, `started_at`, `ValidatorWhitelist`, `ConsensusManager`
 - `SessionManager` uses `DashMap<String, Arc<Batch>>` for lock-free concurrent access
+- `ValidatorWhitelist` uses `parking_lot::RwLock<HashSet<String>>` for concurrent read access with rare writes
+- `ConsensusManager` uses `DashMap<String, PendingConsensus>` for lock-free concurrent vote tracking
 - Per-batch `Semaphore` in `executor.rs` controls concurrent tasks within a batch (configurable, default: 8)
 - `broadcast::Sender<WsEvent>` per batch for WebSocket event streaming
 
@@ -59,7 +83,8 @@ Client → POST /submit (multipart archive) → term-executor
 - **Archive Handling**: `flate2` + `tar` (tar.gz), `zip` 2 (zip)
 - **Error Handling**: `anyhow` 1 + `thiserror` 2
 - **Logging**: `tracing` + `tracing-subscriber` with env-filter
-- **Crypto/Identity**: `sha2`, `hex`, `base64`, `bs58` (SS58 address validation), `schnorrkel` 0.11 (sr25519 signature verification), `rand_core` 0.6, `uuid` v4
+- **Crypto/Identity**: `sha2`, `hex`, `base64`, `bs58` (SS58 address validation), `schnorrkel` 0.11 (sr25519 signature verification), `blake2` 0.10 (SS58 checksum), `rand_core` 0.6, `uuid` v4
+- **Blockchain**: `bittensor-rs` (git dependency) for Bittensor validator whitelisting via subtensor RPC
 - **Time**: `chrono` with serde support
 - **Build Tooling**: `mold` linker via `.cargo/config.toml`, `clang` as linker driver
 - **Container**: Multi-stage Dockerfile — `rust:1.93-slim-bookworm` builder → `debian:bookworm-slim` runtime (includes python3, pip, venv, build-essential, git, curl)
@@ -71,13 +96,13 @@ Client → POST /submit (multipart archive) → term-executor
 
 2. **All clippy warnings are errors.** Run `cargo +nightly clippy --all-targets -- -D warnings` locally. CI runs the same command and will fail on any warning.
 
-3. **Never expose secrets in logs or responses.** The `AUTHORIZED_HOTKEY` in `src/config.rs` is the only authorized SS58 hotkey. Auth failures log only the rejection, never the submitted hotkey value. Follow this pattern for any new secrets.
+3. **Never expose secrets in logs or responses.** Auth failures log only the rejection, never the submitted hotkey value. Follow this pattern for any new secrets.
 
 4. **All process execution MUST have timeouts.** Every call to `run_cmd`/`run_shell` in `src/executor.rs` takes a `Duration` timeout. Never spawn a child process without a timeout — agent code is untrusted and may hang forever.
 
 5. **Output MUST be truncated.** The `truncate_output()` function in `src/executor.rs` caps output at `MAX_OUTPUT` (1MB). Any new command output capture must use this function to prevent memory exhaustion from malicious agent output.
 
-6. **Shared state must use `Arc` + lock-free structures.** `SessionManager` uses `DashMap` (not `Mutex<HashMap>`). Metrics use `AtomicU64`. New shared state should follow these patterns — never use `std::sync::Mutex` for hot-path data.
+6. **Shared state must use `Arc` + lock-free structures.** `SessionManager` uses `DashMap` (not `Mutex<HashMap>`). Metrics use `AtomicU64`. `ValidatorWhitelist` uses `parking_lot::RwLock`. `ConsensusManager` uses `DashMap`. New shared state should follow these patterns — never use `std::sync::Mutex` for hot-path data.
 
 7. **Semaphore must gate task concurrency.** The per-batch `Semaphore` in `executor.rs` limits concurrent tasks within a batch. The `SessionManager::has_active_batch()` check prevents multiple batches from running simultaneously.
 
@@ -162,10 +187,14 @@ Both hooks are activated via `git config core.hooksPath .githooks`.
 | `AGENT_TIMEOUT_SECS` | `600` | Agent execution timeout |
 | `TEST_TIMEOUT_SECS` | `300` | Test suite timeout |
 | `MAX_ARCHIVE_BYTES` | `524288000` | Max uploaded archive size (500MB) |
-| `MAX_OUTPUT_BYTES` | `1048576` | Max captured output per command (1MB) |
 | `WORKSPACE_BASE` | `/tmp/sessions` | Base directory for session workspaces |
-| `WORKER_API_KEY` | *(required)* | API key that whitelisted hotkeys must provide via `X-Api-Key` header |
+| `BITTENSOR_NETUID` | `100` | Bittensor subnet ID for validator lookup |
+| `MIN_VALIDATOR_STAKE_TAO` | `10000` | Minimum TAO stake for validator whitelisting |
+| `VALIDATOR_REFRESH_SECS` | `300` | Interval for refreshing validator whitelist (seconds) |
+| `CONSENSUS_THRESHOLD` | `0.5` | Fraction of validators required for consensus (0.0–1.0) |
+| `CONSENSUS_TTL_SECS` | `60` | TTL for pending consensus entries (seconds) |
+| `MAX_PENDING_CONSENSUS` | `100` | Maximum number of pending consensus entries |
 
 ## Authentication
 
-Authentication requires four HTTP headers: `X-Hotkey` (SS58 address), `X-Nonce` (unique per-request), `X-Signature` (sr25519 hex signature of `hotkey + nonce`), and `X-Api-Key`. The authorized hotkey is hardcoded as `AUTHORIZED_HOTKEY` in `src/config.rs`. The API key is configured via the `WORKER_API_KEY` environment variable (required). Verification steps: hotkey must match `AUTHORIZED_HOTKEY`, API key must match, SS58 format must be valid, nonce must not have been seen before (replay protection via `NonceStore` in `src/auth.rs` with 5-minute TTL), and the sr25519 signature must verify against the hotkey's public key using the Substrate signing context. Only requests passing all checks can submit batches via `POST /submit`. All other endpoints are open.
+Authentication requires three HTTP headers: `X-Hotkey` (SS58 address), `X-Nonce` (unique per-request), and `X-Signature` (sr25519 hex signature of `hotkey + nonce`). The authorized hotkeys are dynamically loaded from the Bittensor blockchain — all validators on netuid 100 with ≥10,000 TAO stake and an active validator permit are whitelisted. The whitelist refreshes every 5 minutes. Verification steps (in order): hotkey must be in the validator whitelist, SS58 format must be valid, sr25519 signature must verify against the hotkey's public key using the Substrate signing context, and finally the nonce must not have been seen before (replay protection via `NonceStore` in `src/auth.rs` with 5-minute TTL — nonce is only consumed after signature passes). Only requests passing all checks can submit batches via `POST /submit`. Evaluations are only triggered when ≥50% of whitelisted validators have submitted the same archive payload (identified by SHA-256 hash). All other endpoints are open.
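
The verification order in the updated Authentication paragraph (whitelist, SS58 format, signature, then nonce) matters because a request with a bad signature must not consume a nonce. The sketch below shows only that ordering. It is not the real `verify_request()`: the SS58 check and the sr25519 verification are passed in as closures so that no `schnorrkel` or `bs58` call is assumed, the nonce store has no TTL, and `verify_request_sketch`, `NonceStoreSketch`, and this `AuthError` enum are hypothetical.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum AuthError {
    NotWhitelisted,
    InvalidHotkey,
    InvalidSignature,
    NonceReused,
}

/// Hypothetical nonce store without expiry; the documented one has a 5-minute TTL.
#[derive(Default)]
struct NonceStoreSketch {
    seen: HashSet<String>,
}

impl NonceStoreSketch {
    /// Single check-and-insert step: returns true only the first time a nonce is seen.
    fn check_and_insert(&mut self, nonce: &str) -> bool {
        self.seen.insert(nonce.to_string())
    }
}

/// Checks run in the documented order; the nonce is consumed last.
fn verify_request_sketch(
    hotkey: &str,
    nonce: &str,
    signature: &str,
    whitelist: &HashSet<String>,
    nonces: &mut NonceStoreSketch,
    is_valid_ss58: impl Fn(&str) -> bool,
    verify_sig: impl Fn(&str, &str, &str) -> bool,
) -> Result<(), AuthError> {
    if !whitelist.contains(hotkey) {
        return Err(AuthError::NotWhitelisted);
    }
    if !is_valid_ss58(hotkey) {
        return Err(AuthError::InvalidHotkey);
    }
    // The signed message is `hotkey + nonce`, as stated in the paragraph above.
    let message = format!("{hotkey}{nonce}");
    if !verify_sig(hotkey, &message, signature) {
        return Err(AuthError::InvalidSignature);
    }
    // Only a request with a valid signature reaches this point, so an
    // attacker cannot burn someone else's nonce with a forged request.
    if !nonces.check_and_insert(nonce) {
        return Err(AuthError::NonceReused);
    }
    Ok(())
}
```

If the nonce check ran before the signature check, a forged request carrying a victim's hotkey and a guessed nonce would mark that nonce as used, and the victim's legitimate request would then be rejected as a replay.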
