Commit a573ad0
feat(auth): replace static hotkey/API-key auth with Bittensor validator whitelisting and 50% consensus (#5)
* feat(auth): replace static hotkey/API-key auth with Bittensor validator whitelisting and 50% consensus
Integrate dynamic validator whitelisting from Bittensor netuid 100 and
consensus-based evaluation triggering, replacing the previous single
AUTHORIZED_HOTKEY + WORKER_API_KEY authentication system.
Authentication now uses a dynamic whitelist of validators fetched every
5 minutes from the Bittensor blockchain via bittensor-rs. Validators
must have validator_permit, be active, and have >=10,000 TAO stake.
POST /submit requests only trigger evaluations when >=50% of whitelisted
validators submit the same archive payload (identified by SHA-256 hash).
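The eligibility rule above can be sketched as a simple filter. This is a minimal illustration, not the code from src/validator_whitelist.rs: the `NeuronInfo` struct, its field names, and the `eligible_hotkeys` helper are hypothetical stand-ins, since the bittensor-rs types are not shown in this commit message.

```rust
/// Hypothetical, simplified view of one metagraph entry (assumption:
/// the real bittensor-rs types differ and are not shown here).
#[derive(Debug, Clone)]
pub struct NeuronInfo {
    pub hotkey: String,
    pub validator_permit: bool,
    pub active: bool,
    pub stake_tao: f64,
}

/// Stake floor taken from the commit message (>= 10,000 TAO).
pub const MIN_VALIDATOR_STAKE_TAO: f64 = 10_000.0;

/// Returns the hotkeys that satisfy all three whitelist criteria:
/// validator permit, active flag, and minimum stake.
pub fn eligible_hotkeys(neurons: &[NeuronInfo], min_stake_tao: f64) -> Vec<String> {
    neurons
        .iter()
        .filter(|n| n.validator_permit && n.active && n.stake_tao >= min_stake_tao)
        .map(|n| n.hotkey.clone())
        .collect()
}
```

A neuron with exactly 10,000 TAO passes the filter, because the comparison is inclusive.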
New modules:
- src/validator_whitelist.rs: ValidatorWhitelist with parking_lot::RwLock,
background refresh loop with 3-retry exponential backoff, connection
resilience (keeps cached whitelist on failure), starts empty and rejects
requests with 503 until first successful sync
- src/consensus.rs: ConsensusManager using DashMap for lock-free vote
tracking, PendingConsensus entries with TTL (default 60s), reaper loop
every 30s, max 100 pending entries cap, duplicate vote detection
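The vote-tracking behaviour described for src/consensus.rs (TTL, pending cap, duplicate detection) might look roughly like the sketch below. To stay dependency-free it swaps DashMap for a std `Mutex<HashMap>`, so it is not lock-free like the real module. The names `ConsensusSketch` and `VoteOutcome` are assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical result type; the real module's return type is not shown.
#[derive(Debug, PartialEq, Eq)]
pub enum VoteOutcome {
    Pending { votes: usize, required: usize },
    Reached,
    Duplicate,
    AtCapacity,
}

struct Pending {
    voters: HashSet<String>,
    created: Instant,
}

/// Std-only stand-in for ConsensusManager (assumption: Mutex<HashMap>
/// replaces the DashMap used in the actual code).
pub struct ConsensusSketch {
    pending: Mutex<HashMap<String, Pending>>,
    ttl: Duration,
    max_pending: usize,
}

impl ConsensusSketch {
    pub fn new(ttl: Duration, max_pending: usize) -> Self {
        Self { pending: Mutex::new(HashMap::new()), ttl, max_pending }
    }

    /// Records one validator's vote for an archive hash.
    pub fn record_vote(&self, archive_hash: &str, hotkey: &str, required: usize) -> VoteOutcome {
        let mut map = match self.pending.lock() {
            Ok(guard) => guard,
            Err(poisoned) => poisoned.into_inner(),
        };
        // Refuse to open a new entry once the cap is reached.
        if !map.contains_key(archive_hash) && map.len() >= self.max_pending {
            return VoteOutcome::AtCapacity;
        }
        let entry = map.entry(archive_hash.to_string()).or_insert_with(|| Pending {
            voters: HashSet::new(),
            created: Instant::now(),
        });
        // HashSet::insert returns false when the hotkey already voted.
        if !entry.voters.insert(hotkey.to_string()) {
            return VoteOutcome::Duplicate;
        }
        let votes = entry.voters.len();
        if votes >= required {
            // Check and removal happen under the same lock.
            map.remove(archive_hash);
            return VoteOutcome::Reached;
        }
        VoteOutcome::Pending { votes, required }
    }

    /// Drops entries older than the TTL; returns how many were removed.
    pub fn reap_expired(&self) -> usize {
        let mut map = match self.pending.lock() {
            Ok(guard) => guard,
            Err(poisoned) => poisoned.into_inner(),
        };
        let ttl = self.ttl;
        let before = map.len();
        map.retain(|_, p| p.created.elapsed() < ttl);
        before - map.len()
    }
}
```

With the defaults from the commit message, this would be constructed as `ConsensusSketch::new(Duration::from_secs(60), 100)`, with `reap_expired` called from a background loop every 30 seconds.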
Modified modules:
- src/auth.rs: Removed AUTHORIZED_HOTKEY import, api_key field from
AuthHeaders, X-Api-Key header extraction, InvalidApiKey error variant.
verify_request() now takes &ValidatorWhitelist instead of API key string.
Updated all tests accordingly.
- src/config.rs: Removed AUTHORIZED_HOTKEY constant and worker_api_key
field. Added bittensor_netuid, min_validator_stake_tao,
validator_refresh_secs, consensus_threshold, consensus_ttl_secs with
env var support and sensible defaults. Updated banner output.
- src/handlers.rs: Added ValidatorWhitelist and ConsensusManager to
AppState. submit_batch now: checks whitelist non-empty (503), validates
against whitelist, computes SHA-256 of archive, records consensus vote,
returns 202 with pending status or triggers evaluation on consensus.
Moved active batch check to consensus-reached branch only.
- src/main.rs: Added module declarations, creates ValidatorWhitelist and
ConsensusManager, spawns background refresh and reaper tasks.
- Cargo.toml: Added bittensor-rs git dependency and mandatory
[patch.crates-io] for w3f-bls.
- Dockerfile: Added protobuf-compiler, cmake, clang, mold build deps
for bittensor-rs substrate dependencies. Copies .cargo config.
- AGENTS.md and src/AGENTS.md: Updated data flow, module map, env vars,
authentication docs to reflect new architecture.
BREAKING CHANGE: The WORKER_API_KEY env var and X-Api-Key header are no longer used.
All validators on Bittensor netuid 100 that hold a validator permit, are active,
and have sufficient stake are auto-whitelisted.
* ci: trigger CI run
* fix(security): address auth bypass, input validation, and config issues
- Move nonce consumption AFTER signature verification in verify_request()
to prevent attackers from burning legitimate nonces via invalid signatures
- Fix TOCTOU race in NonceStore::check_and_insert() using atomic DashMap
entry API instead of separate contains_key + insert
- Add input length limits for auth headers (hotkey 128B, nonce 256B,
signature 256B) to prevent memory exhaustion via oversized values
- Add consensus_threshold validation in Config::from_env() — must be
in range (0.0, 1.0], panics at startup if invalid
- Add saturating conversion for consensus required calculation to prevent
integer overflow on f64→usize cast
- Add tests for all security fixes
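The ordering and atomicity fixes in this list can be illustrated with a small std-only sketch. It is not the code from src/auth.rs: `verify_request_sketch`, the `AuthError` variants, and the `signature_ok` callback are hypothetical, and a `Mutex<HashSet>` stands in for the DashMap entry API. Only the three length limits come from the commit message.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

// Header length limits from the commit message.
pub const MAX_HOTKEY_LEN: usize = 128;
pub const MAX_NONCE_LEN: usize = 256;
pub const MAX_SIGNATURE_LEN: usize = 256;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    HeaderTooLong(&'static str),
    InvalidSignature,
    NonceReused,
}

#[derive(Default)]
pub struct NonceStore {
    seen: Mutex<HashSet<String>>,
}

impl NonceStore {
    /// Returns true only the first time a nonce is seen. The check and
    /// the insert are a single HashSet::insert call under one lock, so
    /// there is no gap between "contains" and "insert".
    pub fn check_and_insert(&self, nonce: &str) -> bool {
        let mut seen = match self.seen.lock() {
            Ok(guard) => guard,
            Err(poisoned) => poisoned.into_inner(),
        };
        seen.insert(nonce.to_string())
    }
}

/// Length checks first, then the signature, and only then the nonce,
/// so an invalid signature can never burn a legitimate nonce.
pub fn verify_request_sketch<F>(
    store: &NonceStore,
    hotkey: &str,
    nonce: &str,
    signature: &str,
    signature_ok: F,
) -> Result<(), AuthError>
where
    F: FnOnce(&str, &str, &str) -> bool,
{
    if hotkey.len() > MAX_HOTKEY_LEN {
        return Err(AuthError::HeaderTooLong("hotkey"));
    }
    if nonce.len() > MAX_NONCE_LEN {
        return Err(AuthError::HeaderTooLong("nonce"));
    }
    if signature.len() > MAX_SIGNATURE_LEN {
        return Err(AuthError::HeaderTooLong("signature"));
    }
    if !signature_ok(hotkey, nonce, signature) {
        return Err(AuthError::InvalidSignature);
    }
    if !store.check_and_insert(nonce) {
        return Err(AuthError::NonceReused);
    }
    Ok(())
}
```

In this sketch, a request with a bad signature leaves the nonce store untouched, so the legitimate owner of that nonce can still use it afterwards.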
* fix(dead-code): remove orphaned default_concurrent fn and unnecessary allow(dead_code)
* fix: code quality issues in bittensor validator consensus
- Extract magic number 100 to configurable MAX_PENDING_CONSENSUS
- Restore #[allow(dead_code)] on DEFAULT_MAX_OUTPUT_BYTES constant
- Use anyhow::Context instead of map_err(anyhow::anyhow!) in validator_whitelist
* fix(security): address race condition, config panic, SS58 checksum, and container security
- consensus.rs: Fix TOCTOU race condition in record_vote by using
DashMap entry API (remove_entry) to atomically check votes and remove
entry while holding the shard lock, preventing concurrent threads from
inserting votes between drop and remove
- config.rs: Replace assert! with proper Result<Self, String> return
from Config::from_env() to avoid panicking in production on invalid
CONSENSUS_THRESHOLD values
- main.rs: Update Config::from_env() call to handle Result with expect
- auth.rs: Add SS58 checksum verification using Blake2b-512 (correct
Substrate algorithm) in ss58_to_public_key_bytes to reject addresses
with corrupted checksums; previously only decoded base58 without
validating the 2-byte checksum suffix
- Dockerfile: Add non-root executor user for container runtime security
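The Result-based validation of CONSENSUS_THRESHOLD, together with the saturating vote-count conversion mentioned in an earlier fix, could be sketched as follows. The function names, the 0.5 default (inferred from the "50% consensus" wording), and the one-vote floor are assumptions, not the actual Config::from_env() code.

```rust
/// Assumed default, inferred from the "50% consensus" wording.
pub const DEFAULT_CONSENSUS_THRESHOLD: f64 = 0.5;

/// Parses an optional raw env value; returns Err instead of panicking
/// when the value is not a number or lies outside (0.0, 1.0].
pub fn parse_consensus_threshold(raw: Option<&str>) -> Result<f64, String> {
    let Some(text) = raw else {
        return Ok(DEFAULT_CONSENSUS_THRESHOLD);
    };
    let value = text
        .trim()
        .parse::<f64>()
        .map_err(|e| format!("CONSENSUS_THRESHOLD is not a number: {e}"))?;
    // NaN fails both comparisons and is rejected here.
    if value > 0.0 && value <= 1.0 {
        Ok(value)
    } else {
        Err(format!("CONSENSUS_THRESHOLD must be in (0.0, 1.0], got {value}"))
    }
}

/// Number of matching votes needed, rounded up, with an explicit
/// saturating f64 -> usize conversion. Assumption: at least one vote
/// is always required.
pub fn required_votes(whitelist_len: usize, threshold: f64) -> usize {
    let raw = (whitelist_len as f64 * threshold).ceil();
    if raw.is_nan() || raw < 1.0 {
        return 1;
    }
    if raw >= usize::MAX as f64 {
        usize::MAX
    } else {
        raw as usize
    }
}
```

A caller would typically feed it `std::env::var("CONSENSUS_THRESHOLD").ok().as_deref()` and surface the `Err` string at startup rather than panicking.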
* fix(dead-code): remove unused max_output_bytes config field and constant
Remove DEFAULT_MAX_OUTPUT_BYTES constant and max_output_bytes Config field
that were defined and populated from env but never read anywhere outside
config.rs. Both had #[allow(dead_code)] annotations suppressing warnings.
* fix(quality): replace expect/unwrap with proper error handling, extract magic numbers to constants
- main.rs: Replace .expect() on Config::from_env() with match + tracing::error! + process::exit(1)
- validator_whitelist.rs: Extract retry count (3) and backoff base (2) to named constants
- validator_whitelist.rs: Replace unwrap_or_else on Option with if-let pattern
- consensus.rs: Extract reaper interval (30s) to REAPER_INTERVAL_SECS constant
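As an illustration of the named retry constants, the sketch below shows one way a 3-attempt exponential backoff could be written. It assumes "3-retry" means three attempts in total and that delays are measured in seconds. The constant names are hypothetical, and the synchronous `sleep` callback stands in for whatever async timer the real refresh loop uses.

```rust
use std::time::Duration;

/// Assumed constant names; the commit message only gives the values 3 and 2.
pub const MAX_RETRIES: u32 = 3;
pub const BACKOFF_BASE_SECS: u64 = 2;

/// Delay before the next attempt: base^attempt seconds (2s, 4s, ...).
pub fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(BACKOFF_BASE_SECS.saturating_pow(attempt))
}

/// Runs `op` up to MAX_RETRIES times, sleeping between failures.
/// Returns the first success or the last error.
pub fn retry_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt: u32 = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller,
            // which can keep serving the cached whitelist.
            Err(e) if attempt + 1 >= MAX_RETRIES => return Err(e),
            Err(_) => {
                attempt += 1;
                sleep(backoff_delay(attempt));
            }
        }
    }
}
```

Passing the sleep function as a parameter keeps the sketch testable without real waiting; production code would pass `std::thread::sleep` or an async equivalent.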
* fix(security): address multiple security vulnerabilities in PR files
- consensus.rs: Remove archive_data storage from PendingConsensus to
prevent memory exhaustion (up to 50GB with 100 pending × 500MB each).
Callers now use their own archive bytes since all votes for the same
hash have identical data.
- handlers.rs: Stream multipart upload with per-chunk size enforcement
instead of buffering entire archive before checking size limit.
Sanitize error messages to not leak internal details (file paths,
extraction errors) to clients; log details server-side instead.
- auth.rs: Add nonce format validation requiring non-empty printable
ASCII characters (defense-in-depth against log injection and empty
nonce edge cases).
- main.rs: Replace .unwrap() on TcpListener::bind and axum::serve with
proper error logging and process::exit per AGENTS.md rules.
- ws.rs: Replace .unwrap() on serde_json::to_string with
unwrap_or_default() to comply with AGENTS.md no-unwrap rule.
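Two of the fixes above, per-chunk size enforcement and nonce format validation, can be sketched with std alone. Both helper names are hypothetical. The chunk iterator stands in for the multipart stream in src/handlers.rs. Whether the real nonce check accepts spaces is not stated, so this sketch assumes it does not.

```rust
/// Accumulates chunks while enforcing the size limit on every chunk,
/// so an oversized upload is rejected before it is fully buffered.
pub fn collect_with_limit<I, C>(chunks: I, max_bytes: usize) -> Result<Vec<u8>, String>
where
    I: IntoIterator<Item = C>,
    C: AsRef<[u8]>,
{
    let mut buf = Vec::new();
    for chunk in chunks {
        let chunk = chunk.as_ref();
        // buf.len() never exceeds max_bytes, so the subtraction cannot underflow.
        if chunk.len() > max_bytes - buf.len() {
            // Generic message only; details belong in server-side logs.
            return Err("archive exceeds size limit".to_string());
        }
        buf.extend_from_slice(chunk);
    }
    Ok(buf)
}

/// Non-empty and made only of visible ASCII characters (0x21..=0x7E).
/// Assumption: spaces are rejected along with control characters.
pub fn is_valid_nonce(nonce: &str) -> bool {
    !nonce.is_empty() && nonce.bytes().all(|b| b.is_ascii_graphic())
}
```

Rejecting control characters such as newlines is what provides the log-injection defense: a nonce that passes this check cannot start a forged log line.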
* fix(dead-code): rename misleading underscore-prefixed variable in consensus
* fix(quality): replace unwrap/expect with proper error handling in production code
- main.rs:21: Replace .parse().unwrap() on tracing directive with
unwrap_or_else fallback to INFO level directive
- main.rs:36: Replace .expect() on workspace dir creation with
error log + process::exit(1) pattern
- main.rs:110: Replace .expect() on ctrl_c handler with if-let-Err
that logs and returns gracefully
- executor.rs:189: Replace semaphore.acquire().unwrap() with match
that handles closed semaphore by creating a failed TaskResult
All changes follow AGENTS.md rule: no .unwrap()/.expect() in
production code paths. Test code is unchanged.
* docs: refresh AGENTS.md
1 parent 873cfbf, commit a573ad0
13 files changed: +6334 -1293 lines