| 1 | +# Trustline Manager — Security & Performance Audit |
| 2 | + |
| 3 | +Scope: `src/lib/trustline-manager.js`, `src/routes/trustlines.js`, and database migrations. |
| 4 | + |
| 5 | +## Implementation Status |
| 6 | + |
| 7 | +### Issue #744: Cryptographic Signature Verification ✓ |
| 8 | +**Status**: Complete |
| 9 | + |
| 10 | +**Implementation**: `TrustlineSignatureVerifier` class provides: |
| 11 | +- Multi-signature account verification with threshold checking |
| 12 | +- Ed25519 signature verification using Stellar SDK |
| 13 | +- Transaction operation validation (changeTrust, allowTrust) |
| 14 | +- Asset code and issuer validation |
| 15 | +- 5-minute verification result caching to reduce Horizon API calls |
| 16 | +- Comprehensive error handling with specific failure reasons |
| 17 | + |
| 18 | +**Controls**: |
| 19 | +- All signatures verified against Stellar public keys |
| 20 | +- Transaction operations inspected to ensure trustline-specific operations |
| 21 | +- Asset codes and issuers validated against Stellar standards |
| 22 | +- No cache poisoning: signatures are verified on every cache miss, and only successful results are cached |
| 23 | + |
| 24 | +**Verification Endpoint**: `POST /trustlines/verify/:txHash` |
| 25 | + |
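The caching control above (a 5-minute TTL, with only successful verifications cached) can be illustrated with a minimal sketch. This is not the `TrustlineSignatureVerifier` implementation: the `VerificationCache` class, the `verifyWithCache` helper, and the `{ valid, reason }` result shape are hypothetical names chosen for illustration, and `verifyFn` stands in for the real Horizon-backed verifier.

```javascript
// Hypothetical sketch: a TTL cache that stores only successful verifications.
class VerificationCache {
  constructor(ttlMs = 5 * 60 * 1000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful in tests
    this.entries = new Map(); // txHash -> { result, expiresAt }
  }

  // Returns the cached result, or undefined on a miss or an expired entry.
  get(txHash) {
    const entry = this.entries.get(txHash);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(txHash);
      return undefined;
    }
    return entry.result;
  }

  // Failed verifications are never stored, so a bad result cannot be replayed.
  put(txHash, result) {
    if (!result || result.valid !== true) return;
    this.entries.set(txHash, { result, expiresAt: this.now() + this.ttlMs });
  }
}

// verifyFn is an assumed async function: (txHash) => Promise<{ valid, reason? }>.
async function verifyWithCache(cache, txHash, verifyFn) {
  const hit = cache.get(txHash);
  if (hit) return hit;
  const result = await verifyFn(txHash); // every miss reruns full verification
  cache.put(txHash, result);
  return result;
}
```

Because failures are not cached, a rejected transaction is re-verified on every request, which is one reason the verification endpoint also needs its own rate limit.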
| 26 | +--- |
| 27 | + |
| 28 | +### Issue #743: Rate Limiting ✓ |
| 29 | +**Status**: Complete |
| 30 | + |
| 31 | +**Implementation**: `TrustlineRateLimiter` class provides: |
| 32 | +- Per-merchant, per-API-key, or per-IP rate limiting |
| 33 | +- Trustline operations: 20 ops/5min per merchant |
| 34 | +- Trustline verifications: 50 verifications/5min per merchant |
| 35 | +- Premium/enterprise merchants skip standard rate limits |
| 36 | +- Conventional rate-limit response headers: `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset` |
| 37 | + |
| 38 | +**Controls**: |
| 39 | +- Redis-backed distributed rate limiting |
| 40 | +- Graceful degradation: requests pass through if Redis unavailable |
| 41 | +- Key hierarchy: merchant ID > API key hash > IP address |
| 42 | +- Separate rate limit tracks for operations vs. verifications |
| 43 | + |
| 44 | +**Protected Endpoints**: |
| 45 | +- `POST /trustlines/verify/:txHash` - verification rate limit |
| 46 | +- `POST /trustlines/create` - operation rate limit |
| 47 | +- `GET /trustlines/...` - operation rate limit |
| 48 | + |
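The key hierarchy and the two limit tracks described above can be sketched as follows. This is an illustration under stated assumptions, not the `TrustlineRateLimiter` code: the request fields (`merchantId`, `apiKey`, `ip`), the `rl:` key format, and the fixed-window algorithm are all hypothetical, and an in-memory `Map` stands in for Redis.

```javascript
const crypto = require('node:crypto');

// Limits taken from the audit text; the object layout is an assumption.
const LIMITS = {
  operation: { max: 20, windowMs: 5 * 60 * 1000 },
  verification: { max: 50, windowMs: 5 * 60 * 1000 },
};

// Pick the most specific identity available: merchant ID > API key hash > IP.
function rateLimitKey(req, track) {
  if (req.merchantId) return `rl:${track}:merchant:${req.merchantId}`;
  if (req.apiKey) {
    // Hash the API key so the raw secret never appears in the key store.
    const hash = crypto.createHash('sha256').update(req.apiKey).digest('hex');
    return `rl:${track}:key:${hash}`;
  }
  return `rl:${track}:ip:${req.ip}`;
}

// In-memory fixed-window counter standing in for the Redis-backed store.
class FixedWindowLimiter {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.windows = new Map(); // key -> { count, resetAt }
  }

  check(key, { max, windowMs }) {
    const t = this.now();
    let w = this.windows.get(key);
    if (!w || w.resetAt <= t) {
      w = { count: 0, resetAt: t + windowMs }; // start a fresh window
      this.windows.set(key, w);
    }
    w.count += 1;
    return {
      allowed: w.count <= max,
      headers: {
        'X-RateLimit-Limit': String(max),
        'X-RateLimit-Remaining': String(Math.max(0, max - w.count)),
        'X-RateLimit-Reset': String(Math.ceil(w.resetAt / 1000)), // epoch seconds
      },
    };
  }
}
```

Including the track name in the key keeps operation and verification counters separate, so exhausting one budget does not consume the other.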
| 49 | +--- |
| 50 | + |
| 51 | +### Issue #741: Error Recovery ✓ |
| 52 | +**Status**: Complete |
| 53 | + |
| 54 | +**Implementation**: `TrustlineErrorRecovery` class provides: |
| 55 | +- Per-context circuit breaker pattern with half-open probes |
| 56 | +- Exponential backoff with 25% jitter (max 30 sec) |
| 57 | +- Timeout wrapper: 15-second operation timeout per attempt |
| 58 | +- Dead-letter queue (100 items max) for unrecoverable failures |
| 59 | +- Retry logic: 3 attempts by default, 1 for circuit breaker probes |
| 60 | +- Error classification: retryable (network, timeout, rate-limit) vs. terminal (auth, schema, 404) |
| 61 | + |
| 62 | +**Controls**: |
| 63 | +- Circuit breaker thresholds: 5 consecutive failures trigger open state |
| 64 | +- Half-open: one probe attempt allowed after 30-second cool-off |
| 65 | +- Timeout per operation: 15 seconds hard cap |
| 66 | +- Dead-letter queue: FIFO eviction when full (100 items) |
| 67 | +- Metrics tracking: total failures, recoveries, last error timestamp |
| 68 | + |
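The circuit breaker controls listed above (open after 5 consecutive failures, a single probe after a 30-second cool-off) correspond to a small state machine. The sketch below is a simplified illustration, not the `TrustlineErrorRecovery` internals; the `CircuitBreaker` class and its method names are hypothetical.

```javascript
// Hypothetical sketch of a closed / open / half-open circuit breaker.
class CircuitBreaker {
  constructor({ failureThreshold = 5, coolOffMs = 30_000, now = () => Date.now() } = {}) {
    this.failureThreshold = failureThreshold;
    this.coolOffMs = coolOffMs;
    this.now = now; // injectable clock, useful in tests
    this.state = 'closed';
    this.failures = 0; // consecutive failures since the last success
    this.openedAt = 0;
    this.probeInFlight = false;
  }

  // Returns true if the caller may attempt the operation.
  allowRequest() {
    if (this.state === 'closed') return true;
    if (this.state === 'open' && this.now() - this.openedAt >= this.coolOffMs) {
      this.state = 'half-open'; // cool-off elapsed: permit one probe
      this.probeInFlight = false;
    }
    if (this.state === 'half-open' && !this.probeInFlight) {
      this.probeInFlight = true;
      return true;
    }
    return false;
  }

  recordSuccess() {
    this.state = 'closed';
    this.failures = 0;
    this.probeInFlight = false;
  }

  recordFailure() {
    this.failures += 1;
    // A failed probe reopens immediately; otherwise open at the threshold.
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = this.now();
      this.probeInFlight = false;
    }
  }
}
```

A per-context setup would keep one such instance per context key (for example, one per upstream dependency), so a failing dependency does not block calls to a healthy one.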
| 69 | +**Error Classification**: |
| 70 | +- **Retryable**: network, timeout, rate-limit (429), server errors (5xx) |
| 71 | +- **Terminal**: auth errors (401/403), schema conflicts, 404 not found, other client errors (4xx except 429) |
| 72 | + |
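The classification rules above can be expressed as a single function. This is a sketch under assumptions: the error shape (`err.status`, `err.response.status`, `err.code`) and the choice to treat unrecognized errors as terminal are illustrative and may differ from what `TrustlineErrorRecovery` actually does.

```javascript
// Node.js system error codes that usually indicate a transient network problem.
const NETWORK_CODES = new Set(['ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT', 'EAI_AGAIN']);

// Returns 'retryable' or 'terminal' for a failed operation.
function classifyError(err) {
  // Assumed shapes: a status on the error itself or on an attached HTTP response.
  const status = err.status ?? err.response?.status;

  // 429 is a 4xx code, but it signals back-pressure, so it is checked first.
  if (status === 429) return 'retryable';
  if (status >= 500 && status <= 599) return 'retryable';
  if (status >= 400 && status <= 499) return 'terminal';

  if (NETWORK_CODES.has(err.code)) return 'retryable';

  // Assumption: anything unrecognized (e.g. a schema conflict) is not retried.
  return 'terminal';
}
```

Checking 429 before the general 4xx range matters: with the order reversed, rate-limited calls would be sent to the dead-letter queue instead of being retried.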
| 73 | +**Monitoring**: Access circuit breaker metrics via `TrustlineErrorRecovery.getCircuitBreakerMetrics()` |
| 74 | + |
| 75 | +--- |
| 76 | + |
| 77 | +### Issue #742: Security Audit (SEP-12 KYC) ✓ |
| 78 | +**Status**: Complete |
| 79 | + |
| 80 | +**See**: [SEP12_KYC_SECURITY_AUDIT.md](./SEP12_KYC_SECURITY_AUDIT.md) for detailed threat model. |
| 81 | + |
| 82 | +**Enhancement**: Rate limiting now applied to all SEP-12 routes (addresses audit recommendation): |
| 83 | +- Rate limit: 50 requests/15min per account+IP combination |
| 84 | +- Protects against brute-force enumeration of KYC status |
| 85 | +- Standard HTTP headers included in responses |
| 86 | + |
| 87 | +**Database Access**: |
| 88 | +- `sep12_kyc_customers` table with unique composite index `(stellar_account, memo)` |
| 89 | +- `withRecovery` wrapper ensures all DB operations have retry logic |
| 90 | +- Parameterized queries prevent SQL injection |
| 91 | + |
| 92 | +--- |
| 93 | + |
| 94 | +## Threat Model & Residual Risks |
| 95 | + |
| 96 | +| Threat | Control | Effect | |
| 97 | +| --- | --- | --- | |
| 98 | +| **Brute-force signature guessing** | Rate limiting (50 verifications/5min) + circuit breaker | Attackers rate-limited; failed probes trigger circuit breaker | |
| 99 | +| **Denial of service (Horizon API)** | Rate limit + circuit breaker + timeout | Limits per-operation load; circuit breaker stops cascading failures | |
| 100 | +| **Invalid/malicious signatures** | Ed25519 verification + operation type validation | Only signatures from the account's authorized signers pass verification; non-trustline operations are rejected | |
| 101 | +| **Network timeouts/flaky APIs** | Exponential backoff + circuit breaker + dead-letter queue | Automatic retry on transient failures; circuit breaker prevents hammering | |
| 102 | +| **Asset issuer spoofing** | Asset code/issuer validation + Stellar SDK validation | Validated against Stellar standards; SDK prevents invalid keys | |
| 103 | +| **Cache poisoning** | Cache only after successful verification | Verification rerun on cache misses; no false positives | |
| 104 | +| **PII in logs** | Error messages sanitized (no field values logged) | Error codes only; internal details never exposed | |
| 105 | + |
| 106 | +**Residual Risks**: |
| 107 | +- **Horizon API availability**: Relies on external Stellar API; circuit breaker mitigates cascading failures |
| 108 | +- **Redis availability**: The rate limiter fails open, so rate limits are not enforced while Redis is unavailable |
| 109 | +- **Database scaling**: Monitor connection pool; use `queryWithRetry` for automatic retry |
| 110 | + |
| 111 | +--- |
| 112 | + |
| 113 | +## Test Coverage |
| 114 | + |
| 115 | +### Trustline Manager Tests (`src/lib/trustline-manager.test.js`) |
| 116 | +- **Signature verification**: valid/invalid signatures, operation type validation, asset validation, caching |
| 117 | +- **Rate limiting**: merchant/API-key/IP key generation, premium tier bypass |
| 118 | +- **Error recovery**: retry logic, error classification, circuit breaker state transitions, timeout handling, dead-letter queue |
| 119 | +- **Query optimization**: index creation, health metrics, payment statistics by asset |
| 120 | + |
| 121 | +### SEP-12 KYC Tests (`src/lib/sep12-kyc.test.js`) |
| 122 | +- **Signature verification**: valid/forged/stale/wrong-key signatures |
| 123 | +- **Field validation**: schema strictness, unknown field rejection |
| 124 | +- **Error recovery**: retryable 503 vs. terminal 500 errors |
| 125 | +- **Database operations**: parameterized queries, upsert behavior, get/delete hit/miss |
| 126 | + |
| 127 | +--- |
| 128 | + |
| 129 | +## Deployment Checklist |
| 130 | + |
| 131 | +- [ ] Redis configured and available for rate limiting |
| 132 | +- [ ] Database migrations applied (`20260527*` series) |
| 133 | +- [ ] Rate limit keys properly configured in routes |
| 134 | +- [ ] Circuit breaker metrics exposed to monitoring |
| 135 | +- [ ] Dead-letter queue monitored for failures |
| 136 | +- [ ] Stellar Horizon API URL configured |
| 137 | +- [ ] Error logs do not leak PII (audit field values) |
| 138 | +- [ ] Timeout values tuned for production latency SLAs |
| 139 | + |
| 140 | +--- |
| 141 | + |
| 142 | +## Performance Metrics |
| 143 | + |
| 144 | +- **Signature verification**: ~200ms on a cache miss (one Horizon API call); results are cached for 5 minutes |
| 145 | +- **Rate limit check**: ~5ms (Redis lookup) |
| 146 | +- **Error recovery**: exponential backoff with base delays of 1s → 2s → 4s before jitter, capped at 30s |
| 147 | +- **Circuit breaker**: opens after 5 consecutive failures, 30-second cool-off, 1 probe allowed in half-open state |
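The backoff schedule listed under error recovery can be reproduced with a short calculation. The sketch assumes that "25% jitter" means a symmetric spread of plus or minus 25% around the exponential delay; the function name `backoffDelayMs` and its options are hypothetical.

```javascript
// Delay before retry number `attempt` (1-based), in milliseconds.
function backoffDelayMs(
  attempt,
  { baseMs = 1000, maxMs = 30_000, jitter = 0.25, random = Math.random } = {}
) {
  // Exponential growth: 1s, 2s, 4s, ... capped at maxMs.
  const exp = Math.min(maxMs, baseMs * 2 ** (attempt - 1));
  // Assumed jitter model: uniform in [-25%, +25%] of the exponential delay.
  const spread = exp * jitter * (2 * random() - 1);
  // Re-apply the cap so jitter never pushes the delay past maxMs.
  return Math.min(maxMs, Math.round(exp + spread));
}
```

With jitter disabled (`jitter: 0`), attempts 1 through 3 yield 1000, 2000, and 4000 ms, matching the schedule above; the randomized spread keeps many clients from retrying in lockstep after a shared outage.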