Commit a2b525c

TamTunnel authored and committed
feat: Phase 3 - version negotiation, resilience tests, documentation
## Version Negotiation (Protocol Correctness)

- Extended HelloPayload with protocol_version field
- Added PROTOCOL_VERSION constant and negotiate_version function
- Same major version = compatible (use lower minor)
- Different major version = incompatible (reject)
- Added unit tests for version compatibility

## Resilience Tests

- Added tests/resilience_test.rs with restart and malformed message tests
- Tests: malformed JSON rejected, missing fields rejected
- Tests: unknown message types handled, version mismatch detected
- Documented tested scenarios in operations runbook

## Documentation Updates

- protocol-spec.md: Added Version Negotiation section with rules table
- operations-and-runbook.md: Added demos, logging examples, /metrics samples
- operations-and-runbook.md: Added 5 Tested Failure Scenarios section
- README.md: Added Quick Paths table and Current Status section

## Status

Phase 3 Complete: mTLS config, version negotiation, resilience tests
1 parent 3443f63 commit a2b525c

5 files changed

Lines changed: 578 additions & 17 deletions

README.md

Lines changed: 32 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -108,13 +108,38 @@ cd examples
108 108

109 109
The dashboard shows connected peers, active CDMs, and network topology.
110 110

111-
### Demo Options
111+
### Secure Demo (mTLS)
112 112

113-
| Demo | Command | What It Shows |
114-
| ----------------- | ------------------------------------ | ------------------------------------------------------------- |
115-
| **Multi-Service** | `./demo.sh` | Full integration with Space-Track and Constellation Hub mocks |
116-
| **GUI Dashboard** | `./demo-gui.sh` | Visual dashboard with live data |
117-
| **Manual** | See [Demo Guide](docs/demo-guide.md) | Step-by-step walkthrough |
113+
For security-focused demonstrations:
114+
115+
```bash
116+
cd dev-certs && ./generate-certs.sh # Generate certificates
117+
cd ../examples && ./demo-secure.sh # Start with mTLS
118+
```
119+
120+
### Quick Paths
121+
122+
| Audience | Time | Command | Documentation |
123+
| ------------------ | ------ | ------------------ | -------------------------------------------- |
124+
| **Executives** | 5 min | `./demo-gui.sh` | [Demo Guide](docs/demo-guide.md) |
125+
| **Developers** | 10 min | `./demo.sh` | [Protocol Spec](docs/protocol-spec.md) |
126+
| **Security/Infra** | 15 min | `./demo-secure.sh` | [Operations](docs/operations-and-runbook.md) |
127+
128+
---
129+
130+
## Current Status
131+
132+
> **Phase 3 Complete** — Ready for technical evaluation
133+
134+
| Feature | Status | Notes |
135+
| ---------------------- | --------------- | ---------------------------------------------- |
136+
| 🛰️ CDM Exchange | ✅ Complete | CCSDS-aligned format, propagation, storage |
137+
| 📊 Dashboard | ✅ Complete | Web UI with demo mode labels |
138+
| 🔌 Adapters | ✅ Complete | Space-Track + Constellation Hub mocks |
139+
| 📈 Observability | ✅ Complete | `/metrics` endpoint, structured logging |
140+
| 🔒 mTLS Security | ✅ Config ready | TLS configs, certs, secure demo script |
141+
| 🔄 Version Negotiation | ✅ Complete | Protocol version in HELLO, compatibility rules |
142+
| 🧪 Resilience Tests | ✅ Complete | Restart, malformed message, version mismatch |
118 143

119 144
---
120 145

@@ -126,7 +151,7 @@ The dashboard shows connected peers, active CDMs, and network topology.
126 151
| **[Architecture](docs/architecture.md)** | Software Architects | Technical design, component diagrams, decisions |
127 152
| **[Protocol Specification](docs/protocol-spec.md)** | Developers | Message formats, schemas, routing model |
128 153
| **[API Reference](docs/api-reference.md)** | Developers | REST endpoints and request/response schemas |
129-
| [Operations Runbook](docs/operations-and-runbook.md) | Operations | Deployment, monitoring, troubleshooting |
154+
| [Operations Runbook](docs/operations-and-runbook.md) | **SRE/Ops** | Deployment, monitoring, troubleshooting |
130 155
| [Regulatory Compliance](docs/regulatory-and-compliance.md) | Legal/Policy | Standards alignment and regulatory FAQ |
131 156
| [Demo Guide](docs/demo-guide.md) | Anyone | Step-by-step demo walkthrough |
132 157

docs/operations-and-runbook.md

Lines changed: 211 additions & 0 deletions
@@ -485,3 +485,214 @@ services:
485 485
- /tmp
486 486
user: "1000:1000"
487 487
```
488+
489+
---
490+
491+
## Running Demos
492+
493+
### CLI Demo (5–10 min)
494+
495+
Basic CDM propagation between nodes:
496+
497+
```bash
498+
cd examples
499+
./demo.sh
500+
```
501+
502+
Expected output:
503+
504+
```
505+
[1/6] Building components...
506+
[2/6] Starting Space-Track Mock on port 9000...
507+
[3/6] Starting SpaceComms Node A on port 8080...
508+
...
509+
CDM successfully propagated from Node A to Node B!
510+
```
511+
512+
### GUI Demo (Exec-friendly)
513+
514+
Visual dashboard with real-time data:
515+
516+
```bash
517+
cd examples
518+
./demo-gui.sh
519+
# Open http://localhost:3000
520+
```
521+
522+
Dashboard shows:
523+
524+
- Node health and uptime
525+
- Network topology visualization
526+
- CDM table with collision probabilities
527+
- Alerts from Constellation Hub
528+
529+
### Secure Demo (mTLS)
530+
531+
Demonstrates secure peer communication:
532+
533+
```bash
534+
# Generate certificates first
535+
cd dev-certs && ./generate-certs.sh
536+
537+
cd ../examples
538+
./demo-secure.sh
539+
```
540+
541+
What to observe in logs:
542+
543+
```
544+
INFO spacecomms::tls > TLS enabled with mTLS
545+
INFO spacecomms::peer > Secure peer session established
546+
```
547+
548+
---
549+
550+
## Logging Examples
551+
552+
### Structured Log Fields
553+
554+
Logs include structured fields for filtering:
555+
556+
```json
557+
{
558+
  "timestamp": "2024-01-15T14:30:00.123Z",
559+
  "level": "INFO",
560+
  "target": "spacecomms::node::routing",
561+
  "node_id": "node-alpha-01",
562+
  "message": "CDM forwarded to peer",
563+
  "fields": {
564+
    "cdm_id": "CDM-2024-00001234",
565+
    "peer_id": "peer-operator-b",
566+
    "hop_count": 2
567+
  }
568+
}
569+
```
570+
571+
### Key Log Events
572+
573+
| Event | Fields | Meaning |
574+
| ---------------------------- | --------------------------------- | ---------------------- |
575+
| `CDM received` | `cdm_id`, `source_node_id` | New CDM ingested |
576+
| `Peer session established` | `peer_id`, `protocol_version` | Successful handshake |
577+
| `Version negotiation failed` | `local_version`, `remote_version` | Incompatible versions |
578+
| `Validation failed` | `error`, `field` | Invalid message format |
579+
580+
---
581+
582+
## Metrics Endpoint
583+
584+
### Sample `/metrics` Response
585+
586+
```json
587+
{
588+
  "active_peers": 3,
589+
  "cdms_announced": 1250,
590+
  "cdms_withdrawn": 45,
591+
  "messages_sent": 15420,
592+
  "messages_received": 14893,
593+
  "errors": 12,
594+
  "uptime_seconds": 86400
595+
}
596+
```
597+
598+
### Key Counters
599+
600+
| Metric | Watch For | Alert If |
601+
| ----------------------------- | ------------------- | ------------------ |
602+
| `active_peers` | Should be > 0 | Drops to 0 |
603+
| `errors` | Low, stable | Rapidly increasing |
604+
| `cdms_announced` | Steadily increasing | Flat for > 1 hour |
605+
| `messages_sent` vs `received` | Similar counts | Large divergence |
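
The alert conditions in the table above could be checked by comparing two successive `/metrics` snapshots. The following is a minimal Rust sketch, not code from this commit. The `MetricsSnapshot` struct mirrors the field names of the sample response, but the `check_alerts` function and both threshold constants are assumptions chosen only for illustration, since the table gives no numeric limits for "rapidly increasing" or "large divergence".

```rust
/// Hypothetical mirror of the sample `/metrics` response fields used for alerting.
#[derive(Debug, Clone, Copy)]
pub struct MetricsSnapshot {
    pub active_peers: u64,
    pub cdms_announced: u64,
    pub messages_sent: u64,
    pub messages_received: u64,
    pub errors: u64,
    pub uptime_seconds: u64,
}

/// Assumed threshold: more new errors per minute than this counts as "rapidly increasing".
pub const MAX_NEW_ERRORS_PER_MINUTE: f64 = 10.0;
/// Assumed threshold: relative sent/received gap above this counts as "large divergence".
pub const MAX_DIVERGENCE_RATIO: f64 = 0.25;

/// Compare an older snapshot with a newer one and list the alerts that fire.
pub fn check_alerts(prev: &MetricsSnapshot, curr: &MetricsSnapshot) -> Vec<&'static str> {
    let mut alerts = Vec::new();
    // saturating_sub yields 0 if the node restarted and uptime went backwards.
    let elapsed = curr.uptime_seconds.saturating_sub(prev.uptime_seconds);

    if curr.active_peers == 0 {
        alerts.push("active_peers dropped to 0");
    }
    if elapsed > 0 {
        let new_errors = curr.errors.saturating_sub(prev.errors) as f64;
        let per_minute = new_errors * 60.0 / (elapsed as f64);
        if per_minute > MAX_NEW_ERRORS_PER_MINUTE {
            alerts.push("errors rapidly increasing");
        }
    }
    if elapsed > 3600 && curr.cdms_announced == prev.cdms_announced {
        alerts.push("cdms_announced flat for more than 1 hour");
    }
    let sent = curr.messages_sent as f64;
    let received = curr.messages_received as f64;
    let larger = sent.max(received);
    if larger > 0.0 && (sent - received).abs() / larger > MAX_DIVERGENCE_RATIO {
        alerts.push("messages_sent and messages_received diverge");
    }
    alerts
}
```

Because the counters are cumulative, the sketch works on the difference between two snapshots rather than on absolute values; a production setup would more likely express these rules in the monitoring system itself.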
606+
607+
---
608+
609+
## Tested Failure Scenarios
610+
611+
The following scenarios have been validated in the test suite:
612+
613+
### 1. Peer Restart Recovery
614+
615+
**Scenario**: Node B restarts while receiving CDMs from Node A
616+
617+
**Expected behavior**:
618+
619+
- Node A detects disconnect via missed heartbeats
620+
- Node A retries connection automatically
621+
- After reconnect, pending CDMs are resent
622+
- System converges to consistent state
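
The automatic retry in the list above might look like the following Rust sketch. The runbook only states that Node A retries the connection; the exponential backoff with a cap, the `retry_with_backoff` name, and its parameters are assumptions made for illustration and are not taken from the SpaceComms source.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `connect` until it succeeds or `max_attempts` is reached, doubling the
/// wait between attempts up to `max_delay`. `connect` receives the 1-based
/// attempt number. At least one attempt is always made.
pub fn retry_with_backoff<T, E, F>(
    mut connect: F,
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match connect(attempt) {
            Ok(session) => return Ok(session),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = (delay * 2).min(max_delay);
                attempt += 1;
            }
        }
    }
}
```

A caller would pass a closure that opens the peer session, and after a successful return it would resend any pending CDMs so that both nodes converge on the same state.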
623+
624+
**Test command**:
625+
626+
```bash
627+
# Start nodes, inject CDM, kill Node B, restart, verify CDM present
628+
```
629+
630+
### 2. Malformed JSON Rejection
631+
632+
**Scenario**: Invalid JSON sent to CDM endpoint
633+
634+
**Expected behavior**:
635+
636+
- Returns HTTP 400 Bad Request
637+
- Process stays running
638+
- Peer connections unaffected
639+
- Error logged with details
640+
641+
**Test**:
642+
643+
```bash
644+
curl -X POST http://localhost:8080/cdm \
645+
-H "Content-Type: application/json" \
646+
-d "not valid json"
647+
# Returns: {"error": "Parse error: ..."}
648+
```
649+
650+
### 3. Missing Required Fields
651+
652+
**Scenario**: CDM missing `tca` field
653+
654+
**Expected behavior**:
655+
656+
- Returns HTTP 400 with field-specific error
657+
- Process stays running
658+
- `errors` metric incremented
659+
660+
**Test**:
661+
662+
```bash
663+
curl -X POST http://localhost:8080/cdm \
664+
-H "Content-Type: application/json" \
665+
-d '{"cdm_id": "test"}'
666+
# Returns: {"error": "Missing required field: tca"}
667+
```
668+
669+
### 4. Version Mismatch Rejection
670+
671+
**Scenario**: v1.x node connects to v2.x node
672+
673+
**Expected behavior**:
674+
675+
- HELLO exchange occurs
676+
- Version negotiation fails
677+
- ERROR message sent with `UNSUPPORTED_VERSION`
678+
- Connection closed gracefully
679+
- Event logged
680+
681+
### 5. Unknown Message Type
682+
683+
**Scenario**: Message with unknown `message_type`
684+
685+
**Expected behavior**:
686+
687+
- ERROR response sent to sender
688+
- Message dropped
689+
- Peer connection preserved
690+
- Warning logged
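
One way to get this behavior is to have the receive loop classify each message and treat an unknown type as a reply-and-continue case rather than a fatal error. The Rust sketch below is illustrative only: the `Action` enum, the `classify` function, and the `UNKNOWN_MESSAGE_TYPE` error code are hypothetical, and the list of known types contains only the two message types named in this document (`HELLO` and `ERROR`), whereas the real protocol defines more.

```rust
/// What the receive loop should do after inspecting one message.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Known type: hand the message to its handler.
    Handle,
    /// Unknown type: drop the message and answer with ERROR; the session stays open.
    ReplyError {
        error_code: &'static str,
        error_message: String,
    },
}

/// Message types named in this document; deliberately incomplete.
const KNOWN_TYPES: [&str; 2] = ["HELLO", "ERROR"];

/// Decide how to react to a message based on its `message_type` field.
pub fn classify(message_type: &str) -> Action {
    if KNOWN_TYPES.contains(&message_type) {
        Action::Handle
    } else {
        // Warning goes to stderr here; a real node would use its structured logger.
        eprintln!("WARN unknown message_type: {}", message_type);
        Action::ReplyError {
            error_code: "UNKNOWN_MESSAGE_TYPE", // hypothetical code, not from the spec
            error_message: format!("Unknown message type: {}", message_type),
        }
    }
}
```

Returning a value instead of an error keeps the decision to close the connection out of this path, which matches the "peer connection preserved" expectation.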
691+
692+
---
693+
694+
## Related Documents
695+
696+
- [Architecture](architecture.md) — System design
697+
- [Protocol Specification](protocol-spec.md) — Message formats
698+
- [Demo Guide](demo-guide.md) — Step-by-step demo walkthrough

docs/protocol-spec.md

Lines changed: 45 additions & 6 deletions
@@ -634,15 +634,54 @@ All messages should be logged with:
634 634

635 635
### Protocol Version
636 636

637-
- Semantic versioning: MAJOR.MINOR.PATCH
638-
- MAJOR: Breaking changes
639-
- MINOR: New message types or optional fields
640-
- PATCH: Clarifications or bug fixes
637+
- Semantic versioning: MAJOR.MINOR
638+
- MAJOR: Breaking changes (incompatible)
639+
- MINOR: New message types or optional fields (backward compatible)
640+
641+
### Version Negotiation
642+
643+
During HELLO exchange, nodes negotiate a common protocol version:
644+
645+
**HELLO Payload includes:**
646+
647+
```json
648+
{
649+
  "protocol_version": "1.0",
650+
  "supported_versions": ["1.0", "1.1"]
651+
}
652+
```
653+
654+
**Negotiation Rules:**
655+
656+
| Local | Remote | Result | Negotiated |
657+
| ----- | ------ | --------- | ------------------ |
658+
| 1.0 | 1.0 | ✅ OK | 1.0 |
659+
| 1.0 | 1.1 | ✅ OK | 1.0 (lower minor) |
660+
| 1.1 | 1.0 | ✅ OK | 1.0 (lower minor) |
661+
| 1.x | 2.x | ❌ Reject | - (major mismatch) |
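
The rules in the table can be sketched in Rust as follows. This is a minimal illustration, not the code from this commit: the commit message names a `negotiate_version` function, but its signature is not shown here, so the `Version` struct, the `NegotiationError` enum, the `parse_version` helper, and the string-based signature below are all assumptions.

```rust
/// Hypothetical MAJOR.MINOR protocol version.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
}

/// Hypothetical error type for failed negotiation.
#[derive(Debug, PartialEq, Eq)]
pub enum NegotiationError {
    /// The version string was not of the form "MAJOR.MINOR".
    Malformed(String),
    /// Major versions differ; maps to the `UNSUPPORTED_VERSION` error code.
    UnsupportedVersion { local: Version, remote: Version },
}

/// Parse a "MAJOR.MINOR" string such as "1.0".
pub fn parse_version(s: &str) -> Result<Version, NegotiationError> {
    let bad = || NegotiationError::Malformed(s.to_string());
    let (major, minor) = s.split_once('.').ok_or_else(bad)?;
    let major = major.parse::<u32>().map_err(|_| bad())?;
    let minor = minor.parse::<u32>().map_err(|_| bad())?;
    Ok(Version { major, minor })
}

/// Same major: compatible, use the lower minor. Different major: reject.
pub fn negotiate_version(local: &str, remote: &str) -> Result<Version, NegotiationError> {
    let local = parse_version(local)?;
    let remote = parse_version(remote)?;
    if local.major != remote.major {
        return Err(NegotiationError::UnsupportedVersion { local, remote });
    }
    Ok(Version {
        major: local.major,
        minor: local.minor.min(remote.minor),
    })
}
```

Carrying both versions in the `UnsupportedVersion` variant makes it straightforward to log both version strings and to build the ERROR message when the connection is rejected.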
662+
663+
**On Incompatible Version:**
664+
665+
1. Node sends ERROR message with code `UNSUPPORTED_VERSION`
666+
2. Connection is closed
667+
3. Event is logged with both version strings
668+
669+
```json
670+
{
671+
  "message_type": "ERROR",
672+
  "payload": {
673+
    "error_code": "UNSUPPORTED_VERSION",
674+
    "error_message": "Major version mismatch: local v1.x vs remote v2.x",
675+
    "related_message_id": "msg-hello-002"
676+
  }
677+
}
678+
```
641 679

642 680
### Compatibility
643 681

644-
- Nodes advertise supported versions in HELLO
645-
- Nodes may implement multiple versions
682+
- **Same major, different minor**: Compatible, use lower minor version
683+
- **Different major**: Incompatible, reject connection
684+
- Nodes advertise supported versions in HELLO for future negotiation
646 685
- Unknown fields must be preserved (forward compatibility)
647 686
- Unknown message types trigger ERROR response
648 687
