This repo uses a layered architecture:
- `routingd` (Rust daemon) is the protocol and route-computation core.
- `node_supervisor` (Rust) is the node-local process manager for `routingd` and optional apps.
- containerlab only manages topology/container lifecycle.
- Go traffic apps are experiment tools and are configured/injected per node.
- Docs index: `docs/README.md`
- Quickstart: `docs/quickstart.md`
- Unified experiments: `docs/unified-experiments.md`
- Results and validation: `docs/results-and-validation.md`
```
make install
make build-routerd-rs
make build-traffic-app-go
```
```
PYTHONPATH=src python3 tools/run_unified_experiment.py \
  --config experiments/routerd_examples/unified_experiments/line3_ospf_multi_apps.yaml \
  --poll-interval-s 1 \
  --sudo
```

Validate outputs:

```
python3 tools/validate_unified_metrics.py \
  --input results/runs/unified_experiments \
  --recursive
```

Control/observability paths:
- route computation + RIB/FIB snapshot: `src/irp/src/runtime/daemon.rs`
- kernel route install/readback (netlink via `ip route`): `src/irp/src/runtime/forwarding.rs`
- HTTP management API (`/v1/status`, `/v1/routes`, `/v1/fib`, `/v1/kernel-routes`): `src/irp/src/runtime/mgmt.rs`
- node-level process state (`/tmp/node_supervisor_state.json`): `src/irp/src/bin/node_supervisor.rs`
- `/v1/metrics` now includes layered IRP design tags in `protocol_metrics`: `design_profile`, `slow_state_scope`, `fast_state_scope`, `decision_layers`
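For scripting against the management HTTP API, a minimal poll of `/v1/status` might look like the sketch below. Only the endpoint path comes from the list above; the base URL and port are placeholders, and nothing is assumed about the response beyond it being JSON.

```python
import json
import urllib.request


def fetch_status(base_url: str, timeout_s: float = 2.0) -> dict:
    """GET <base_url>/v1/status from a routingd management endpoint and parse the JSON body."""
    with urllib.request.urlopen(f"{base_url}/v1/status", timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (assumes the mgmt HTTP port from a generated config, e.g. 18001):
# status = fetch_status("http://10.0.12.1:18001")
```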
- Router daemon runtime (extensible core): `src/irp/src/runtime/daemon.rs`
- Protocol engines:
  - OSPF-like link-state: `src/irp/src/protocols/ospf.rs`
  - RIP distance-vector: `src/irp/src/protocols/rip.rs`
  - ECMP equal-cost multipath baseline: `src/irp/src/protocols/ecmp.rs`
  - Top-K random multipath baseline: `src/irp/src/protocols/topk.rs`
  - DDR/DGR delay-aware routing core: `src/irp/src/protocols/ddr.rs`
    - uses `tc -s qdisc` backlog when a neighbor `iface` is present in config; converts queue bytes to delay via `link_bandwidth_bps` (falls back to a local estimator otherwise)
  - Octopus (NSDI2026 profile) entry: `protocol: octopus` (queue-aware stochastic multipath on the DDR/DGR core)
  - IRP is the architecture/framework abstraction (not a runnable protocol instance)
- Decision/policy hook: `src/irp/src/algo/mod.rs`
- Management API: `src/irp/src/runtime/mgmt.rs` (HTTP + gRPC placeholder)
- Runtime entrypoint: `src/irp/src/main.rs` (`routingd` binary)
- Node process supervisor: `src/irp/src/bin/node_supervisor.rs`
- Topology file loader + lab tools: `src/clab/clab_loader.py`, `src/clab/labgen.py`
- Example daemon configs: `experiments/routerd_examples/`
- Experiment utilities index: `experiments/README.md`
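The DDR/DGR queue-to-delay conversion mentioned above (qdisc backlog bytes drained at `link_bandwidth_bps`) is implemented in Rust in `src/irp/src/protocols/ddr.rs`; this Python version is only a sketch of the arithmetic:

```python
def queue_delay_s(backlog_bytes: int, link_bandwidth_bps: float) -> float:
    """Estimate queueing delay: bytes waiting in the qdisc, drained at the link rate."""
    if link_bandwidth_bps <= 0:
        raise ValueError("link_bandwidth_bps must be positive")
    # bytes -> bits, divided by link rate in bits per second
    return (backlog_bytes * 8) / link_bandwidth_bps


# e.g. 125 000 bytes backlogged on a 10 Mbit/s link is 0.1 s of queueing delay
```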
- Canonical automation entrypoints: `tools/`
- Canonical experiment assets/configs: `experiments/`
- Install project dependencies: `make install`
- Ensure Docker is installed and running.
- Install the Rust musl target (required for router binaries running in the router container image): `rustup target add x86_64-unknown-linux-musl`
- Install `containerlab` locally (example):

```
mkdir -p "$HOME/.local/bin"
CLAB_VER=0.73.0
curl -fL -o /tmp/containerlab.tar.gz \
  "https://github.com/srl-labs/containerlab/releases/download/v${CLAB_VER}/containerlab_${CLAB_VER}_linux_amd64.tar.gz"
tar -xzf /tmp/containerlab.tar.gz -C /tmp
install -m 0755 /tmp/containerlab "$HOME/.local/bin/containerlab"
containerlab version
```

If needed:

```
export PATH="$HOME/.local/bin:$PATH"
```

If musl link tools are missing on Debian/Ubuntu:

```
sudo apt-get update && sudo apt-get install -y musl-tools
```

Per router container, run one daemon:
```
make build-routerd-rs
make run-routerd-rs ROUTERD_RS_CONFIG=/path/to/router.yaml ROUTERD_RS_LOG_LEVEL=INFO
```

Rust extensibility points:

- protocol abstraction: `src/irp/src/protocols/base.rs`
- decision/policy hook: `src/irp/src/algo/mod.rs`
- Rust core notes: `src/irp/README.md`
OSPF example config: `experiments/routerd_examples/ospf_router1.yaml`
RIP example config: `experiments/routerd_examples/rip_router1.yaml`
```yaml
router_id: 1
protocol: ospf  # or rip/ecmp/topk/ddr/dgr/octopus
bind:
  address: 0.0.0.0
  port: 5500
timers:
  tick_interval: 1.0
  dead_interval: 4.0
neighbors:
  - router_id: 2
    address: 10.0.12.2
    port: 5500
    cost: 1.0
protocol_params:
  ospf:
    hello_interval: 1.0
    lsa_interval: 3.0
forwarding:
  enabled: false
  dry_run: true
management:
  http:
    enabled: true
    bind: 0.0.0.0
    port: 18001
  grpc:
    enabled: true
    bind: 0.0.0.0
    port: 19001
```

`forwarding.enabled=true` enables Linux route programming (`ip route`) based on protocol output.
You need `destination_prefixes` and `next_hop_ips` mappings in the config to install concrete kernel routes.
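A forwarding section with those mappings might look like the sketch below. Only the key names `destination_prefixes` and `next_hop_ips` come from the note above; the nesting, the keying by router id, and all values are assumptions for illustration, not a verified schema.

```yaml
# Hypothetical sketch: exact structure and values are assumptions.
forwarding:
  enabled: true
  dry_run: false
  destination_prefixes:
    2: 10.0.2.0/24    # prefix reached via router_id 2 (illustrative)
  next_hop_ips:
    2: 10.0.12.2      # next-hop address toward router_id 2 (illustrative)
```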
Use the generator to create per-node routerd configs and a deploy env file.
Containerlab keeps using the original topology file directly:
```
make gen-routerd-lab LABGEN_PROFILE=ring6 LABGEN_PROTOCOL=ospf
```

Or run directly:

```
python3 tools/generate_routerd_lab.py --profile star6 --protocol rip
```

Built-in profiles (a few common topologies, one-arg selection):
`line3`, `line5`, `ring6`, `abilene`, `geant`, `uunet`, `cernet`, `star6`, `fullmesh4`, `spineleaf2x4`

For a custom topology file, use `--topology-file`:

```
python3 tools/generate_routerd_lab.py \
  --protocol rip \
  --topology-file src/clab/topologies/spineleaf2x4.clab.yaml
```

`--protocol` is independent of the topology file, so the same file can run ospf, rip, ecmp, topk, ddr, dgr, or octopus.
DDR validation config example:

```
PYTHONPATH=src python3 tools/run_unified_experiment.py \
  --config experiments/routerd_examples/unified_experiments/line3_ddr_validation.yaml \
  --poll-interval-s 1 \
  --sudo
```

ECMP validation config example:

```
PYTHONPATH=src python3 tools/run_unified_experiment.py \
  --config experiments/routerd_examples/unified_experiments/line3_ecmp_validation.yaml \
  --poll-interval-s 1 \
  --sudo
```

Top-K random routing validation config example:

```
PYTHONPATH=src python3 tools/run_unified_experiment.py \
  --config experiments/routerd_examples/unified_experiments/line3_topk_validation.yaml \
  --poll-interval-s 1 \
  --sudo
```

DGR validation config example:

```
PYTHONPATH=src python3 tools/run_unified_experiment.py \
  --config experiments/routerd_examples/unified_experiments/line3_dgr_validation.yaml \
  --poll-interval-s 1 \
  --sudo
```

Octopus validation config example:

```
PYTHONPATH=src python3 tools/run_unified_experiment.py \
  --config experiments/routerd_examples/unified_experiments/line3_octopus_validation.yaml \
  --poll-interval-s 1 \
  --sudo
```

dgr/octopus routing params support queue-level back-pressure controls:
- `queue_levels`
- `pressure_threshold`
- `queue_level_scale_ms`
- `neighbor_state_max_age_s` (optional; `<= 0` keeps the auto freshness window)
- `randomize_route_selection`
- `rng_seed`
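A `protocol_params` fragment using those keys might look like the sketch below. The key names are the documented ones; the nesting under `protocol_params.dgr` mirrors the `protocol_params.ospf` example earlier, and every value is a made-up illustration, not a recommended setting.

```yaml
# Illustrative values only; key names come from the list above.
protocol_params:
  dgr:
    queue_levels: 4
    pressure_threshold: 0.7
    queue_level_scale_ms: 5.0
    neighbor_state_max_age_s: 3.0   # <= 0 keeps the auto freshness window
    randomize_route_selection: true
    rng_seed: 42
```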
For a quick end-to-end run (generate + deploy + health-check + destroy):
```
make run-routerd-lab LABGEN_PROFILE=ring6 LABGEN_PROTOCOL=rip
```

Keep the lab after checks:

```
make run-routerd-lab LABGEN_PROFILE=ring6 LABGEN_PROTOCOL=rip RUNLAB_KEEP_LAB=1
```

Generated assets are written under:

- `results/runs/routerd_labs/<lab_name>/configs/*.yaml`
- `results/runs/routerd_labs/<lab_name>/deploy.env`

`deploy.env` carries containerlab variable overrides (name/mgmt/image) and is consumed by `run-routerd-lab`. Topology files in `src/clab/topologies/` are parameterized with environment variables, so the same source file can be reused across runs.
- `Permission denied` under `results/runs/routerd_labs/...`: usually caused by a previous `sudo make` creating root-owned files. Fix with: `sudo chown -R "$USER:$USER" results/runs/routerd_labs`
- `sudo: containerlab: command not found`: your `sudo` PATH may not include `~/.local/bin`. Use an absolute path: `sudo "$(which containerlab)" deploy -t <topology_file> --name <lab_name> --reconfigure`
- Deploying the wrong run: always use the `lab_name` and `deploy_env_file` printed by your latest `make gen-routerd-lab`.
- `nohup: can't execute '/irp/bin/node_supervisor': No such file or directory` inside a container: the host-built Rust binary is not compatible with the container libc. Rebuild with the musl target: `rustup target add x86_64-unknown-linux-musl && make build-routerd-rs`
After containerlab deploy, run:

```
make check-routerd-lab \
  CHECK_TOPOLOGY_FILE=src/clab/topologies/ring6.clab.yaml \
  CHECK_LAB_NAME=<lab_name> \
  CHECK_CONFIG_DIR=results/runs/routerd_labs/<lab_name>/configs \
  CHECK_USE_SUDO=1 \
  CHECK_EXPECT_PROTOCOL=ospf
```

What it checks per node:

- container is running,
- `routingd` process exists (`/irp/bin/routingd` or compatibility alias),
- `node_supervisor` process exists and its state file can be collected (`/tmp/node_supervisor_state.json`),
- management HTTP API (`/v1/status`) is reachable when configured,
- management kernel-route API (`/v1/kernel-routes`) can be queried,
- neighbor IP ping (from the generated config) succeeds,
- latest `RIB/FIB updated` route count is at least `n_nodes - 1` (or `CHECK_MIN_ROUTES`).

By default the checker waits up to 10 s for early convergence logs (`CHECK_MAX_WAIT_S`).
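The route-count criterion and wait budget can be expressed as a small polling loop. This is a sketch of the logic, not the checker's actual code; `get_route_count` stands in for whatever fetches the latest RIB/FIB route count:

```python
import time


def wait_for_convergence(get_route_count, n_nodes,
                         min_routes=None,
                         max_wait_s=10.0,
                         poll_interval_s=0.5):
    """Poll until the route count reaches the expected floor, or the wait budget runs out.

    The floor defaults to n_nodes - 1 (a route to every other node); an explicit
    min_routes mirrors the CHECK_MIN_ROUTES override.
    """
    target = min_routes if min_routes is not None else n_nodes - 1
    deadline = time.monotonic() + max_wait_s
    while True:
        if get_route_count() >= target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```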
Recommended (unified benchmark config):

```
make run-unified-experiment \
  UNIFIED_CONFIG_FILE=experiments/routerd_examples/unified_experiments/ring6_ospf_convergence_benchmark.yaml \
  UNIFIED_USE_SUDO=1
```

For the cernet convergence benchmark:

```
make run-unified-experiment \
  UNIFIED_CONFIG_FILE=experiments/routerd_examples/unified_experiments/cernet_ospf_convergence_benchmark.yaml \
  UNIFIED_USE_SUDO=1
```

Legacy-compatible wrapper (still supported):

```
make run-ospf-convergence-exp EXP_TOPOLOGY_FILE=src/clab/topologies/ring6.clab.yaml EXP_REPEATS=1
```

Direct script usage:

```
python3 tools/ospf_convergence_exp.py --topology-file src/clab/topologies/ring6.clab.yaml --repeats 1
```

If your environment requires privilege escalation for Docker/containerlab:

```
make run-ospf-convergence-exp EXP_USE_SUDO=1
```

Optional fixed management network/subnets:

```
make run-ospf-convergence-exp EXP_USE_SUDO=1 \
  EXP_MGMT_NETWORK_NAME=clab-mgmt-romam \
  EXP_MGMT_IPV4_SUBNET=10.250.10.0/24 \
  EXP_MGMT_IPV6_SUBNET=fd00:fa:10::/64
```

For spineleaf2x4 convergence tests:

```
make run-ospf-convergence-exp EXP_TOPOLOGY_FILE=src/clab/topologies/spineleaf2x4.clab.yaml
```

This repo includes a lightweight traffic app:
- Go binary: `/irp/bin/traffic_app`
- roles: `sink` and `send`
- protocols: `udp` and `tcp`
- patterns: `bulk` and `onoff` (for the sender)
Recommended workflow:

- build the router image once: `make build-routerd-node-image`
- run the lab with a generated topology (default image `romam/network-multitool-routerd:latest`)

`run-routerd-lab` now prefers binaries already baked into the image:

- `/irp/bin/routingd` (or compatibility alias `/irp/bin/irp_routerd_rs`)
- `/irp/bin/traffic_app`

If the image is missing binaries, the script falls back to host build + copy.
Install Go on the host, ensure `go` is in `PATH`, then build and install the binary:

```
make build-traffic-app-go
make install-traffic-app-bin \
  INSTALL_TRAFFIC_BIN_LAB_NAME=<lab_name>
```

Optional architecture overrides (if needed):

```
make build-traffic-app-go TRAFFIC_GOOS=linux TRAFFIC_GOARCH=amd64 TRAFFIC_CGO_ENABLED=0
```

If the lab is started via `run-routerd-lab`, it already attempts this build+copy step automatically (using the env overrides `ROMAM_TRAFFIC_GOOS`, `ROMAM_TRAFFIC_GOARCH`, `ROMAM_TRAFFIC_CGO_ENABLED`).

```
make run-traffic-plan TRAFFIC_PLAN_FILE=experiments/routerd_examples/traffic_plans/line5_udp.yaml
```

Plan runner:

- optional install step (`install_traffic_app_bin.py`)
- ordered per-node task launch (`run_traffic_app.py`)
- supports background sink and delayed sender start
Use one top-level YAML to run the full loop in either mode:
- `mode: scenario` (default): deploy topology + inject protocol config, write app specs into `node_supervisor`, launch apps, inject faults, and poll `/v1/routes` + `/v1/metrics`.
- `mode: convergence_benchmark`: run repeated deploy/precheck/ping-probe loops and export summary JSON/CSV.

Examples:

- `experiments/routerd_examples/unified_experiments/line3_ospf_multi_apps.yaml`
- `experiments/routerd_examples/unified_experiments/line3_ospf_onoff_fault.yaml`
- `experiments/routerd_examples/unified_experiments/line3_rip_validation.yaml`
- `experiments/routerd_examples/unified_experiments/line3_octopus_validation.yaml`
- `experiments/routerd_examples/unified_experiments/ring6_ospf_convergence_benchmark.yaml`
Recommended run flow (verified on this repo):
```
make build-routerd-rs
make run-unified-experiment \
  UNIFIED_CONFIG_FILE=experiments/routerd_examples/unified_experiments/line3_ospf_multi_apps.yaml \
  UNIFIED_USE_SUDO=1
```

Equivalent direct script run:

```
PYTHONPATH=src python3 tools/run_unified_experiment.py \
  --config experiments/routerd_examples/unified_experiments/line3_ospf_multi_apps.yaml \
  --poll-interval-s 1 \
  --sudo
```

RIP validation on a 3-node line topology:

```
make run-unified-experiment \
  UNIFIED_CONFIG_FILE=experiments/routerd_examples/unified_experiments/line3_rip_validation.yaml \
  UNIFIED_USE_SUDO=1
```

Default outputs:

- `mode: scenario`: `results/runs/unified_experiments/<lab_name>/report_<timestamp>.json`
- `mode: convergence_benchmark`: `results/tables/<protocol>_convergence_unified_<topology>.json` and `.csv`
- Standardized artifacts (both modes): `results/runs/.../<run_id>/config.yaml`, `topology.yaml`, `traffic.yaml`, `logs/`, `metrics.json`, `summary.md`

Validate unified JSON outputs (lightweight schema checks):

```
python3 tools/validate_unified_metrics.py \
  --input results/runs/unified_experiments \
  --recursive
```

The unified config supports fault injection, e.g.:
- `link_down` with `faults[].link: [r2, r3]`
- `app_stop`/`app_start` with `faults[].node` + `faults[].app` (applied via `node_supervisor`)
Optional queue-discipline runtime config:
- `qdisc.enabled`: enable the runtime qdisc controller in `routingd`
- `qdisc.dry_run`: log `tc` operations without applying them
- `qdisc.default.kind`: root qdisc kind (`fifo`, `pfifo_fast`, `ecn`, `red`, `fq_codel`, `prio`, `drr`, `netem`, `tbf`)
- `qdisc.default.params`: key/value params passed to `tc qdisc replace`
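Put together, a qdisc section might look like the sketch below. The keys are the documented ones; the chosen kind and the `limit` parameter value are illustrative assumptions, not recommended settings.

```yaml
# Illustrative values only; keys come from the list above.
qdisc:
  enabled: true
  dry_run: true        # log the tc commands instead of applying them
  default:
    kind: fq_codel
    params:
      limit: "1024"    # passed through to `tc qdisc replace`
```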
Cleanup command (when you keep the lab running or need manual cleanup):

```
sudo containerlab destroy -t src/clab/topologies/line3.clab.yaml --name <lab_name> --cleanup
```

`run_unified_experiment.py` supports:

- `mode`: `scenario` (default) or `convergence_benchmark`
- `node_apps`: list of `{node_id, apps[]}`
- `apps`: flat list where each app includes `node` or `node_id`
Each app entry supports:
- `name`, `kind` (sender/sink/custom)
- `bin`, `args`, `env`
- `restart` (never/on-failure/always) and `max_restarts`
- `delay_s`, `log_file`, optional `cpu_affinity`
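A `node_apps` entry using those fields might look like the sketch below. The field names are the documented ones; the target IP, ports, and argument values are illustrative assumptions.

```yaml
# Hypothetical app specs built only from the fields documented above.
node_apps:
  - node_id: r6
    apps:
      - name: udp_sink
        kind: sink
        bin: /irp/bin/traffic_app
        args: ["sink", "--proto", "udp", "--bind", "0.0.0.0", "--port", "9000"]
        env:
          TRAFFIC_LOG_JSON: "1"
        restart: on-failure
        max_restarts: 3
        log_file: /tmp/udp_sink.log
  - node_id: r1
    apps:
      - name: udp_sender
        kind: sender
        bin: /irp/bin/traffic_app
        args: ["send", "--proto", "udp", "--target", "10.0.0.6", "--port", "9000", "--count", "1000"]
        delay_s: 2.0   # let the sink come up first
```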
Built-in guardrails:
- sink port conflict check on the same node for `traffic_app` sink endpoints
- sender target guard (`localhost`/`127.0.0.1`/`::1` rejected)
- app lifecycle supervision + restart events included in the report
- app env defaults: `APP_ID`, `NODE_ID`, `APP_ROLE`
- traffic app periodic stats can emit JSON when `TRAFFIC_LOG_JSON=1`
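With `TRAFFIC_LOG_JSON=1` set, periodic stats records can be pulled out of a sink or sender log with a sketch like this. Nothing is assumed about the fields inside each record; the code only separates JSON-object lines from plain-text log lines:

```python
import json


def parse_json_stats(log_text: str) -> list:
    """Return every log line that parses as a JSON object, in order of appearance."""
    records = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # plain-text log line, not a JSON stats record
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or interleaved line
        if isinstance(obj, dict):
            records.append(obj)
    return records
```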
Start a UDP sink:

```
make run-traffic-app \
  TRAFFIC_LAB_NAME=<lab_name> \
  TRAFFIC_NODE=r6 \
  TRAFFIC_BACKGROUND=1 \
  TRAFFIC_LOG_FILE=/tmp/udp_sink.log \
  TRAFFIC_ARGS="sink --proto udp --bind 0.0.0.0 --port 9000 --report-interval-s 1"
```

Run the UDP sender:

```
make run-traffic-app \
  TRAFFIC_LAB_NAME=<lab_name> \
  TRAFFIC_NODE=r1 \
  TRAFFIC_ARGS="send --proto udp --target <r6_reachable_ip> --port 9000 --packet-size 256 --count 1000 --pps 200"
```

Start TCP sink:
```
make run-traffic-app \
  TRAFFIC_LAB_NAME=<lab_name> \
  TRAFFIC_NODE=r6 \
  TRAFFIC_BACKGROUND=1 \
  TRAFFIC_LOG_FILE=/tmp/tcp_sink.log \
  TRAFFIC_ARGS="sink --proto tcp --bind 0.0.0.0 --port 9001"
```

Run TCP sender:

```
make run-traffic-app \
  TRAFFIC_LAB_NAME=<lab_name> \
  TRAFFIC_NODE=r1 \
  TRAFFIC_ARGS="send --proto tcp --target <r6_reachable_ip> --port 9001 --packet-size 1024 --duration-s 10 --pps 500 --tcp-nodelay"
```

Run a UDP sender with the on/off pattern:

```
make run-traffic-app \
  TRAFFIC_LAB_NAME=<lab_name> \
  TRAFFIC_NODE=r1 \
  TRAFFIC_ARGS="send --proto udp --target <r6_reachable_ip> --port 9000 --pattern onoff --on-ms 2000 --off-ms 1000 --packet-size 1200 --duration-s 30 --pps 1000"
```

Direct script usage is also supported:
```
python3 tools/run_traffic_app.py --lab-name <lab_name> --node r1 -- \
  send --proto udp --target <ip> --port 9000 --count 100
```

For a combined sender/sink probe run:

```
make run-traffic-probe \
  PROBE_LAB_NAME=<lab_name> \
  PROBE_SRC_NODE=r1 \
  PROBE_DST_NODE=r6 \
  PROBE_DST_IP=<r6_reachable_ip> \
  PROBE_PROTO=udp \
  PROBE_PACKET_SIZE=512 \
  PROBE_COUNT=5000 \
  PROBE_PPS=1000 \
  PROBE_OUTPUT_JSON=results/runs/traffic_probe_r1_r6.json
```

It prints a JSON report with sender throughput and the sink log tail.
- Per-run artifacts: `results/runs/ospf_convergence_containerlab/`
- Aggregated results: `results/tables/ospf_convergence_containerlab_<topology>.json` and `.csv`