feat: add --subnets flag to deploy multiple nodes per client (#136)
* feat: add isAggregator flag to validator configuration
Add support for configuring nodes as aggregators through validator-config.yaml.
This allows selective designation of nodes to perform aggregation duties by
setting isAggregator: true in the validator configuration.
Changes:
- Add isAggregator field (default: false) to all validators in both local and ansible configs
- Update parse-vc.sh to extract and export isAggregator flag
- Modify all client command scripts to pass --is-aggregator flag when enabled
- Add isAggregator status to node information output
* spin-node: add --subnets flag to deploy multiple nodes per client
Adds --subnets N (1–5) to deploy N nodes of each client on their
associated servers, each on a distinct attestation subnet.
New files:
- generate-subnet-config.py: expands validator-config.yaml into
validator-config-subnets-N.yaml with unique node names, incremented
ports (quic/metrics/api), fresh P2P private keys, and explicit subnet
membership per entry. Also sets config.attestation_committee_count = N
so each client correctly partitions validators across N committees.
Changes:
- parse-env.sh: add --subnets N and --dry-run flags
- spin-node.sh:
- expand validator-config before genesis setup when --subnets N given
- select one aggregator per subnet randomly; print prominent summary
- --dry-run: simulate full deployment without applying any changes
(Ansible runs with --check --diff, local execs are echoed only)
- run-ansible.sh: pass validator_config_basename extra var so playbooks
use the active (possibly expanded) config; add --check --diff in dry-run
- ansible/playbooks/deploy-nodes.yml: use validator_config_basename to
sync the correct config file to remote hosts
- ansible/playbooks/prepare.yml: open port ranges for all subnet nodes
on a host by matching entries via IP, not just hostname
- convert-validator-config.py: fall back to httpPort for Lantern nodes
when generating Leanpoint upstreams
- README.md: document --subnets and --dry-run; update --prepare firewall
table to reflect port ranges when --subnets N is active
Rules enforced by generate-subnet-config.py:
- No two nodes on the same server may share a subnet (template validated)
- Each subnet has exactly one node per client
- N=1 is a no-op expansion (single-subnet baseline)
- N capped at 5
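The expansion rules above can be sketched in a few lines of Python. This is a hypothetical simplification, not the real generate-subnet-config.py: the field names (`name`, `quicPort`, `metricsPort`, `apiPort`, `subnet`) follow the README's table, and the flat dict-per-node config shape is an assumption.

```python
# Hypothetical sketch of the --subnets expansion rules (not the real
# generate-subnet-config.py). Assumes each template node is a flat dict
# with name/quicPort/metricsPort/apiPort fields.
import copy

def expand_subnets(template_nodes, n):
    """Expand each template node into n entries, one per subnet."""
    if not 1 <= n <= 5:
        raise ValueError("--subnets N must be between 1 and 5")
    expanded = []
    for node in template_nodes:
        # Strip any numeric suffix to recover the client name.
        client = node["name"].rsplit("_", 1)[0] if "_" in node["name"] else node["name"]
        for subnet in range(n):
            entry = copy.deepcopy(node)
            entry["name"] = f"{client}_{subnet}"           # unique name per subnet
            for port_key in ("quicPort", "metricsPort", "apiPort"):
                entry[port_key] = node[port_key] + subnet  # ports incremented by subnet index
            entry["subnet"] = subnet                       # explicit subnet membership
            expanded.append(entry)
    return expanded

nodes = expand_subnets(
    [{"name": "zeam", "quicPort": 9001, "metricsPort": 8080, "apiPort": 5052}], 3
)
```

With N=3 this yields `zeam_0`/`zeam_1`/`zeam_2` on ports 9001/9002/9003, each tagged with its own `subnet` field, which satisfies the "one node per client per subnet" rule by construction.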
* ansible: copy only the node's own hash-sig keys to each server
Previously both deploy-nodes.yml and copy-genesis.yml synced the entire
hash-sig-keys/ directory to every remote host, meaning every server
received every validator's sk/pk pair.
Now each playbook:
1. Reads annotated_validators.yaml on the controller to look up the
privkey_file entries for the node being deployed (inventory_hostname).
2. Derives the pk filename by replacing _sk.ssz → _pk.ssz.
3. Copies only those specific files to the target host.
A server running zeam_0 (validator_0_sk.ssz / validator_0_pk.ssz) no
longer receives validator_1_sk.ssz, validator_2_sk.ssz, etc.
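The lookup-and-derive step can be sketched as follows. This is a minimal Python model of the playbook logic, assuming `annotated_validators.yaml` maps each node to its `privkey_file` entries (the key name `privkey_files` here is hypothetical):

```python
# Hypothetical sketch of the per-node key selection done by the playbooks.
# The real playbooks read annotated_validators.yaml on the controller;
# here it is modeled as a plain dict keyed by inventory_hostname.
def keys_for_node(annotated, node_name):
    """Return only the sk/pk key files belonging to one node."""
    files = []
    for sk in annotated[node_name]["privkey_files"]:
        files.append(sk)
        files.append(sk.replace("_sk.ssz", "_pk.ssz"))  # derive pk filename from sk
    return files

files = keys_for_node({"zeam_0": {"privkey_files": ["validator_0_sk.ssz"]}}, "zeam_0")
# files == ["validator_0_sk.ssz", "validator_0_pk.ssz"]
```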
* spin-node: assert exactly 1 aggregator per subnet after selection
* validator-config: add privkey for commented-out gean_0, lean_node_0, peam_0
* spin-node: derive subnet from config 'subnet' field, not node name suffix
The old suffix-based detection (ethlambda_1 → subnet 1) broke when a
config contained multiple nodes for the same client without --subnets
(e.g. ethlambda_0..4 for redundancy), incorrectly creating 5 subnets
and forcing ethlambda nodes as the sole aggregator on subnets 1-4.
Subnet membership is now read from the explicit 'subnet:' field that
generate-subnet-config.py writes for each entry. Nodes without this
field (all standard configs) default to subnet 0, so a single-subnet
deployment always selects exactly one aggregator from all active nodes
regardless of numeric suffixes in their names.
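The fixed grouping rule can be sketched as: read the explicit `subnet` field, defaulting to 0 when absent, and never parse the name suffix. A minimal Python sketch under that assumption:

```python
# Hypothetical sketch of the fixed subnet grouping: membership comes from
# the explicit 'subnet' field, defaulting to 0, never from the name suffix.
from collections import defaultdict

def group_by_subnet(nodes):
    groups = defaultdict(list)
    for node in nodes:
        groups[node.get("subnet", 0)].append(node["name"])
    return dict(groups)

# Five redundant ethlambda nodes without --subnets all land in subnet 0:
groups = group_by_subnet([{"name": f"ethlambda_{i}"} for i in range(5)])
# groups == {0: ["ethlambda_0", "ethlambda_1", "ethlambda_2", "ethlambda_3", "ethlambda_4"]}
```

Under the old suffix-based detection the same input would have produced five singleton subnets, which is exactly the bug described above.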
* docs: add client integration guide with link from README
* spin-node: honour pre-existing isAggregator: true when no --aggregator flag is passed
Previously the script always reset all flags and randomly re-selected an
aggregator, ignoring any manual isAggregator: true already set in the
YAML. This caused ethlambda_0 (user's choice) to be silently replaced by
ethlambda_1 (random pick).
Aggregator selection now follows a three-level priority:
1. --aggregator <node> CLI flag
2. Pre-existing isAggregator: true in the config (manual YAML edit)
3. Random selection (fallback when neither is set)
The preset node is validated against the active node list. If it no
longer exists a warning is printed and random selection takes over.
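The three-level priority and the stale-preset fallback can be sketched like this. The function shape and argument names are hypothetical; only the precedence order and the warn-then-randomize behaviour come from the change description:

```python
# Hypothetical sketch of the aggregator selection priority:
#   1. --aggregator CLI flag  2. preset isAggregator: true  3. random fallback
import random

def select_aggregator(active_names, cli_choice=None, preset=None):
    if cli_choice is not None:          # 1. --aggregator <node> always wins
        return cli_choice
    if preset in active_names:          # 2. honour a manual isAggregator: true
        return preset
    if preset is not None:              # preset no longer in the active list:
        print(f"warning: preset aggregator {preset} not found, selecting randomly")
    return random.choice(active_names)  # 3. random fallback

chosen = select_aggregator(["ethlambda_0", "ethlambda_1"], preset="ethlambda_0")
# chosen == "ethlambda_0"  (the manual choice is no longer silently replaced)
```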
* docs: clarify touch point 1 — both configs required, separate local/ansible examples
* docs: add note to contact zeam team for server IP assignment
* spin-node: fix associative array for bash 3.2 compatibility
* validator-config: use apiPort for lantern instead of httpPort
* fix: cadvisor deploy
* prepare: install jq alongside yq and docker
* fix: grandine address flag
* fix: grandine address flag ansible
* spin-node: skip aggregator selection when using --restart-client
* validator-config: enable gean_0 node
* run-ansible: derive inventory groups dynamically instead of hardcoding
The hardcoded group list (zeam_nodes, ream_nodes, ...) caused newly added
client types (e.g. gean_nodes) to never have their ansible_user updated.
This meant --useRoot was silently ignored for those nodes, causing Ansible
to SSH as the current local user (partha) instead of root, and fail.
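The fix amounts to deriving group names from the configured nodes instead of maintaining a hardcoded list. A hypothetical Python sketch of that idea (the `{client}_nodes` naming follows the group names mentioned above; the helper itself is illustrative):

```python
# Hypothetical sketch: derive ansible inventory group names from the node
# names in the config, so a new client type is never silently skipped.
def derive_groups(node_names):
    clients = {name.rsplit("_", 1)[0] for name in node_names}
    return sorted(f"{client}_nodes" for client in clients)

groups = derive_groups(["zeam_0", "ream_0", "gean_0", "gean_1"])
# groups == ["gean_nodes", "ream_nodes", "zeam_nodes"]
```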
* validator-config: add nlean_0 node
* ansible: add gean and nlean roles and wire into deploy
* docs: update adding-a-new-client guide with gean and nlean
* nlean: remove --pull=always for locally-built image
* nlean: use ghcr.io/nleaneth/nlean:latest as docker image
* fix: enable metrics flag for nlean
---------
Co-authored-by: Katya Ryazantseva <sibkatya@gmail.com>
- ✅ Configure to run clients in docker or binary mode for easy development
- ✅ Linux & Mac compatible & tested
- ✅ Option to operate on single or multiple nodes or `all`
```diff
@@ -212,10 +212,17 @@ Every Ansible deployment automatically deploys an observability stack alongside
 15. `--prepare` verify and install the software required to run lean nodes on every remote server, and open + persist the necessary firewall ports.
     - **Ansible mode only** — fails with an error if `deployment_mode` is not `ansible`
     - Installs: `python3` (Ansible requirement), Docker CE + Compose plugin (all clients run as containers), `yq` (required by the `common` role at every deploy)
-    - Opens per-node ports (`quicPort`/UDP, `metricsPort`/TCP, `apiPort`/TCP) read from `validator-config.yaml`, plus fixed observability ports (9090, 9080, 9098, 9100). Enables `ufw` with default deny incoming (persisted across reboots).
+    - Opens per-node ports (`quicPort`/UDP, `metricsPort`/TCP, `apiPort`/TCP) read from the active validator config, plus fixed observability ports (9090, 9080, 9098, 9100). With `--subnets N`, all N nodes' port ranges are opened per host. Enables `ufw` with default deny incoming (persisted across reboots).
     - Prints a per-tool, per-host status summary (`✅ ok` / `❌ missing`) and `ufw status verbose`
-    - `--node` is not required and is ignored; all other flags are also ignored except `--sshKey` and `--useRoot`
+    - `--node` is not required; passing unsupported flags alongside `--prepare` produces a prominent error — only `--sshKey` and `--useRoot` are accepted
+16. `--subnets N` expand the validator config to deploy N nodes of each client on the same server, where N is 1–5.
+    - Generates `validator-config-subnets-N.yaml` from the template (without modifying the original)
+    - Each subnet node gets a unique name (`{client}_0`, `{client}_1`, …), ports incremented by the subnet index, and a fresh P2P identity key for subnets > 0
+    - Subnet assignment rule: each server contributes **exactly one node per subnet** — nodes on the same server are never in the same subnet
+    - Every subnet contains the same set of client types
+    - `N=1` renames nodes to `{client}_0` with no port changes (useful for canonical naming)
+    - Only works in ansible mode (`deployment_mode: ansible` in your config, or `--deploymentMode ansible`)
-    Any other flags (e.g. `--node`, `--generateGenesis`) are silently ignored — only `--sshKey` and `--useRoot` are used
+    Passing unsupported flags (e.g. `--node`, `--generateGenesis`) alongside `--prepare` produces a prominent error — only `--sshKey` and `--useRoot` are accepted
     - `--node` is not required; the playbook runs on all remote hosts in the inventory
```
````diff
@@ -246,6 +253,43 @@ Once preparation succeeds, proceed with the normal deploy command:
 NETWORK_DIR=ansible-devnet ./spin-node.sh --node all --generateGenesis --sshKey ~/.ssh/id_ed25519 --useRoot
 ```
+
+### Deploying multiple subnets
+
+Use `--subnets N` to run N independent copies of each client on the same server. This is useful for testing multi-subnet P2P scenarios without provisioning additional machines.
+
+```sh
+# Deploy 3 subnets of every client (ansible)
+NETWORK_DIR=ansible-devnet ./spin-node.sh --node all --subnets 3 \
+
+`--subnets N` generates `validator-config-subnets-N.yaml` from the template (the original file is never modified). For each client in the template it creates N entries:
+
+| Subnet index | Name | quicPort | metricsPort | apiPort |
 Checkpoint sync lets you restart clients by syncing from a remote checkpoint instead of from genesis. This is useful for joining an existing network (e.g., leanpoint mainnet) without replaying the full chain.
 > **Note:** All clients accept `--checkpoint-sync-url`. Client implementations may use different parameter names internally; update client-cmd scripts if parameters change.
````
284
328
```diff
@@ -293,9 +337,13 @@ Current following clients are supported:
 5. Lighthouse
 6. Grandine
 7. Ethlambda
-8. Peam
+8. Gean
+9. Nlean
+10. Peam
+
+Adding a new client requires 6 small, well-defined steps. See the full integration guide:
-However adding a lean client to this setup is very easy. Feel free to do the PR or reach out to the maintainers.
+📖 **[Adding a New Client](docs/adding-a-new-client.md)**
```
```diff
 │   ├── copy-genesis.yml        # Copy genesis files to remote hosts
@@ -846,13 +894,13 @@ The command runs `ansible/playbooks/prepare.yml` against all remote hosts in the
 **Firewall rules opened (via `ufw`):**
-Each host's ports are read directly from `validator-config.yaml`, so only the ports actually configured for that node are opened:
+Ports are read from the active validator config (the `--subnets`-expanded file when `--subnets N` is used, or `validator-config.yaml` otherwise). Entries are matched by IP address, so all N subnet nodes on a server are found and all their ports are opened:
 | Port | Protocol | Source |
 |---|---|---|
-| `quicPort` | UDP | Per-node — QUIC/P2P transport (e.g. 9001) |
```