Skip to content

fix: do not subscribe to subnets when not aggregating#265

Merged
pablodeymo merged 2 commits intomainfrom
fix/non-aggregator-attestation-subscription
Apr 8, 2026
Merged

fix: do not subscribe to subnets when not aggregating#265
pablodeymo merged 2 commits intomainfrom
fix/non-aggregator-attestation-subscription

Conversation

@MegaRedHand
Copy link
Copy Markdown
Collaborator

@MegaRedHand MegaRedHand commented Apr 8, 2026

This PR removes subnet subscribing when we're not an aggregator.

It was already fixed in #160, but was re-introduced in #249

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

🤖 Kimi Code Review

The PR correctly implements the distinction between aggregator and non-aggregator subscription behavior per Ethereum consensus networking specs.

Assessment: Correct and idiomatic

Highlights

  1. Logic correctness (lines 212–228): Restricting attestation subnet subscriptions to aggregators only is the correct behavior. Non-aggregators publish via gossipsub fanout and don't need to receive unaggregated attestations, reducing unnecessary mesh traffic.

  2. Topic cache construction (lines 233–237): Building the attestation_topics cache for all validators (regardless of aggregator status) ensures non-aggregators can still publish to their assigned subnets without per-publish allocation. The use of entry().or_insert_with() is idiomatic and efficient.

  3. Metrics preservation: The metric calculation using attestation_topics.keys() maintains backward compatibility while correctly reflecting the validator set's subnet distribution.

Pre-existing issue (not blocking)

Line 214, 234: Division by zero risk if config.attestation_committee_count is 0:

let subnet_id = vid % config.attestation_committee_count;

While this is a protocol invariant that should never be 0 in practice, consider defensively validating this at config load time or using checked_rem to avoid panic. This issue predates the PR.

Minor suggestion

Line 219: The aggregate_subnets HashSet could avoid allocation when config.aggregate_subnet_ids is None and the validator set is empty (edge case), but the current clarity is preferable given this is one-time initialization code.

Conclusion: Ship it. The change correctly reduces subscription overhead for non-aggregator validators while maintaining all necessary publishing capabilities.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 8, 2026

Greptile Summary

This PR fixes unnecessary attestation-subnet gossipsub mesh memberships: non-aggregating nodes no longer subscribe to attestation subnets and instead rely on gossipsub fanout for publishing. The topic cache (attestation_topics) is correctly separated into its own loop over validator_ids.

  • P1: attestation_topics no longer includes topics for aggregate_subnet_ids. For aggregators where aggregate_subnet_ids contains subnets not covered by their own validators, any downstream publish path that looks up a topic by those subnet IDs will find no entry.

Confidence Score: 3/5

Mostly safe, but the P1 finding about attestation_topics missing aggregate_subnet_ids subnets should be confirmed before merging.

The core logic of restricting subscriptions to aggregators is sound and well-commented. Score is 3 rather than 4 because of the unconfirmed behavioral change where aggregate_subnet_ids subnets are no longer added to attestation_topics, which could silently break publish paths for aggregators using explicit subnet IDs.

crates/net/p2p/src/lib.rs — specifically the attestation_topics building loop (lines 231–238) and any gossipsub publish code that indexes into this map by subnet ID.

Vulnerabilities

No security concerns identified. The change only affects gossipsub subscription and topic-cache building logic, with no authentication, authorization, or data-exposure implications.

Important Files Changed

Filename Overview
crates/net/p2p/src/lib.rs Refactors attestation subnet subscriptions so only aggregators join the gossipsub mesh; non-aggregators now rely on fanout publishing. Topic cache building is separated from subscription logic and correctly covers all validator-owned subnets.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[build_swarm called] --> B[Subscribe: block_topic - all nodes]
    B --> C[Subscribe: aggregation_topic - all nodes]
    C --> D{is_aggregator?}
    D -- yes --> E[Build aggregate_subnets from validator_ids + aggregate_subnet_ids]
    E --> F{aggregate_subnets empty?}
    F -- yes --> G[Fallback: add subnet 0]
    F -- no --> H[Subscribe to each subnet in gossipsub mesh]
    G --> H
    D -- no --> I[Skip subnet subscriptions - fanout only]
    H --> J[Build attestation_topics cache from validator_ids only]
    I --> J
    J --> K[Set metric_subnet from attestation_topics]
    K --> L[Return BuiltSwarm]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/net/p2p/src/lib.rs
Line: 231-238

Comment:
**`attestation_topics` no longer covers `aggregate_subnet_ids`**

In the old code, the topic cache was built from `subscribe_subnets`, which for aggregators included both validator-derived subnets and any `aggregate_subnet_ids`. The new code builds it only from `config.validator_ids`. This is a silent behavioral change for aggregators that have `aggregate_subnet_ids` outside their own validators' subnets — those subnet topics will be in the gossipsub subscription set but absent from `attestation_topics`.

If any downstream publish path (e.g. `publish_attestation`) looks up a topic by subnet ID from `attestation_topics` using a subnet that only appears in `aggregate_subnet_ids`, it will find no entry and either silently drop the publish or panic. Worth confirming this path is unreachable, or adding the aggregator's subscribed subnets to the cache as well.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/net/p2p/src/lib.rs
Line: 208-229

Comment:
**Fanout expiry risk for infrequently-publishing validators**

The comment states that non-aggregators "publish via gossipsub fanout." Fanout state in libp2p gossipsub is maintained per-topic and evicted after `fanout_ttl` (60 s here). If a non-aggregating validator restarts or is idle for more than 60 s before its first attestation, the node has no fanout peers for the subnet topic and the first `publish()` call may silently succeed locally but reach zero peers.

Since attestations arrive every slot (4 s) this is only a concern at startup or after a long idle period. Consider logging a warning when the first publish occurs while the node is not subscribed and has no fanout peers, or documenting the startup latency expectation.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: do not subscribe to subnets when no..." | Re-trigger Greptile

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

🤖 Codex Code Review

Findings

  1. High: this change weakens attestation gossip propagation by removing validator participation from the subnet mesh. In lib.rs, non-aggregators stop subscribing to attestation subnets, and in handler.rs their publishes now rely on gossipsub fanout only. That means validator-only nodes no longer maintain mesh peers or relay subnet traffic; they send to an ephemeral fanout set (fanout_ttl is 60s) instead of a stable mesh. For attestation subnets, that is a real correctness/performance regression risk: with few aggregators, unaggregated attestations become much easier to miss, which directly hurts aggregation quality and downstream fork-choice inputs. I would keep validator subnet subscriptions unless there is e2e evidence that fanout-only publication preserves delivery under realistic validator-heavy topologies.

  2. Low: the attestation subnet metric no longer reflects actual subscriptions for aggregator-only nodes. In lib.rs, metric_subnet is derived from attestation_topics, which now only contains validator publish topics. If aggregate_subnet_ids is set and validator_ids is empty, the node may subscribe to one or more explicit subnets but still report 0. That makes subnet/debug telemetry misleading. If this metric is intended to describe the subscribed subnet set, compute it from aggregate_subnets when is_aggregator is true.

No other material security, memory-safety, or SSZ/state-transition issues are touched by this diff.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

🤖 Claude Code Review

Here is my review:


Review: fix: do not subscribe to subnets when not aggregating

Overall: The change is correct and well-motivated. The logic is properly decoupled and the comments explain the design clearly.


Correctness

Gossipsub fanout for publishing (lines 208–229): The core assumption — that non-aggregators can publish via gossipsub fanout without subscribing — is valid. Gossipsub maintains a fanout peer list per topic and routes publications through it. The fallback in publish_attestation (handler.rs:143, unwrap_or_else(|| attestation_subnet_topic(subnet_id))) is now exercised for non-aggregators, but attestation_topics is still populated for all validator subnets (lines 232–238), so get(&subnet_id) will hit the cache in the normal case.

attestation_topics scope change: Previously this map included explicit_ids subnets for aggregators; now it only covers validator_ids-derived subnets. This is more correct: attestation_topics is used for publishing (per your own validators), while explicit_ids are for receiving attestations from others. No regression here.

Metric computation (line 240): attestation_topics.keys().copied().min().unwrap_or(0) produces the same minimum subnet as before. Behaviour is unchanged.


Potential Concerns

Fanout TTL expiry for non-aggregators: fanout_ttl is configured at 60 seconds (lib.rs:111). If a non-aggregator's validators are assigned slots infrequently (e.g. small validator sets, long gaps between proposals), the fanout peer list for attestation subnets will expire between attestation publishes. On expiry, libp2p gossipsub re-selects fanout peers from its peer store. On a sparse or recently-started network, this peer store may be thin for the attestation subnet (since most nodes no longer subscribe to it), which could degrade attestation propagation. This isn't a bug in the PR, but worth noting as a network-health consideration at scale.

Division by zero is pre-existing: vid % config.attestation_committee_count (line 215, 234) panics if attestation_committee_count == 0. Not introduced here but worth tracking.


Minor

  • No log/trace for the "skipping subnet subscription" case when !config.is_aggregator. A single debug! at startup (e.g. "Not an aggregator, skipping attestation subnet subscriptions") would help operators confirm node mode. Not required.

  • or_insert_with (lines 235–237) correctly avoids building duplicate IdentTopic objects for validators sharing a subnet. Idiomatic.


Summary

The PR correctly separates the subscription path (aggregators only) from the topic cache (all validators), and the gossipsub fanout assumption holds. The one operational concern worth tracking is fanout TTL behaviour on sparse networks, but this does not block merging.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@pablodeymo pablodeymo merged commit ed2712a into main Apr 8, 2026
2 checks passed
@pablodeymo pablodeymo deleted the fix/non-aggregator-attestation-subscription branch April 8, 2026 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants