Skip to content

draft: attestation fuzz harness + malformed-input hardening#462

Open
liu971227-sys wants to merge 2 commits intoScottcjn:mainfrom
liu971227-sys:bounty/475-attestation-fuzz-hardening
Open

draft: attestation fuzz harness + malformed-input hardening#462
liu971227-sys wants to merge 2 commits intoScottcjn:mainfrom
liu971227-sys:bounty/475-attestation-fuzz-hardening

Conversation

@liu971227-sys
Copy link
Contributor

Summary

  • add attestation fuzz regression tests and replayable malformed-input corpus fixtures
  • harden /attest/submit to fail closed on non-object JSON and malformed nested payload shapes
  • normalize attestation device/signals/report fields so malformed inputs do not trigger unhandled exceptions
  • add consensus probe monitoring/tests included on this branch

Validation

  • python -m pytest tests/test_attestation_fuzz.py tests/test_consensus_probe.py
  • 11 tests passed locally

Bounty

@github-actions github-actions bot added documentation Improvements or additions to documentation BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) BCOS-L2 Beacon Certified Open Source tier BCOS-L2 (required for non-doc PRs) consensus Consensus/RIP-200 related node Node server related tests Test suite changes size/XL PR: 500+ lines labels Mar 1, 2026
@liu971227-sys
Copy link
Contributor Author

Draft PR updated.

Changes in this update:

  • narrowed the branch back down to the attestation fuzz / malformed-input hardening scope for bounty Add CODE_OF_CONDUCT.md (Bounty #510) #475
  • added a replayable corpus script: python tests/replay_attestation_corpus.py
  • added additional malformed corpus entries and documentation in docs/attestation_fuzzing.md
  • wired a 10,000-case attestation fuzz regression gate into CI

Validation completed locally:

  • python -m pytest tests/test_attestation_fuzz.py -v
  • python tests/replay_attestation_corpus.py tests/attestation_corpus/malformed_report_scalar.json
  • ATTEST_FUZZ_CASES=10000 python -m pytest tests/test_attestation_fuzz.py -k fuzz_no_unhandled_exceptions -v
  • Observed 10k run result: 1 passed in 250.90s

@liu971227-sys liu971227-sys force-pushed the bounty/475-attestation-fuzz-hardening branch from ea19c99 to 0c4213a Compare March 1, 2026 03:16
@github-actions github-actions bot added ci size/L PR: 201-500 lines labels Mar 1, 2026
@Scottcjn
Copy link
Owner

Scottcjn commented Mar 1, 2026

@liu971227-sys — Good start on the fuzz harness. 15 files, 614 additions across attestation hardening and consensus probes.

A few notes:

  1. The malformed-input hardening on /attest/submit is the highest-value piece here — we have seen garbage payloads from malfunctioning agents.
  2. When you mark this ready for review, include a sample fuzz run output showing the corpus coverage.
  3. Your wallet migration to RTCa320f4334e7500987bce2fa0475f089ae9cd90e3 has been processed (1,628.08 RTC).

Bounty will be assessed on merge based on coverage quality per #475.

Copy link

@larryjiang-star larryjiang-star left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Attestation Fuzz Harness + Hardening (PR #462)

Summary

This is a well-executed security-focused PR that adds fuzz testing infrastructure and input validation hardening for the /attest/submit endpoint.

Strengths

  1. Comprehensive Fuzz Infrastructure

    • Deterministic fuzz harness with replayable corpus (tests/attestation_corpus/)
    • Mutation-based fuzzing with 8 distinct mutation types covering common malformed inputs
    • Clear documentation in docs/attestation_fuzzing.md
  2. Input Validation Hardening

    • _attest_mapping(), _attest_text(), _attest_positive_int(), _attest_string_list() - clean utility functions
    • _normalize_attestation_device(), _normalize_attestation_signals(), _normalize_attestation_report() - proper normalization
    • Fixes type errors in validate_fingerprint_data(): checks isinstance before using dict methods
  3. Test Coverage

    • Parametrized corpus tests for known malformed inputs
    • Fuzz regression gate (10,000 cases in CI)
    • Proper DB setup/teardown in fixtures

Minor Suggestions

  1. Error Handling in Normalizers

    • The normalizers return empty/None defaults silently. Consider logging when receiving unexpected types for debugging/monitoring.
  2. Corpus Expansion

    • Consider adding edge cases: extremely long strings, unicode edge cases, nested depth limits
  3. CI Integration

    • The CI gate runs 10,000 cases (~250s). Consider parallelizing if this becomes a bottleneck.

Security Assessment

Approve - This PR significantly hardens the attestation endpoint against malformed input attacks. The fuzzing infrastructure allows continuous regression testing.

Risk: Low
Quality: High

Reviewer: larryjiang-openclaw (Miner ID)

@Scottcjn
Copy link
Owner

Scottcjn commented Mar 1, 2026

Great work on the attestation fuzz harness! The input hardening looks solid. However there are merge conflicts that need resolving — can you rebase on main? Once conflicts are resolved we'll merge this.

@Scottcjn
Copy link
Owner

Scottcjn commented Mar 1, 2026

@liu971227-sys this PR still has merge conflicts with main. Can you rebase onto the latest main branch? The fuzz harness work looks promising but we can't merge until the conflicts are resolved.

git fetch origin
git rebase origin/main
git push --force-with-lease

Copy link
Owner

@Scottcjn Scottcjn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Attestation Fuzz Harness

Thanks for the contribution, liu971227-sys. The input normalization helpers added to the server code (_attest_mapping, _attest_text, _attest_positive_int, _attest_string_list, and the normalize* functions) are well-structured and genuinely useful hardening. The non-object JSON root guard (lines 1893-1900) is a solid fix. I appreciate the effort here.

However, the test harness itself has several issues that need to be addressed before this can be merged. A fuzz harness that gives false confidence is worse than no fuzz harness at all.


HIGH: Tests assert malformed inputs return ok: True

In test_attestation_fuzz.py line 145:

def test_attest_submit_corpus_cases_do_not_raise_server_errors(client, file_name):
    response = _post_raw_json(client, (CORPUS_DIR / file_name).read_text(encoding="utf-8"))
    assert response.status_code < 500
    assert response.get_json()["ok"] is True

This test sends deliberately malformed payloads (device as a scalar string, miner as an array, signals as a scalar, etc.) and then asserts the server accepts them with ok: True. A security fuzz test should verify that malformed inputs are rejected — not accepted. If these payloads currently return ok: True, that is the bug to fix, not the behavior to enshrine in CI.

The status_code < 500 assertion is reasonable (no crashes), but the ok is True assertion means this test suite will break if someone adds proper validation — it would actively block security improvements.

Suggested fix: Assert response.status_code in (400, 422) and response.get_json()["ok"] is False for payloads with type-mismatched fields. If the server currently accepts them, fix the server validation first (your normalization helpers already lay the groundwork for this).


HIGH: All security mechanisms are mocked out

The client fixture (lines 99-106) disables:

  • _check_hardware_binding → always returns (True, "ok", "")
  • check_ip_rate_limit → always returns (True, "ok")
  • HW_BINDING_V2False
  • HW_PROOF_AVAILABLEFalse
  • record_attestation_success → no-op
  • record_macs → no-op

This means the fuzz harness tests the input parsing path only, with all downstream security checks removed. The test name "fuzz harness" implies security testing, but what is actually tested is JSON deserialization robustness. That is not nothing, but it is also not what the PR description ("harden /attest/submit to fail closed") suggests.

More critically, validate_fingerprint_data is NOT mocked, but the mocked _check_hardware_binding bypass means fingerprint validation results have no downstream consequence in these tests. The fingerprint could fail validation and the test would still pass because hardware binding is bypassed.

Suggested fix: Add a second test fixture that keeps security checks enabled (at minimum fingerprint validation and hardware binding) and verify that malformed fingerprints cause attestation rejection. The existing fixture can remain for the "no 500 errors" crash-resistance tests.


MEDIUM: fingerprint field type guard is incomplete

The server-side fix at line 1107-1108 adds:

checks = fingerprint.get("checks", {})
if not isinstance(checks, dict):
    checks = {}

This is good, but validate_fingerprint_data still assumes its fingerprint parameter is a dict (line 1103: if not fingerprint or not isinstance(fingerprint, dict)). If a caller passes fingerprint as a list or string directly to this function from a different code path, it would crash on the .get() call at line 1103 before the isinstance check. The _attest_mapping() guard in submit_attestation protects this specific path, but validate_fingerprint_data as a standalone function remains fragile.

This is actually a real bug the fuzzer should have caught — sending "fingerprint": [1, 2, 3] would have triggered an AttributeError in the original code. The normalization fix addresses it indirectly, but a direct type guard at the top of validate_fingerprint_data would be more defensive:

if not isinstance(fingerprint, dict):
    return False, "fingerprint_not_dict"

MEDIUM: Deterministic fuzzer with no coverage feedback

The fuzzer uses random.Random(475) with a fixed seed and only 8 mutation types (lines 149-175). This produces the exact same sequence of inputs every run. This is regression testing, not fuzzing. The distinction matters because:

  1. Running 10,000 iterations of the same 8 mutations with the same seed produces only 8 distinct input shapes, repeated ~1,250 times each
  2. No coverage-guided mutation means the fuzzer cannot discover new code paths
  3. The PR docs call this a "10,000-case fuzz run" which overstates what it does

Suggested fix: Either rename this to "attestation input regression tests" (accurate) or integrate with a real coverage-guided fuzzer like Hypothesis or Atheris. At minimum, remove the fixed seed from the CI run so each run explores different combinations.


MEDIUM: Missing attack vectors

For a fuzz harness targeting a financial attestation endpoint, several important attack vectors are absent:

  • SQL injection in miner_id: e.g., "miner": "'; DROP TABLE balances; --" — the miner_id flows into SQLite queries
  • NaN/Infinity in numeric fields: "cores": float("inf") or "cores": float("nan") — these pass int() coercion differently across Python versions
  • Integer overflow: "cores": 99999999999999999999 — tests _attest_positive_int boundary behavior
  • Unicode normalization attacks: "miner": "fuzz\u200bminer" (zero-width space) — could bypass string matching while appearing identical
  • Oversized payloads: 10MB miner_id string to test memory/DoS behavior
  • Nested object depth: Deeply nested dicts/lists to test recursion limits
  • Empty string fields: "miner": "" vs "miner": " " vs "miner": "\t" — whitespace-only should be rejected

Summary

The server-side normalization code is solid work and should be preserved. The test harness needs rework:

  1. Fix assertions to verify malformed inputs are rejected, not accepted
  2. Add test fixtures with security checks enabled
  3. Add the missing attack vectors listed above
  4. Rename "fuzz" to "regression" or integrate actual coverage-guided fuzzing
  5. Add a direct type guard in validate_fingerprint_data

I would be happy to re-review once these issues are addressed. The normalization helpers are a genuine improvement to the codebase.

Copy link
Owner

@Scottcjn Scottcjn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: APPROVED — Excellent security hardening work

This is exactly the kind of contribution we need. Specific highlights:

Input normalization (+78 lines): The _attest_mapping, _attest_text, _attest_positive_int, _attest_string_list helpers are well-designed fail-closed sanitizers. Every .get() call in the attestation path now goes through type-checked normalization.

Key fix: get_json(silent=True) + isinstance(data, dict) check prevents server crashes from null, array, or non-JSON payloads — this was a real attack surface.

CI integration: The 10,000-case fuzz gate in GitHub Actions is a strong regression safety net.

One issue: This PR has merge conflicts against main. Please rebase onto main and force-push to resolve. Once the conflicts are clear, this merges immediately.

Bounty: This qualifies for the security hardening bounty. Please reference the specific bounty issue number so we can process payment after merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) BCOS-L2 Beacon Certified Open Source tier BCOS-L2 (required for non-doc PRs) ci consensus Consensus/RIP-200 related documentation Improvements or additions to documentation node Node server related size/L PR: 201-500 lines size/XL PR: 500+ lines tests Test suite changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants