Skip to content

synth: hash post-read_design_sources state to isolate slang non-determinism#4236

Closed
oharboe wants to merge 2 commits into
The-OpenROAD-Project:masterfrom
oharboe:post-slang-state-hash
Closed

synth: hash post-read_design_sources state to isolate slang non-determinism#4236
oharboe wants to merge 2 commits into
The-OpenROAD-Project:masterfrom
oharboe:post-slang-state-hash

Conversation

@oharboe

@oharboe oharboe commented May 18, 2026

Copy link
Copy Markdown
Collaborator

Summary

yosys-slang was recently found to be non-idempotent in a few cases:
re-running synth on the same RTL produces different post-frontend
RTLIL. The two file-based hashes already on master only bracket the
whole synthesis pipeline:

  • `synth__canonical_netlist__hash` — sha1 of `1_1_yosys_canonicalize.rtlil`,
    captured after `read_design_sources` → `hierarchy -check` →
    `opt_clean -purge`.
  • `synth__netlist__hash` — sha1 of `1_2_yosys.v`, captured after ABC.

So a slang-induced diff collapses into `canonical_netlist__hash`
alongside hierarchy / opt_clean drift, with no way to tell them apart.

This PR adds one more hash, `synth__post_read_sources__hash`, captured
immediately after `read_design_sources` returns — i.e. the state
straight out of the HDL frontend (slang / verific / builtin
`read_verilog`), before any other pass runs. An unstable hash here
means the frontend itself is non-idempotent, distinct from drift
introduced later.

Mechanics

  • `flow/scripts/synth_preamble.tcl` gains a `write_state_hash`
    proc that strips `src` attributes (file:line metadata — so the hash
    is path-independent across the bazel sandbox and the classic-make
    build), dumps the current RTLIL to a temp file under
    `$OBJECTS_DIR`, sha1's it, deletes the temp, and emits a
    `: ` line to the yosys log.
  • `flow/scripts/synth_canonicalize.tcl` calls
    `write_state_hash synth__post_read_sources__hash` immediately after
    `read_design_sources`.
  • `flow/util/genMetrics.py` extracts the new hash from
    `1_1_yosys_canonicalize.log` into `metadata.json`.
  • `flow/util/genRuleFile.py` declares a `level=warning`,
    `compare="=="` literal rule for the new metric so a downstream
    `rules-base.json` can pin it without failing the build on a mismatch
    (matching how `canonical_netlist__hash` and `netlist__hash` are
    declared).

No design's `rules-base.json` gets a baseline value pinned in this PR;
declaring the rule type just makes the metric available to any future
per-design pin.

Pre-ABC / post-hierarchy / post-synth-main hashes are intentionally
left out — follow-up PRs if the slang-only signal isn't enough.

Test plan

  • CI Tclint + lint (black) — both already exercised locally on the
    branch and pass.
  • Manual: `make DESIGN_CONFIG=./designs/asap7/uart/config.mk` (or
    equivalent bazel target), then `jq .synth__post_read_sources__hash
    reports/asap7/uart/base/metadata.json` — the field should contain
    a 40-char sha1. Re-run on the same input twice: same sha1 if the
    frontend is idempotent for that design, different sha1 if not —
    i.e. the exact distinction this PR exists to surface.

🤖 Generated with Claude Code

@oharboe

oharboe commented May 18, 2026

Copy link
Copy Markdown
Collaborator Author

This doesn't add much synthesis time (one extra write_rtlil of the same RTLIL we'd otherwise be writing 15 lines later in synth_canonicalize.tcl, plus a sha1sum of it) and for now we need this information to track yosys/slang idempotency.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to fingerprint the design state immediately after the HDL frontend stage to detect non-idempotent behavior. It adds a write_state_hash procedure in the synthesis preamble and updates metric extraction and rule generation scripts to track this new hash. Feedback was provided to use design -push and design -pop within the hashing procedure to prevent the permanent removal of source-level debug attributes, which are currently stripped to ensure path-independent hashes.

Comment thread flow/scripts/synth_preamble.tcl
@oharboe oharboe force-pushed the post-slang-state-hash branch from 850b2f3 to 18499f9 Compare May 18, 2026 05:00
…erminism

yosys-slang was recently found to be non-idempotent in a few cases:
re-running synth on the same RTL produces different post-frontend
RTLIL.  Master already has two file-based hashes that bracket the
whole synthesis pipeline:

  synth__canonical_netlist__hash   sha1 of `1_1_yosys_canonicalize.rtlil`
                                   (captured AFTER `read_design_sources`
                                   -> `hierarchy -check` -> `opt_clean
                                   -purge`).
  synth__netlist__hash             sha1 of `1_2_yosys.v`
                                   (captured AFTER ABC).

A slang-induced diff therefore collapses into `canonical_netlist__hash`
alongside any hierarchy / opt_clean drift, with no way to tell them
apart.

Add one more hash, `synth__post_read_sources__hash`, captured
immediately after `read_design_sources` returns -- i.e. the state
straight out of the HDL frontend, before any other pass runs.  An
unstable hash here means the frontend itself is non-idempotent,
distinct from drift introduced later.

Mechanics:

  * `flow/scripts/synth_preamble.tcl` gains a `write_state_hash`
    proc that strips `src` attributes (file:line metadata, so the
    hash is path-independent across the bazel sandbox and the
    classic-make build), dumps the current RTLIL to a temp file
    under `$OBJECTS_DIR`, sha1's it, deletes the temp, and emits
    a `<metric>: <sha>` line to the yosys log.
  * `flow/scripts/synth_canonicalize.tcl` calls
    `write_state_hash synth__post_read_sources__hash` immediately
    after `read_design_sources`.
  * `flow/util/genMetrics.py` extracts the new hash from
    `1_1_yosys_canonicalize.log` into `metadata.json`.
  * `flow/util/genRuleFile.py` declares a `level=warning`,
    `compare="=="` literal rule for the new metric so a downstream
    `rules-base.json` can pin it without failing the build on a
    mismatch (matching how `canonical_netlist__hash` and
    `netlist__hash` are declared).

No design's `rules-base.json` gets a baseline value pinned in this
PR; declaring the rule type just makes the metric available to any
future per-design pin.

Pre-ABC / post-hierarchy / post-synth-main hashes are intentionally
left out -- they belong in follow-up PRs if the slang-only signal
isn't enough.

Signed-off-by: Øyvind Harboe <oyvind@ascenium.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
@oharboe oharboe force-pushed the post-slang-state-hash branch from 18499f9 to fffb07a Compare May 18, 2026 05:02
@oharboe

oharboe commented May 18, 2026

Copy link
Copy Markdown
Collaborator Author

/gemini review

@oharboe oharboe requested a review from maliberty May 18, 2026 05:27

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a fingerprinting mechanism to capture the design state immediately after the HDL frontend stage, enabling the detection of non-idempotent behavior. It introduces a write_state_hash procedure to strip path-dependent attributes and generate a SHA1 hash, which is then integrated into the metrics extraction and rule validation scripts. Feedback indicates that the attribute stripping should use a global selection to ensure all modules are processed and that the log extraction regex should be adjusted to handle potential leading whitespace.

Comment thread flow/scripts/synth_preamble.tcl Outdated
Comment thread flow/util/genMetrics.py Outdated
Two follow-ups from gemini's second review pass on The-OpenROAD-Project#4236:

  * `setattr -unset src *` in `write_state_hash` only strips src
    attrs from objects in the *current* module.  `write_state_hash`
    runs before `hierarchy -check -top` in synth_canonicalize.tcl,
    so the post-frontend design has many separate modules whose
    src attrs (= file:line metadata) survive the strip and end up
    in the hashed RTLIL, breaking path-independence between the
    bazel sandbox and the classic-make build.  Use `*/*` which
    selects across all modules.

  * The genMetrics.py regex for the new hash was anchored to a
    bare `^`, while other extractors in the same file allow optional
    leading whitespace.  Add `\s*` after `^` so the extraction is
    robust to any log wrapper that inserts indentation.

Signed-off-by: Øyvind Harboe <oyvind@ascenium.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
@oharboe

oharboe commented May 18, 2026

Copy link
Copy Markdown
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements design state fingerprinting immediately after the HDL frontend to detect non-idempotency in the synthesis pipeline. It introduces a write_state_hash procedure in synth_preamble.tcl and updates the metrics and rule generation scripts to track the new synth__post_read_sources__hash metric. Feedback was provided to enhance the write_state_hash procedure by adding error handling to protect the design stack and stripping additional attributes to ensure path-independence.

Comment on lines +21 to +36
design -push
# `*/*` strips src attrs across objects in *all* modules. Bare `*`
# only targets the current module's objects, which is fine after
# `hierarchy -check -top` collapses things to one design, but
# `write_state_hash` is called before `hierarchy` here, so the
# post-frontend state may have many separate modules whose src
# attrs would otherwise survive into the hashed RTLIL and break
# path-independence.
setattr -unset src */*
setattr -mod -unset src *
set tmp $::env(OBJECTS_DIR)/.${metric}.tmp.rtlil
write_rtlil $tmp
design -pop
set sha [lindex [split [exec sha1sum $tmp]] 0]
file delete $tmp
puts "${metric}: $sha"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The write_state_hash procedure lacks error handling, which could leave the Yosys design stack in an inconsistent state (due to a missing design -pop) if an error occurs during attribute stripping, file writing, or hashing (e.g., if sha1sum is missing or the disk is full). Wrapping the core logic in a catch block ensures that the design state is restored and the temporary file is cleaned up. Additionally, stripping the orig_src and syn_file attributes is recommended to ensure path-independence, as some frontends (like Verific) use them to store absolute paths.

  design -push
  set tmp $::env(OBJECTS_DIR)/.${metric}.tmp.rtlil
  if {[catch {
    foreach attr {src orig_src syn_file} {
      setattr -unset $attr */*
      setattr -mod -unset $attr *
    }
    write_rtlil $tmp
    set sha [lindex [exec sha1sum $tmp] 0]
    puts "${metric}: $sha"
  } err]} {
    puts "Warning: ${metric} calculation failed: $err"
  }
  design -pop
  file delete -force $tmp

@oharboe

oharboe commented May 18, 2026

Copy link
Copy Markdown
Collaborator Author

@povik @maliberty I tried a couple of times to get Claude to fix the Gemini nits, but I don't trust myself, nor Gemini when it comes to yosys scripts... The Claude generated code I tested, so at least I know it worked on my machine.

My brain refuses to learn Yosys selection DSL syntax...

This PR was intended to issue a warning if there's an idempotency in slang, verific or verilog canonicalization, which I think could be helpful.

If this PR is useful, feel free to re-open and finish it.

@oharboe oharboe closed this May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant