synth: hash post-read_design_sources state to isolate slang non-determinism#4236
synth: hash post-read_design_sources state to isolate slang non-determinism#4236oharboe wants to merge 2 commits into
read_design_sources state to isolate slang non-determinism#4236Conversation
|
This doesn't add much synthesis time (one extra |
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to fingerprint the design state immediately after the HDL frontend stage to detect non-idempotent behavior. It adds a write_state_hash procedure in the synthesis preamble and updates metric extraction and rule generation scripts to track this new hash. Feedback was provided to use design -push and design -pop within the hashing procedure to prevent the permanent removal of source-level debug attributes, which are currently stripped to ensure path-independent hashes.
850b2f3 to
18499f9
Compare
…erminism
yosys-slang was recently found to be non-idempotent in a few cases:
re-running synth on the same RTL produces different post-frontend
RTLIL. Master already has two file-based hashes that bracket the
whole synthesis pipeline:
synth__canonical_netlist__hash sha1 of `1_1_yosys_canonicalize.rtlil`
(captured AFTER `read_design_sources`
-> `hierarchy -check` -> `opt_clean
-purge`).
synth__netlist__hash sha1 of `1_2_yosys.v`
(captured AFTER ABC).
A slang-induced diff therefore collapses into `canonical_netlist__hash`
alongside any hierarchy / opt_clean drift, with no way to tell them
apart.
Add one more hash, `synth__post_read_sources__hash`, captured
immediately after `read_design_sources` returns -- i.e. the state
straight out of the HDL frontend, before any other pass runs. An
unstable hash here means the frontend itself is non-idempotent,
distinct from drift introduced later.
Mechanics:
* `flow/scripts/synth_preamble.tcl` gains a `write_state_hash`
proc that strips `src` attributes (file:line metadata, so the
hash is path-independent across the bazel sandbox and the
classic-make build), dumps the current RTLIL to a temp file
under `$OBJECTS_DIR`, sha1's it, deletes the temp, and emits
a `<metric>: <sha>` line to the yosys log.
* `flow/scripts/synth_canonicalize.tcl` calls
`write_state_hash synth__post_read_sources__hash` immediately
after `read_design_sources`.
* `flow/util/genMetrics.py` extracts the new hash from
`1_1_yosys_canonicalize.log` into `metadata.json`.
* `flow/util/genRuleFile.py` declares a `level=warning`,
`compare="=="` literal rule for the new metric so a downstream
`rules-base.json` can pin it without failing the build on a
mismatch (matching how `canonical_netlist__hash` and
`netlist__hash` are declared).
No design's `rules-base.json` gets a baseline value pinned in this
PR; declaring the rule type just makes the metric available to any
future per-design pin.
Pre-ABC / post-hierarchy / post-synth-main hashes are intentionally
left out -- they belong in follow-up PRs if the slang-only signal
isn't enough.
Signed-off-by: Øyvind Harboe <oyvind@ascenium.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
18499f9 to
fffb07a
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements a fingerprinting mechanism to capture the design state immediately after the HDL frontend stage, enabling the detection of non-idempotent behavior. It introduces a write_state_hash procedure to strip path-dependent attributes and generate a SHA1 hash, which is then integrated into the metrics extraction and rule validation scripts. Feedback indicates that the attribute stripping should use a global selection to ensure all modules are processed and that the log extraction regex should be adjusted to handle potential leading whitespace.
Two follow-ups from gemini's second review pass on The-OpenROAD-Project#4236: * `setattr -unset src *` in `write_state_hash` only strips src attrs from objects in the *current* module. `write_state_hash` runs before `hierarchy -check -top` in synth_canonicalize.tcl, so the post-frontend design has many separate modules whose src attrs (= file:line metadata) survive the strip and end up in the hashed RTLIL, breaking path-independence between the bazel sandbox and the classic-make build. Use `*/*` which selects across all modules. * The genMetrics.py regex for the new hash was anchored to a bare `^`, while other extractors in the same file allow optional leading whitespace. Add `\s*` after `^` so the extraction is robust to any log wrapper that inserts indentation. Signed-off-by: Øyvind Harboe <oyvind@ascenium.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements design state fingerprinting immediately after the HDL frontend to detect non-idempotency in the synthesis pipeline. It introduces a write_state_hash procedure in synth_preamble.tcl and updates the metrics and rule generation scripts to track the new synth__post_read_sources__hash metric. Feedback was provided to enhance the write_state_hash procedure by adding error handling to protect the design stack and stripping additional attributes to ensure path-independence.
| design -push | ||
| # `*/*` strips src attrs across objects in *all* modules. Bare `*` | ||
| # only targets the current module's objects, which is fine after | ||
| # `hierarchy -check -top` collapses things to one design, but | ||
| # `write_state_hash` is called before `hierarchy` here, so the | ||
| # post-frontend state may have many separate modules whose src | ||
| # attrs would otherwise survive into the hashed RTLIL and break | ||
| # path-independence. | ||
| setattr -unset src */* | ||
| setattr -mod -unset src * | ||
| set tmp $::env(OBJECTS_DIR)/.${metric}.tmp.rtlil | ||
| write_rtlil $tmp | ||
| design -pop | ||
| set sha [lindex [split [exec sha1sum $tmp]] 0] | ||
| file delete $tmp | ||
| puts "${metric}: $sha" |
There was a problem hiding this comment.
The write_state_hash procedure lacks error handling, which could leave the Yosys design stack in an inconsistent state (due to a missing design -pop) if an error occurs during attribute stripping, file writing, or hashing (e.g., if sha1sum is missing or the disk is full). Wrapping the core logic in a catch block ensures that the design state is restored and the temporary file is cleaned up. Additionally, stripping the orig_src and syn_file attributes is recommended to ensure path-independence, as some frontends (like Verific) use them to store absolute paths.
design -push
set tmp $::env(OBJECTS_DIR)/.${metric}.tmp.rtlil
if {[catch {
foreach attr {src orig_src syn_file} {
setattr -unset $attr */*
setattr -mod -unset $attr *
}
write_rtlil $tmp
set sha [lindex [exec sha1sum $tmp] 0]
puts "${metric}: $sha"
} err]} {
puts "Warning: ${metric} calculation failed: $err"
}
design -pop
file delete -force $tmp
|
@povik @maliberty I tried a couple of times to get Claude to fix the Gemini nits, but I don't trust myself, nor Gemini when it comes to yosys scripts... The Claude generated code I tested, so at least I know it worked on my machine. My brain refuses to learn Yosys selection DSL syntax... This PR was intended to issue a warning if there's an idempotency in slang, verific or verilog canonicalization, which I think could be helpful. If this PR is useful, feel free to re-open and finish it. |
Summary
yosys-slang was recently found to be non-idempotent in a few cases:
re-running synth on the same RTL produces different post-frontend
RTLIL. The two file-based hashes already on master only bracket the
whole synthesis pipeline:
captured after `read_design_sources` → `hierarchy -check` →
`opt_clean -purge`.
So a slang-induced diff collapses into `canonical_netlist__hash`
alongside hierarchy / opt_clean drift, with no way to tell them apart.
This PR adds one more hash, `synth__post_read_sources__hash`, captured
immediately after `read_design_sources` returns — i.e. the state
straight out of the HDL frontend (slang / verific / builtin
`read_verilog`), before any other pass runs. An unstable hash here
means the frontend itself is non-idempotent, distinct from drift
introduced later.
Mechanics
proc that strips `src` attributes (file:line metadata — so the hash
is path-independent across the bazel sandbox and the classic-make
build), dumps the current RTLIL to a temp file under
`$OBJECTS_DIR`, sha1's it, deletes the temp, and emits a
`: ` line to the yosys log.
`write_state_hash synth__post_read_sources__hash` immediately after
`read_design_sources`.
`1_1_yosys_canonicalize.log` into `metadata.json`.
`compare="=="` literal rule for the new metric so a downstream
`rules-base.json` can pin it without failing the build on a mismatch
(matching how `canonical_netlist__hash` and `netlist__hash` are
declared).
No design's `rules-base.json` gets a baseline value pinned in this PR;
declaring the rule type just makes the metric available to any future
per-design pin.
Pre-ABC / post-hierarchy / post-synth-main hashes are intentionally
left out — follow-up PRs if the slang-only signal isn't enough.
Test plan
branch and pass.
equivalent bazel target), then `jq .synth__post_read_sources__hash
reports/asap7/uart/base/metadata.json` — the field should contain
a 40-char sha1. Re-run on the same input twice: same sha1 if the
frontend is idempotent for that design, different sha1 if not —
i.e. the exact distinction this PR exists to surface.
🤖 Generated with Claude Code