Implement Feistel-based permutation for LHS and convert samplers to stateless RNG #32833
zachmprince wants to merge 20 commits into idaholab:next
This works and is consistent in parallel, but fails most tests due to algorithmic change. Integration with MooseRandomStateless will probably require more regolding, so I'm saving this part for later. Refs idaholab#32194
LHS quality experiment: Feistel permutation vs. Fisher-Yates shuffle
Test function. Each trial draws an where Parameters. 4 samplers x 4 dimensions (4, 8, 16, 32) x 10 sample counts Result. Across all 240
Both LHS variants produce the correct |
Trivial regolding was required on timestep 1 due to previous inconsistency in what the decision reporters were outputting as inputs. Refs idaholab#32194
…makes more sense now idaholab#32194
…memory stays reasonable idaholab#32194
grmnptr left a comment
Very nice! Just a few questions.
 * the result falls within [0, n).
 *
 * @param x Input index; must satisfy x < n.
 * @return A unique index in [0, n). Calling permute for every x in [0, n)
So is doing this multiple times to avoid the padding just equivalent to doing more than 8 rounds? So more expensive, but still defensible? Could it introduce distortions close to the padding, depending on the round function? Or once it is well mixed, does it not matter?
Here is the response from Claude:
Cycle-walking is not equivalent to more rounds. More rounds deepen the mixing within a single application of the Feistel network — they improve avalanche diffusion so that each call produces a better-scrambled output. Cycle-walking instead applies the complete 8-round network repeatedly to different intermediate values: permutePadded(permutePadded(x)) runs 8 rounds on x to get y, then 8 fresh rounds on y. These are unrelated computations. So yes, it costs more, but it isn't buying you better mixing of the original input — it's purely a correctness mechanism.
No distortions near the padding boundary. This is the key point. Because permutePadded is a bijection on the full padded domain [0, 2^(2*half_bits)), cycle-walking is equivalent to: take the permutation of the padded domain, discard every element that lands in the padding region [n, 2^(2*half_bits)), and keep only those that land in [0, n). The bijectivity guarantees every value in [0, n) is reached exactly once as you iterate over all starting points in [0, n) — you get a valid, unbiased permutation of [0, n). There is no correlation between proximity to the padding boundary and the distribution of outputs, because the round function's mixing (mix32 + key mixing + round-dependent constant) ensures no structure survives from the padded arithmetic.
Expected cost is bounded. The padded size is the smallest 2^(2k) >= n, so it's at most 4n. The expected number of cycle-walk iterations is padded_size / n <= 4, usually much less. For LHS with reasonable n, the overhead is negligible.
uint32_t
MooseRandomPerturbation::roundFunction(uint32_t half, unsigned int round) const
I suppose the function itself doesn't matter too much, but this form must have come from somewhere, and there must be a reason why it is used over everything else.
Apparently, this is a standard function to use:
0x9e3779b9U is the 32-bit fractional part of the golden ratio: floor((φ−1) × 2³²). It is the delta constant in TEA/XTEA and a common choice for Knuth-style multiplicative hashing (xxHash uses the nearby prime 0x9e3779b1U); it serves as a "nothing up my sleeve" constant with a well-spread bit pattern. Multiplying by (round + 1) makes the additive constant distinct each round. Without this, every round would apply the same function, which leaves the network open to slide-style structure and weakens the mixing gained from repeated rounds.
mix32 is a Murmur3-finalizer-style avalanche hash, in the same family as degski's hash. The specific constants 0x7feb352dU and 0x846ca68bU match the lowbias32 mixer from Chris Wellons' hash-prospector, which found them by automated search to minimize avalanche bias: roughly, flipping any input bit flips each output bit with probability close to one half. The xor-shift → multiply → xor-shift → multiply → xor-shift pattern is the standard way to build a bijective 32-bit finalizer; bijectivity matters here because it means mix32 contributes no collisions.
The overall structure — x ^= k0; x += constant*(round+1); x ^= k1; x = mix32(x) — is a lightweight keyed hash construction similar to a TEA round: key mixing, round diversification, key mixing again, then avalanche. It doesn't need to be cryptographically strong (this is for LHS, not encryption); it needs to be fast, key-dependent so different seeds give independent permutations, round-dependent so the Feistel network doesn't degenerate, and well-avalanching so the half-block outputs look uniform. This construction satisfies all four.
- Removing redundant half mask for left split integer - Removing forgotten line in PCMCBase Refs #327194, idaholab#32775
Closes #32775, refs #32194
Reason
Latin Hypercube Sampling (LHS) previously relied on a stateful shuffle() call
inside sampleSetUp() to permute bin assignments. This coupling between the
generator state-machine and the sample-matrix loop made LHS incompatible with
the stateless RNG direction, prevented parallel or out-of-order sample access,
and required save/restore of generator state around every call to
sampleSetUp/sampleTearDown. The same stateful pattern was shared by the
MCMC and active-learning samplers, making the entire Sampler hierarchy harder
to reason about and test.
Design
MooseRandomPerturbation (new framework utility)
A header-only class implementing a keyed pseudo-random permutation of the
integers [0, n) using a balanced Feistel network:
- The round function combines the half-block with both subkeys and a golden-ratio-derived constant, then applies the Murmur3/degski avalanche hash for bit diffusion.
- Because n need not be a power of two, the network operates on the smallest padded domain 2^(2*half_bits) >= n and uses cycle-walking to reject out-of-range outputs.
- The permutation is invertible (invert(permute(x)) == x).
- Unit tests are in unit/src/MooseRandomPerturbationTest.C (bijection, invertibility, bit-width range, seed uniqueness, reproducibility).
Redesigned LatinHypercubeSampler
The sampler now uses two stateless generators:
MooseRandomPerturbation permuter per column.

In computeSampleRow(row, col) the bin for sample row in column col is
permuter[col].permute(row), so the full LHS sample matrix is determined
entirely by the two generator seeds. No state needs to be saved, restored, or
advanced around a setup callback. Permuters are initialised once in
executeTearDown() after the generator has been advanced to the correct offset.

Sampler base-class cleanup
sampleSetUp() and sampleTearDown() (per-row callbacks that ran inside the
sample-matrix loop and required generator save/restore) have been removed. The
simpler executeSetUp() / executeTearDown() pair (called once before/after
the entire execute()) is sufficient for all remaining use-cases.

The CommMethod enum, shuffle() template, and saveGeneratorState() /
restoreGeneratorState() methods have been removed along with it.

MCMC and active-learning samplers
PMCMCBase and every derived class that previously relied on sampleSetUp to
seed proposals now use executeSetUp() instead. The proposeSamples() interface
no longer takes a seed_value argument; three new helper methods (random(),
randomIndex(), randomIndexPair()) encapsulate stateless index-based draws so
each sampler can advance its own _rand_index counter without touching generator
state directly.

AdaptiveImportanceSampler had an off-by-one in its sample index that is also
corrected here.
Stateless conversion for remaining samplers
MorrisSampler and NestedMonteCarloSampler were also converted from the
stateful shuffle() / sampleSetUp() pattern to the stateless generator API.

Impact
Framework API changes

| Removed / changed | Replacement |
| --- | --- |
| Sampler::sampleSetUp(SampleMode) | Sampler::executeSetUp() |
| Sampler::sampleTearDown(SampleMode) | Sampler::executeTearDown() |
| Sampler::shuffle(begin, end, generator_index) | MooseRandomPerturbation |
| Sampler::saveGeneratorState() / restoreGeneratorState() | (no replacement) |
| Sampler::CommMethod enum (LOCAL, SEMI_LOCAL, NONE) | (no replacement) |
| PMCMCBase::proposeSamples(seed_value, ...) | proposeSamples(...) (no seed arg) |

New framework utility: framework/include/utils/MooseRandomPerturbation.h
(header-only, no new .C file, no registration macro required).

Gold files for all LHS tests, several MCMC tests, and surrogate-training tests
that pass through an LHS sampler have been regolded because the new Feistel
permutation and stateless generator advancement produce a different (but equally
valid and reproducible) sample sequence.
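Why a different permutation still gives an equally valid LHS design can be illustrated with a short sketch. It assumes the standard LHS construction on [0, 1), where row r of a column takes the value (bin[r] + u[r]) / n; the function names `lhsColumn` and `isLatin` are hypothetical and the bin assignments are passed in directly rather than produced by the PR's permuter.

```cpp
#include <cstddef>
#include <vector>

// One LHS column on [0, 1): row r falls in stratum bins[r], offset by u[r] in [0, 1).
// Assumes bins.size() == u.size().
inline std::vector<double> lhsColumn(const std::vector<std::size_t> & bins,
                                     const std::vector<double> & u)
{
  const std::size_t n = bins.size();
  std::vector<double> values(n);
  for (std::size_t r = 0; r < n; ++r)
    values[r] = (static_cast<double>(bins[r]) + u[r]) / static_cast<double>(n);
  return values;
}

// True if every stratum [i/n, (i+1)/n) contains exactly one value.
inline bool isLatin(const std::vector<double> & values)
{
  const std::size_t n = values.size();
  std::vector<int> hits(n, 0);
  for (const double v : values)
  {
    if (v < 0.0 || v >= 1.0)
      return false;
    std::size_t bin = static_cast<std::size_t>(v * static_cast<double>(n));
    if (bin >= n) // guard against rounding at the upper edge
      bin = n - 1;
    ++hits[bin];
  }
  for (const int h : hits)
    if (h != 1)
      return false;
  return true;
}
```

Any bijective bin assignment passes `isLatin`, whether it comes from a Fisher-Yates shuffle or a Feistel permutation, so the regolded values differ from the old ones while keeping the one-sample-per-stratum property.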
To-Do
Things to do once reviewers are satisfied with changes.
Update figures, tables, etc. in documentation regarding LHS sampling changes:
- modules/stochastic_tools/examples/parameter_study.md
- modules/stochastic_tools/examples/nonlin_parameter_study.md
- modules/stochastic_tools/examples/sobol.md
- modules/stochastic_tools/examples/poly_regression_surrogate.md
- modules/stochastic_tools/examples/pod_rb_surrogate.md
- modules/stochastic_tools/examples/combined_example_2d_trans_diff.md
- modules/stochastic_tools/examples/cross_validation.md
- modules/combined/examples/stm_thermomechanics.md
Update applications: