Implement Feistel-based permutation for LHS and convert samplers to stateless RNG #32833

Open

zachmprince wants to merge 20 commits into idaholab:next from zachmprince:new_latin_sampling

Conversation

@zachmprince
Contributor

@zachmprince zachmprince commented Apr 24, 2026

Closes #32775, refs #32194

Reason

Latin Hypercube Sampling (LHS) previously relied on a stateful shuffle() call
inside sampleSetUp() to permute bin assignments. This coupling between the
generator state-machine and the sample-matrix loop made LHS incompatible with
the stateless RNG direction, prevented parallel or out-of-order sample access,
and required save/restore of generator state around every call to
sampleSetUp/sampleTearDown. The same stateful pattern was shared by the
MCMC and active-learning samplers, making the entire Sampler hierarchy harder
to reason about and test.

Design

MooseRandomPerturbation (new framework utility)

A header-only class implementing a keyed pseudo-random permutation of the
integers [0, n) using a balanced Feistel network:

  • The 64-bit seed is split into two 32-bit subkeys; each round mixes the
    half-block with both subkeys and a golden-ratio-derived constant, then applies
    the Murmur3/degski avalanche hash for bit diffusion.
  • Because n need not be a power of two, the network operates on the smallest
    padded domain 2^(2*half_bits) >= n and uses cycle-walking to reject
    out-of-range outputs.
  • The permutation is bijective (every input maps to a unique output) and
    invertible (invert(permute(x)) == x).
  • Verified by five new unit tests in unit/src/MooseRandomPerturbationTest.C
    (bijection, invertibility, bit-width range, seed uniqueness, reproducibility).
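
For readers who want the shape of the construction, here is a minimal self-contained sketch of a keyed balanced Feistel permutation with cycle-walking. It mirrors the description above, but the class name, round count (8), and key-handling details are illustrative assumptions, not the actual MooseRandomPerturbation source:

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Illustrative sketch only; not the actual MooseRandomPerturbation code.
class FeistelPermuter
{
public:
  FeistelPermuter(uint64_t seed, uint64_t n)
    : _k0(uint32_t(seed >> 32)), _k1(uint32_t(seed)), _n(n), _half_bits(1)
  {
    // Smallest padded power-of-two domain 2^(2*half_bits) >= n.
    while ((uint64_t(1) << (2 * _half_bits)) < n)
      ++_half_bits;
    _half_mask = (uint32_t(1) << _half_bits) - 1;
  }

  // Bijective map of [0, n) onto itself; cycle-walk past padded values.
  uint64_t permute(uint64_t x) const
  {
    do
      x = permutePadded(x);
    while (x >= _n);
    return x;
  }

private:
  // degski/Murmur3-finalizer style avalanche hash (bijective on 32 bits).
  static uint32_t mix32(uint32_t x)
  {
    x ^= x >> 16;
    x *= 0x7feb352du;
    x ^= x >> 15;
    x *= 0x846ca68bu;
    x ^= x >> 16;
    return x;
  }

  // Round function: key mixing, golden-ratio round constant, then avalanche.
  uint32_t roundFunction(uint32_t half, unsigned int round) const
  {
    uint32_t x = half ^ _k0;
    x += 0x9e3779b9u * (round + 1);
    x ^= _k1;
    return mix32(x);
  }

  // Balanced Feistel network on the padded domain [0, 2^(2*half_bits)).
  uint64_t permutePadded(uint64_t x) const
  {
    uint32_t left = uint32_t(x >> _half_bits) & _half_mask;
    uint32_t right = uint32_t(x) & _half_mask;
    for (unsigned int r = 0; r < 8; ++r)
    {
      const uint32_t next = left ^ (roundFunction(right, r) & _half_mask);
      left = right;
      right = next;
    }
    return (uint64_t(left) << _half_bits) | right;
  }

  uint32_t _k0, _k1;
  uint64_t _n;
  unsigned int _half_bits;
  uint32_t _half_mask;
};
```

Because Feistel rounds are invertible regardless of the round function, the padded map is a bijection, and cycle-walking restricts it to a bijection on [0, n).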

Redesigned LatinHypercubeSampler

The sampler now uses two stateless generators:

  • Generator 0 — draws the uniform point inside the selected bin.
  • Generator 1 — seeds one MooseRandomPerturbation permuter per column.

In computeSampleRow(row, col) the bin for sample row in column col is
permuter[col].permute(row), so the full LHS sample matrix is determined
entirely by the two generator seeds. No state needs to be saved, restored, or
advanced around a setup callback. Permuters are initialised once in
executeTearDown() after the generator has been advanced to the correct offset.
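
The stateless bin lookup described above can be illustrated with a toy permuter. Here an affine bijection stands in for the Feistel permuter so the sketch stays self-contained; all names are hypothetical, not the actual sampler code:

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Toy stand-in for the per-column permuter (the real sampler would use a
// MooseRandomPerturbation here): an affine bijection on [0, n).
struct ToyPermuter
{
  uint64_t a, b, n; // requires gcd(a, n) == 1 for bijectivity
  uint64_t permute(uint64_t row) const { return (a * row + b) % n; }
};

// One LHS entry: the permuter picks a unique bin per row, and a uniform draw
// u01 in [0, 1) (generator 0 in the design above) jitters within the bin.
double
lhsValue(const ToyPermuter & permuter, uint64_t row, double u01)
{
  const uint64_t bin = permuter.permute(row); // bijective: one bin per row
  return (bin + u01) / double(permuter.n);    // stratified value in [0, 1)
}
```

Because the value depends only on (row, col) and the seeds behind the permuter and the jitter, any process can evaluate any matrix entry in any order.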

Sampler base-class cleanup

sampleSetUp() and sampleTearDown() (per-row callbacks that ran inside the
sample-matrix loop and required generator save/restore) have been removed. The
simpler executeSetUp() / executeTearDown() pair (called once before/after
the entire execute()) is sufficient for all remaining use cases.

The CommMethod enum, shuffle() template, and saveGeneratorState() /
restoreGeneratorState() methods have also been removed.

MCMC and active-learning samplers

PMCMCBase and every derived class that previously relied on sampleSetUp to
seed proposals now use executeSetUp() instead. The proposeSamples() interface
no longer takes a seed_value argument; three new helper methods (random(),
randomIndex(), randomIndexPair()) encapsulate stateless index-based draws so
each sampler can advance its own _rand_index counter without touching generator
state directly.
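
The index-based draw pattern can be sketched as follows. This is an illustrative counter-based generator (a splitmix64-style hash), not the actual MOOSE generator or the real helper signatures; the point is that each draw depends only on (seed, index), so call order and process count do not matter:

```cpp
#include <cassert>
#include <cstdint>

// splitmix64 finalizer: a standard 64-bit avalanche hash.
static uint64_t
hash64(uint64_t x)
{
  x += 0x9e3779b97f4a7c15ull;
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ull;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebull;
  return x ^ (x >> 31);
}

// Counter-based draws: any process can reproduce draw k without first
// generating draws 0..k-1, because no hidden generator state is advanced.
class StatelessDraws
{
public:
  explicit StatelessDraws(uint64_t seed) : _seed(seed), _rand_index(0) {}

  // Uniform double in [0, 1): top 53 bits of the hash scaled by 2^-53.
  double random()
  {
    return double(hash64(_seed ^ _rand_index++) >> 11) * (1.0 / 9007199254740992.0);
  }

  // Uniform-ish integer in [0, n); modulo bias is negligible for small n.
  uint64_t randomIndex(uint64_t n) { return hash64(_seed ^ _rand_index++) % n; }

private:
  uint64_t _seed;
  uint64_t _rand_index; // the only mutable state: a plain counter
};
```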

AdaptiveImportanceSampler had an off-by-one in its sample index that is also
corrected here.

Stateless conversion for remaining samplers

MorrisSampler and NestedMonteCarloSampler were also converted from the
stateful shuffle()/sampleSetUp() pattern to the stateless generator API.

Impact

Framework API changes

| Removed | Replacement |
| --- | --- |
| Sampler::sampleSetUp(SampleMode) | Sampler::executeSetUp() |
| Sampler::sampleTearDown(SampleMode) | Sampler::executeTearDown() |
| Sampler::shuffle(begin, end, generator_index) | MooseRandomPerturbation |
| Sampler::saveGeneratorState() / restoreGeneratorState() | not needed with stateless RNG |
| Sampler::CommMethod enum (LOCAL, SEMI_LOCAL, NONE) | removed |
| PMCMCBase::proposeSamples(seed_value, ...) | proposeSamples(...) (no seed arg) |

New framework utility: framework/include/utils/MooseRandomPerturbation.h
(header-only, no new .C file, no registration macro required).

Gold files for all LHS tests, several MCMC tests, and surrogate-training tests
that pass through an LHS sampler have been regolded because the new Feistel
permutation and stateless generator advancement produce a different (but equally
valid and reproducible) sample sequence.

To-Do

Things to do once reviewers are satisfied with changes.

Update figures, tables, etc. in documentation regarding LHS sampling changes:

  • modules/stochastic_tools/examples/parameter_study.md
  • modules/stochastic_tools/examples/nonlin_parameter_study.md
  • modules/stochastic_tools/examples/sobol.md
  • modules/stochastic_tools/examples/poly_regression_surrogate.md
  • modules/stochastic_tools/examples/pod_rb_surrogate.md
  • modules/stochastic_tools/examples/combined_example_2d_trans_diff.md
  • modules/stochastic_tools/examples/cross_validation.md
  • modules/combined/examples/stm_thermomechanics.md

Update applications:

  • Grizzly

This works and is consistent in parallel, but most tests fail due to the algorithmic change.

Integration with MooseRandomStateless will probably require more regolding, so I'm saving this part for later.

Refs idaholab#32194
@zachmprince
Contributor Author

LHS quality experiment: Feistel permutation vs. Fisher-Yates shuffle

modules/stochastic_tools/examples/lhs/lhs_experiment.{py,ipynb} contains a
numerical comparison of four samplers run against a parametric test function
family.

Test function. Each trial draws an nrow x ndim sample matrix and evaluates

$$ f(\mathbf{x}; \lambda) = \sum_c \frac{e^{x_c}}{2^{c+1}} + \lambda \cdot \frac{\sum_{i<j} L_2(x_i) \, L_2(x_j)}{\sqrt{d(d-1)/2}} $$

where $L_2(x) = \sqrt{12}(x - 0.5)$ is the centered, unit-variance linear Legendre polynomial. The
additive term ($\lambda = 0$) is well-suited to LHS; larger $\lambda$ values add
pairwise interaction that LHS cannot fully stratify, stress-testing each
sampler's behavior when the function is less cooperative. The exact mean of $f$
over $[0,1]^d$ is known analytically, so RMSE is computed directly.
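
A direct transcription of the test function and its analytic mean (names are ours, not the experiment script's; columns are indexed c = 0..d-1, matching the 2^(c+1) weights, and d >= 2 is assumed for the interaction term):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of f(x; lambda) as defined above.
double
testFunction(const std::vector<double> & x, double lambda)
{
  const std::size_t d = x.size();

  // Additive part: exponentially decaying column weights.
  double additive = 0.0;
  for (std::size_t c = 0; c < d; ++c)
    additive += std::exp(x[c]) / std::pow(2.0, double(c + 1));

  // Pairwise part built from L2(x) = sqrt(12)(x - 0.5); since L2 integrates
  // to zero on [0, 1], this term does not shift the mean.
  double interaction = 0.0;
  for (std::size_t i = 0; i < d; ++i)
    for (std::size_t j = i + 1; j < d; ++j)
      interaction += 12.0 * (x[i] - 0.5) * (x[j] - 0.5); // sqrt(12)^2 = 12

  return additive + lambda * interaction / std::sqrt(d * (d - 1) / 2.0);
}

// Exact mean over [0,1]^d: the interaction integrates to zero and each
// additive column contributes (e - 1) / 2^(c+1); summing the geometric series:
double
exactMean(std::size_t d)
{
  return (std::exp(1.0) - 1.0) * (1.0 - std::pow(2.0, -double(d)));
}
```

Having the mean in closed form is what lets each trial report an exact RMSE rather than an estimate against a reference run.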

Parameters. 4 samplers x 4 dimensions (4, 8, 16, 32) x 10 sample counts
(8 -- 4096, log-spaced) x 6 $\lambda$ values (0, 0.1, 0.3, 1, 3, 10) x 100
independent trials = 96 000 runs. The Feistel LHS calls the actual
stochastic_tools-opt binary; the others use scipy.stats.qmc.

Result. Across all 240 $(\lambda, d, N)$ combinations the ratio of Feistel
RMSE to Fisher-Yates RMSE has a mean of 0.99 and a standard deviation of
0.10, with a range of 0.81 -- 1.28. The extreme ratios appear only at the
smallest sample counts (nrow = 8) and highest dimensions (ndim = 32), where
both methods have large absolute RMSE and the ratio is dominated by Monte Carlo
noise across the 100 trials. At $N \ge 32$ the two LHS variants are
indistinguishable. The table below shows the $\lambda = 0$, $d = 8$ slice as a
representative example:

| nrow | Monte Carlo | LHS (Fisher-Yates) | LHS (Feistel) | Sobol |
| --- | --- | --- | --- | --- |
| 8 | 0.110472 | 0.012540 | 0.013309 | 0.011746 |
| 16 | 0.064504 | 0.004484 | 0.004344 | 0.004860 |
| 32 | 0.049537 | 0.001694 | 0.001605 | 0.001547 |
| 64 | 0.029627 | 0.000548 | 0.000574 | 0.000454 |
| 128 | 0.025555 | 0.000186 | 0.000216 | 0.000186 |
| 256 | 0.015685 | 0.000075 | 0.000070 | 0.000053 |
| 512 | 0.012246 | 0.000027 | 0.000027 | 0.000022 |
| 1024 | 0.009555 | 0.000007 | 0.000009 | 0.000011 |
| 2048 | 0.006514 | 0.000003 | 0.000003 | 0.000005 |
| 4096 | 0.003950 | 0.000001 | 0.000001 | 0.000000 |

Both LHS variants produce the correct $O(N^{-1})$ convergence rate and nearly
identical absolute errors, confirming that the Feistel-network permutation is a
drop-in replacement for the Fisher-Yates shuffle used by scipy (and by the
previous stochastic_tools implementation) with no measurable quality penalty.

@zachmprince
Contributor Author

Here is a plot of the experiment results. The LHS lines (blue and red) are difficult to see since they are right on top of each other.
image

@moosebuild
Contributor

Job Precheck, step Python: black format on af7f601 wanted to post the following:

Python black formatting

Your code requires style changes.

A patch was generated and copied here.

You can directly apply the patch by running the following at the top level of your repository:

curl -s https://mooseframework.inl.gov/docs/PRs/32833/black/black.patch | git apply -v

Alternatively, you can run the following at the top level of your repository:

black --config pyproject.toml --workers 1 .

@moosebuild
Contributor

Job Precheck, step Clang format on f298501 wanted to post the following:

Your code requires style changes.

A patch was auto-generated and copied here.
You can directly apply the patch by running the following at the top level of your repository:

curl -s https://mooseframework.inl.gov/docs/PRs/32833/clang_format/style.patch | git apply -v

Alternatively, with your repository up to date and in the top level of your repository:

git clang-format 162e78c71709df5be85663d216c89383cf162192

@zachmprince zachmprince force-pushed the new_latin_sampling branch 3 times, most recently from 5ba8c63 to 9bff5a0 on April 24, 2026 at 21:08
@moosebuild
Contributor

moosebuild commented Apr 24, 2026

Job Documentation, step Docs: sync website on 8af031b wanted to post the following:

View the site here

This comment will be updated on new commits.

@moosebuild
Contributor

moosebuild commented Apr 27, 2026

Job Coverage, step Generate coverage on 8af031b wanted to post the following:

Framework coverage

|  | 5b391d (Total) | 8af031 / #32833 (Total) | +/- | New |
| --- | --- | --- | --- | --- |
| Rate | 85.87% | 85.87% | -0.00% | 100.00% |
| Hits | 132500 | 132528 | +28 | 71 |
| Misses | 21797 | 21808 | +11 | 0 |

Diff coverage report

Full coverage report

Modules coverage

Stochastic tools

|  | 5b391d (Total) | 8af031 / #32833 (Total) | +/- | New |
| --- | --- | --- | --- | --- |
| Rate | 90.68% | 90.69% | +0.02% | 100.00% |
| Hits | 8636 | 8615 | -21 | 136 |
| Misses | 888 | 884 | -4 | 0 |

Diff coverage report

Full coverage report

Full coverage reports

Reports

This comment will be updated on new commits.

@zachmprince zachmprince marked this pull request as ready for review April 28, 2026 19:45
Member

@lindsayad lindsayad left a comment

Framework portion looks good. I'll let @grmnptr handle the stochastic tools review

Comment thread framework/include/samplers/Sampler.h Outdated
Comment thread framework/include/utils/MooseRandomPerturbation.h
Contributor

@grmnptr grmnptr left a comment

Very nice! Just a few questions.

* the result falls within [0, n).
*
* @param x Input index; must satisfy x < n.
* @return A unique index in [0, n). Calling permute for every x in [0, n)
Contributor

So doing this multiple times for avoiding the padding is just equivalent to doing more than 8 rounds? So more expensive but still defendable? Could it introduce distortions close to the padding depending on the round function? Or once well-mixed, it doesn't matter?

Contributor Author

Here is the response from Claude:

Cycle-walking is not equivalent to more rounds. More rounds deepen the mixing within a single application of the Feistel network — they improve avalanche diffusion so that each call produces a better-scrambled output. Cycle-walking instead applies the complete 8-round network repeatedly to different intermediate values: permutePadded(permutePadded(x)) runs 8 rounds on x to get y, then 8 fresh rounds on y. These are unrelated computations. So yes, it costs more, but it isn't buying you better mixing of the original input — it's purely a correctness mechanism.

No distortions near the padding boundary. This is the key point. Because permutePadded is a bijection on the full padded domain [0, 2^(2*half_bits)), cycle-walking is equivalent to: take the permutation of the padded domain, discard every element that lands in the padding region [n, 2^(2*half_bits)), and keep only those that land in [0, n). The bijectivity guarantees every value in [0, n) is reached exactly once as you iterate over all starting points in [0, n) — you get a valid, unbiased permutation of [0, n). There is no correlation between proximity to the padding boundary and the distribution of outputs, because the round function's mixing (mix32 + key mixing + round-dependent constant) ensures no structure survives from the padded arithmetic.

Expected cost is bounded. The padded size is the smallest 2^(2k) >= n, so it's at most 4n. The expected number of cycle-walk iterations is padded_size / n <= 4, usually much less. For LHS with reasonable n, the overhead is negligible.
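
The equivalence argument above is easy to check numerically with a toy bijection on a padded domain (the affine map, P, and N below are illustrative stand-ins for the 8-round Feistel network and its sizes): cycle-walking the padded permutation from every in-range start yields a valid permutation of [0, N).

```cpp
#include <cassert>
#include <cstdint>
#include <set>

static const uint64_t P = 16; // padded size, 2^(2*half_bits)
static const uint64_t N = 10; // target size n

// A bijection on [0, P): valid because gcd(5, 16) == 1.
static uint64_t
paddedPermute(uint64_t x)
{
  return (5 * x + 3) % P;
}

// Cycle-walk: re-apply the padded permutation until the value lands in [0, N).
// Terminates because the cycle containing x must return to x (which is < N).
static uint64_t
cycleWalk(uint64_t x)
{
  do
    x = paddedPermute(x);
  while (x >= N);
  return x;
}
```

The "first return to [0, N)" map of any permutation of [0, P) is itself a permutation of [0, N), which is why no bias accumulates near the padding boundary.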

}

uint32_t
MooseRandomPerturbation::roundFunction(uint32_t half, unsigned int round) const
Contributor

Why this round function?

Contributor

I mean I suppose the function itself doesn't matter too much, but this form must have come from somewhere, and there must be a reason why it is used over everything else.

Contributor Author

Apparently, this is a standard function to use:

0x9e3779b9U is the 32-bit fractional part of the golden ratio: floor((φ−1) × 2³²). It appears in Knuth's multiplicative hash, xxHash, TEA/XTEA, and many others as a "nothing up my sleeve" constant with provably good bit distribution. Multiplying by (round + 1) makes the additive constant distinct each round — without this, pairs of rounds could cancel algebraically (a slide attack), so you'd get no net mixing from repeated even rounds.

mix32 is the degski/Murmur3-finalizer avalanche hash. The specific constants 0x7feb352dU and 0x846ca68bU were found by automated search (Pelle Evensen's work) to maximize the strict avalanche criterion — roughly, each input bit flips ~half the output bits. The xor-shift → multiply → xor-shift → multiply → xor-shift pattern is the standard way to build a bijective 32-bit finalizer; bijective matters here because it means mix32 contributes no collisions.

The overall structure (x ^= k0; x += constant*(round+1); x ^= k1; x = mix32(x)) is a lightweight keyed hash construction similar to a TEA round: key mixing, round diversification, key mixing again, then avalanche. It doesn't need to be cryptographically strong (this is for LHS, not encryption); it needs to be fast, key-dependent so different seeds give independent permutations, round-dependent so the Feistel network doesn't degenerate, and well-avalanching so the half-block outputs look uniform. This construction satisfies all four.
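
The avalanche claim can be checked empirically. A sketch using the mix32 constants quoted above (helper names are ours): flipping any single input bit should flip roughly half of the 32 output bits on average.

```cpp
#include <cassert>
#include <cstdint>

// The mix32 finalizer with the constants quoted above.
static uint32_t
mix32(uint32_t x)
{
  x ^= x >> 16;
  x *= 0x7feb352du;
  x ^= x >> 15;
  x *= 0x846ca68bu;
  x ^= x >> 16;
  return x;
}

// Portable population count (Kernighan's bit-clearing loop).
static int
popcount32(uint32_t v)
{
  int c = 0;
  for (; v; v &= v - 1)
    ++c;
  return c;
}

// Strict avalanche check: how many output bits flip when input bit `bit` flips.
static int
flippedBits(uint32_t x, int bit)
{
  return popcount32(mix32(x) ^ mix32(x ^ (uint32_t(1) << bit)));
}
```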

Comment thread framework/src/utils/MooseRandomPerturbation.C Outdated
Comment thread modules/stochastic_tools/src/samplers/LatinHypercubeSampler.C
Comment thread modules/stochastic_tools/src/samplers/PMCMCBase.C Outdated
- Removing redundant half mask for left split integer
- Removing forgotten line in PMCMCBase

Refs #32194, idaholab#32775
@moosebuild
Contributor

Job Test, step Results summary on 8af031b wanted to post the following:

Framework test summary

Compared against 5b391df in job civet.inl.gov/job/3787211.

No added tests

Run time changes

| Test | Base (s) | Head (s) | +/- | Base (MB) | Head (MB) |
| --- | --- | --- | --- | --- | --- |
| problems/reference_residual_problem.zero_tolerance_ref | 3.97 | 6.24 | +57.19% | 147.95 | 138.62 |

Modules test summary

Compared against 5b391df in job civet.inl.gov/job/3787211.

No added tests

Run time changes

| Test | Base (s) | Head (s) | +/- | Base (MB) | Head (MB) |
| --- | --- | --- | --- | --- | --- |
| solid_mechanics/test:smeared_cracking.rz_exponential | 11.03 | 18.17 | +64.79% | 126.14 | 136.47 |
| combined/test:combined_plasticity_temperature.ad_temp_dep_yield-jac | 6.55 | 10.49 | +60.09% | 173.21 | 187.12 |
| solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_isokinharden_symmetric_strain_controlled | 10.13 | 16.14 | +59.25% | 150.94 | 140.31 |
| solid_mechanics/test:rate_independent_cyclic_hardening.1D_ratcheting_nonlin_kinharden_stress_controlled | 3.73 | 5.89 | +57.96% | 116.25 | 123.91 |
| solid_mechanics/test:rate_independent_cyclic_hardening.linear_kinharden_nonsymmetric_stress_controlled | 4.34 | 6.85 | +57.79% | 122.60 | 127.67 |
| solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_kinharden_symmetric_strain_controlled | 4.70 | 7.37 | +56.78% | 126.33 | 127.33 |
| solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_kinharden_nonsymmetric_strain_controlled | 5.80 | 9.04 | +55.77% | 130.95 | 132.75 |
| solid_mechanics/test:temperature_dependent_hardening.test | 2.17 | 3.38 | +55.52% | 121.70 | 121.92 |
| solid_mechanics/test:combined_creep_plasticity.creepWithPlasticity | 5.24 | 8.13 | +55.10% | 121.92 | 122.41 |
| stochastic_tools/test:transfers/sampler_reporter.transfer/normal | 6.76 | 10.44 | +54.51% | 158.47 | 176.41 |
| solid_mechanics/test:smeared_cracking.exponential | 3.96 | 6.10 | +54.10% | 123.71 | 121.94 |
| solid_mechanics/test:beam/static.euler_finite_y_with_action | 3.13 | 4.80 | +53.54% | 121.64 | 136.42 |
| solid_mechanics/test:beam/static.euler_finite_rot_y | 3.11 | 4.78 | +53.39% | 126.28 | 128.48 |
| solid_mechanics/test:torque_reaction.disp_about_axis_motion | 2.00 | 3.06 | +52.89% | 118.95 | 120.02 |
| solid_mechanics/test:combined_creep_plasticity.combined_start_time | 5.72 | 8.74 | +52.84% | 118.07 | 125.55 |
| solid_mechanics/test:smeared_cracking.cracking_rotation_pres_dir_x | 3.82 | 5.84 | +52.81% | 136.85 | 149.29 |
| solid_mechanics/test:beam/static.euler_finite_rot_z | 3.32 | 5.07 | +52.75% | 124.22 | 138.74 |
| solid_mechanics/test:rate_independent_cyclic_hardening.linear_kinharden_symmetric_strain_controlled | 4.83 | 7.33 | +51.91% | 144.16 | 141.68 |
| solid_mechanics/test:crystal_plasticity/cp_eigenstrains.thermal_eigenstrain_011orientation | 5.38 | 8.15 | +51.47% | 124.00 | 133.14 |
| stochastic_tools/test:transfers/sampler_reporter.transfer/distributed | 3.27 | 4.94 | +51.28% | 472.19 | 517.81 |
| solid_mechanics/test:combined_creep_plasticity.stress_prescribed | 3.47 | 5.25 | +51.26% | 124.49 | 124.43 |
| solid_mechanics/test:dynamics/acceleration_bc.acceleration_bc_ti | 3.45 | 5.21 | +50.95% | 130.91 | 134.51 |
| solid_mechanics/test:smeared_cracking.cracking_rotation_pres_dir_z | 3.87 | 5.83 | +50.75% | 132.21 | 134.29 |
| solid_mechanics/test:dynamics/wave_1D.rayleigh_hht_ad_jac | 3.86 | 5.80 | +50.50% | 119.75 | 135.05 |
| solid_mechanics/test:combined_creep_plasticity.combined | 5.84 | 8.78 | +50.37% | 120.34 | 123.52 |
| solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_isoharden_symmetric_strain_controlled | 3.37 | 5.06 | +50.20% | 126.03 | 129.53 |
| solid_mechanics/test:smeared_cracking.cracking_rotation_pres_dir_xz | 3.95 | 5.93 | +50.12% | 132.19 | 139.47 |
| stochastic_tools/test:reporters/BFActiveLearning.sampling_bf/MultipleProc_MultipleRow_Ufunction | 9.62 | 4.39 | -54.38% | 1314.27 | 282.41 |


Development

Successfully merging this pull request may close these issues.

Sampler::execute() is not called for EXEC_INITIAL, leaving executeSetUp() unrun on the first timestep

4 participants