
Modification to DRL to support parallel sampler gathering/bounded actions. #32802

Open

grmnptr wants to merge 51 commits into idaholab:next from grmnptr:drl-mods

Conversation

@grmnptr
Contributor

@grmnptr grmnptr commented Apr 22, 2026

Closes #32511

@grmnptr grmnptr marked this pull request as ready for review April 28, 2026 22:30
@moosebuild
Contributor

Job Precheck, step Clang format on 3b80ab4 wanted to post the following:

Your code requires style changes.

A patch was auto-generated and copied here.
You can apply the patch directly by running, in the top level of your repository:

curl -s https://mooseframework.inl.gov/docs/PRs/32802/clang_format/style.patch | git apply -v

Alternatively, with your repository up to date and in the top level of your repository:

git clang-format 953c577aaeb4885a251cedbbe9d1bc8819be587e

@moosebuild
Contributor

Job Test, step Results summary on ef3ee6e wanted to post the following:

Framework test summary

Compared against 953c577 in job civet.inl.gov/job/3779372.

No change

Modules test summary

ERROR: Results do not exist for event 292341

Member

@lindsayad lindsayad left a comment


Just reviewing the framework part

Comment on lines +105 to +111
/// Update cached affine metadata vectors from the registered libtorch buffers.
void synchronizeAffineFactorsFromBuffers();

/**
* Map an activation name to the orthogonal-initialization gain we want to use.
* @param activation Activation name to look up.
*/
Member


It's the wild west for doxygen comment structure. We should get something in our style guide about this at some point.

Contributor Author


In general, I try to do the slashes for short comments and the asterisk for longer ones. But I never thought about defining what is short and what is long.

Contributor Author


I'll make this a bit more uniform.

Member


I don't blame you. It's a reasonable heuristic, and maybe that's the one we'll end up putting in the style guide. Generally, I've always done the block comment structure for methods and then /// for data. But as this is not in the style guide, I can't say what I do is the right way.


/**
* Initialize the trainable weights and biases.
* @param generator Optional torch random-number generator used for reproducible initialization.
Member


@zachmprince should we just use libtorch for random number generation? This is in reference to your recent PR. The cost would be a no-longer-optional dependency. Possible gain would be reduced code maintenance and overall less code duplication across the OSS ecosystem? I defer to you two on this. I'm not an expert in this area


void to_json(nlohmann::json & json, const Moose::LibtorchArtificialNeuralNet * const & network);

void loadLibtorchArtificialNeuralNetState(Moose::LibtorchArtificialNeuralNet & nn,
Member


doxygen please

Comment on lines +111 to +112
// File-backed controllers are loaded after full construction so derived controls can override
// the loader without constructor-time type checks.
Member


That's a good thing? Constructor-time type checks sound nice.

* @param archive Archive being read.
* @param key Serialized tensor name.
* @param tensor Tensor that receives the loaded data.
* @return True when the tensor was found and loaded.
Member


Suggested change:
- * @return True when the tensor was found and loaded.
+ * @return whether the tensor was found and loaded.

* @param nn Neural network that receives the loaded state.
* @param filename Checkpoint file to read.
* @param error Human-readable error string filled on failure.
* @return True when the network was loaded successfully.
Member


Suggested change:
- * @return True when the network was loaded successfully.
+ * @return whether the network was loaded successfully.

* @param nn Neural network that receives the loaded state.
* @param filename Checkpoint file to read.
* @param error Human-readable error string filled on failure.
* @return True when the network was loaded successfully.
Member


Suggested change:
- * @return True when the network was loaded successfully.
+ * @return whether the network was loaded successfully.

void
LibtorchArtificialNeuralNet::initializeNeuralNetwork(const c10::optional<at::Generator> generator)
{
for (unsigned int i = 0; i < numHiddenLayers(); ++i)
Member


Suggested change:
-  for (unsigned int i = 0; i < numHiddenLayers(); ++i)
+  for (const auto i : make_range(numHiddenLayers()))

const std::vector<std::vector<Real>> & component_trajectories,
const unsigned int time_index) const
{
validateTrajectoryShape(component_trajectories);
Member


Can an invalid state be reached through user input or would this be a developer error?

Contributor Author

@grmnptr grmnptr Apr 29, 2026


Well, I added this check to make sure we have the right sizes, because early on I had some invalid timestep execute_on settings. But this might be too restrictive, since it could block the usage of adaptive timestepping. I will check what I can do. Part of me was also thinking that adaptive timesteps may not be usable with multiple-input timestep stacking, but I suppose that depends on the problem. I think I can make this less restrictive, and I should.

PointValue::execute()
{
_value = _system.point_value(_var_number, _point, false);

Member


I like this new line 😄

Contributor Author


Damn, I had a print statement here for some not-so-advanced debugging, and I accidentally removed one too many newlines.



Development

Successfully merging this pull request may close these issues.

Upgrade Deep Reinforcement Learning STM capability(ies)

3 participants