Add AGENTS.md by lindsayad · Pull Request #32749 · idaholab/moose

lindsayad · 2026-04-10T19:33:42Z

Refs #32497

@idaholab/moose-ccb let's work on this together

cticenhour · 2026-04-10T20:57:32Z

We just did something similar to this in TMAP8. It might be helpful to take a look at some of what we put down too!

https://github.com/idaholab/TMAP8/blob/devel/AGENTS.md

cticenhour · 2026-04-10T20:58:33Z

Pinging @chaibhave @simopier @lin-yang-ly

lindsayad · 2026-04-10T21:08:09Z

I should have said in the original PR post that this is modeled after PETSc's CLAUDE.md, or at least it was prior to https://gitlab.com/petsc/petsc/-/merge_requests/9192 at which point they converted from linking to human-targeted markdown files to making the agents file completely self-contained. The reasoning is based on https://gitlab.com/petsc/petsc/-/merge_requests/9184#note_3224501851. So probably the current state of this AGENTS.md is not good.

I have read the TMAP8 AGENTS.md and there is good content there for sure. There are other things which are not my favorite. The ones I remember off the top of my head:

- prompt for an issue number. This can often be good, but I don't actually think every commit needs to reference an issue, and I would often ask the agent to do something, switch windows, and then come back and see that it hadn't done anything because I hadn't given it an issue number
- reference to conda environment activation. I often don't use conda, or I may use conda with an environment that isn't named moose
- the addition of the json files logging agent sessions. I don't think that's helpful for users cloning the repository. I could see it possibly being helpful for a review and maybe even somewhere down the road if someone has to try and understand the code ... but if your hope is to try and reproduce the output of an LLM, that is a stochastic process and these LLMs are changing all the time

Those are relatively small things

moosebuild · 2026-04-10T21:54:39Z

Job Documentation, step Docs: sync website on 827e110 wanted to post the following:

View the site here

This comment will be updated on new commits.

moosebuild · 2026-04-10T23:02:02Z

Job Coverage, step Generate coverage on 827e110 wanted to post the following:

Framework coverage

Coverage did not change

Modules coverage

Coverage did not change

Full coverage reports

Reports

This comment will be updated on new commits.

grmnptr · 2026-04-24T17:51:17Z

Would it be beneficial to add some info on OS-specific behavior? Like conda-related info for macs?

grmnptr · 2026-04-24T17:53:45Z

Oh I see you didn't like it, I think it could be still useful.

grmnptr · 2026-04-24T17:59:08Z

I suppose we can customize it locally on top of the base info here. Was also thinking about the sandbox behavior on mac.

lindsayad · 2026-04-27T16:48:00Z

> Oh I see you didn't like it, I think it could be still useful.

I don't think we should assume conda, but we could instruct the agent from the agents file to begin a session by asking:

- should I attempt to build/run-tests during our session?
- if yes, do I need to activate a conda environment?

lindsayad · 2026-04-27T20:43:01Z

Ok in newest commit:

- inlined a bunch of guidance for the agent in case they don't follow doc links
- told the agent that at startup they should ask about building/testing and, if so, what conda env is relevant

lindsayad · 2026-04-27T20:54:17Z

I think we should make it a goal to get this merged reasonably soon. I think this is one of those things that once it's merged we should get a bunch of follow-on improvements since all of a sudden it will be affecting people's agents' use

lindsayad · 2026-04-27T20:55:08Z

I guess I'm trying to motivate collaboration here. We don't want a piece of garbage getting merged, so I won't rush to get this merged

maxnezdyur · 2026-04-28T17:14:52Z

What if instead of AGENTS.md we put a .agents/skills/ folder at the repo root and symlink .claude/skills/ to it? .agents/skills/ is actually the cross-agent convention for codex and the symlink is for claude. Each skill is just a folder with a SKILL.md, so we could have things like:

- moose-code-standards/ for C++/Python style
- moose-run-tests/ for how to actually invoke run_tests from the right subtree
- moose-build/ for how to build moose with different versions

I feel like this would be less intrusive than an AGENTS.md file.
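
To make the proposed layout concrete, here is a minimal Python sketch that creates it in an arbitrary directory. The function name `create_skills_layout` and the placeholder SKILL.md text are hypothetical and not part of this PR; only the directory names and the symlink direction come from the comment above.

```python
from pathlib import Path
import tempfile


def create_skills_layout(repo_root: Path, skill_names: list[str]) -> Path:
    """Create .agents/skills/<name>/SKILL.md stubs and link .claude/skills to them."""
    skills_dir = repo_root / ".agents" / "skills"
    for name in skill_names:
        skill_dir = skills_dir / name
        skill_dir.mkdir(parents=True, exist_ok=True)
        skill_file = skill_dir / "SKILL.md"
        if not skill_file.exists():
            # Placeholder body (an assumption); a real skill holds actual instructions.
            skill_file.write_text(f"# {name}\n\nTODO: describe this skill.\n")

    claude_dir = repo_root / ".claude"
    claude_dir.mkdir(exist_ok=True)
    link = claude_dir / "skills"
    if not link.is_symlink() and not link.exists():
        # Relative target, so the link stays valid wherever the repo is cloned.
        link.symlink_to(Path("..") / ".agents" / "skills", target_is_directory=True)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        link = create_skills_layout(
            Path(tmp), ["moose-code-standards", "moose-run-tests", "moose-build"]
        )
        print(sorted(p.name for p in link.iterdir()))
```

The symlink is relative on purpose: an absolute target would break as soon as the repository is cloned to a different path.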

lindsayad · 2026-04-28T17:18:06Z

I fully support adding skills. That's included in #32497. Perhaps that can fully replace an AGENTS.md (I've symlinked CLAUDE.md to point to AGENTS.md in this PR), but it would lead me to wonder why other OSS projects are introducing these files and why claude itself has asked me in the past to write a CLAUDE.md

maxnezdyur · 2026-04-28T17:28:25Z

I feel that the claude/agents.md file is better setup by each person individually depending on what they work on in MOOSE most often, but I can see the benefit. I wonder if creating a CONTEXT.md file that a user can point to in their claude/agents.md file would be better. Then it would be more opt in type of thing.

MengnanLi91 · 2026-04-28T17:36:04Z

I agree with @maxnezdyur. A lot of the content may be better added as skills, so that the agent loads them when needed instead of loading agent.md every time as part of the system prompt. Here is an example of a claude.md recommended by Andrej Karpathy in his tweet: https://github.com/forrestchang/andrej-karpathy-skills/blob/main/CLAUDE.md

lindsayad · 2026-04-28T21:57:14Z

That's a great looking file @MengnanLi91.

> I feel that the claude/agents.md file is better setup by each person individually depending on what they work on in MOOSE most often, but I can see the benefit. I wonder if creating a CONTEXT.md file that a user can point to in their claude/agents.md file would be better. Then it would be more opt in type of thing.

I hear what you're saying @maxnezdyur. But I want to take steps to ensure that people are submitting code to MOOSE that is reviewable. @nmnobre caught some code that codex wrote (with my signature attached) which was more complicated than it needed to be. As agent use increases (I firmly believe it will), we'll continue to see more agent written code. It would be best for everyone, I firmly believe, if that agent code is standardized as much as possible. And there's no reason it shouldn't be; we write things like style guides that we expect human developers to follow.

I think the simple guidelines in the CLAUDE.md that @MengnanLi91 linked to are a good starting point. I think it would be better to check that in and ask people to opt out if they wish for some reason. They can keep their own files that they use to append to or to overwrite the checked-in file

Refs idaholab#32497

lindsayad · 2026-04-28T22:06:37Z

It would be great if you smarter AI people could add the skills 😄

roystgnr · 2026-04-29T18:53:54Z

The most heavily-AI-written PR I've reviewed so far is libMesh/libmesh#4441, and some of the mistakes there were specific to libMesh or specific to that PR's goals, but others might be worth trying to preempt here:

- MOOSE has its own coding standards. "Don't try to use camel case where libMesh uses snake case" doesn't apply here, but maybe the generalization is "Look at framework_scs.md or at least this summary here". This may mean we need to read our standards more closely, though, to look for advice that currently expects some human judgement and make sure it can safely be treated literally instead. See Fix virtual destructor advice #32857 for one I noticed at first glance.
- Big PRs are much easier to review when that can be done a commit at a time. "Put all bootstrap output in its own commit" is libMesh-specific, but there are guidelines like "earlier commits should be functional without depending on later commits", "each commit should be as small as possible while still adding functionality", "if achieving these goals ever requires rewriting git history, that should be done in a separate branch for safety", etc. that can apply more generally.
- PRs are even easier to review when they can be done a PR at a time. If adding feature A requires adding features B and C, that can and sometimes should be done by starting with PR B and PR C before PR A, but that's really not obvious without guidance.
- AI really likes to write its own code, and doesn't seem to see having multiple different copies of the same functionality as a maintenance cost. We might need some guideline about reusing existing code wherever it works out of the box, and about proposing adding features to and/or refactoring wherever it doesn't; redundancy should be a last resort.
- AI really likes to make sure that required tests are all passing, and most of the labs have been working at getting rid of the obvious Goodhart's Law effects of that (writing code to pass specific tests in a way that obviously breaks in general, rewriting tests to be easier to pass, refusing to open the pod bay doors, etc.), but more subtle pressures are even harder (not writing enough new tests to fully cover new features, writing tests that only test the first-order effects of code but not their desired consequences or interactions). I'm not sure how to write a good guideline about this one, I admit.

lindsayad · 2026-05-03T03:13:49Z

Those are all good suggestions and I think I had the framing of some of those things earlier but moved to the simpler AGENTS.md based on the feedback from @maxnezdyur and @MengnanLi91, with the alternative seemingly being to put things into skills.

Note that putting a link to the coding style guide in AGENTS.md does not mean in general that the agent will read it. That was the reason that the PETSc team totally changed their agents file to stop using links

GiudGiud · 2026-05-04T13:16:19Z

I'm seeing the bot delete all the comments on SCM AI-generated PRs. It's moving code around but deleting the comments while doing that. Can we expressly forbid that?
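
Besides forbidding it in the instructions, the "moved the code but dropped the comments" pattern can be flagged mechanically. The sketch below is a rough heuristic, not an existing MOOSE tool: the function name `dropped_comments` and the default `//` marker are assumptions. It scans a unified diff and reports comment lines that are removed and never re-added anywhere in the same diff.

```python
def dropped_comments(diff_text: str, markers: tuple[str, ...] = ("//",)) -> list[str]:
    """Return comment lines that a unified diff removes without re-adding them."""
    removed: list[str] = []
    added: set[str] = set()
    for line in diff_text.splitlines():
        if line.startswith(("---", "+++")):
            continue  # file headers (heuristic: also skips removed lines starting with "--")
        body = line[1:].strip()
        if not body.startswith(markers):
            continue  # not a comment line under the chosen markers
        if line.startswith("-"):
            removed.append(body)
        elif line.startswith("+"):
            added.add(body)
    # A comment that reappears verbatim elsewhere counts as moved, not dropped.
    return [comment for comment in removed if comment not in added]


if __name__ == "__main__":
    example = (
        "--- a/Foo.C\n"
        "+++ b/Foo.C\n"
        "@@ -1,2 +10,1 @@\n"
        "-  // Clamp to avoid division by zero\n"
        "-  const Real d = std::max(x, eps);\n"
        "+  const Real d = std::max(x, eps);\n"
    )
    for comment in dropped_comments(example):
        print("dropped:", comment)
```

Pass `markers=("#",)` for Python or input files. A reworded comment is still reported as dropped, so the output is a prompt for a human look rather than a verdict.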

GiudGiud · 2026-05-04T16:04:46Z

I am also seeing people using AI agents to address reviews and they tag the reviewer in the commit message.
This is quite undesirable as iirc the tagged person would get spammed from anyone modifying that commit (for example in regular and wild rebases) on a public branch

So maybe we should have a line in agents.md about that?
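
If a line in agents.md turns out not to be enough, the same rule could be checked mechanically. The following is a minimal sketch, assuming a simplified mention pattern (`@user` or `@org/team`) that is not GitHub's exact username grammar; `MENTION` and `find_mentions` are hypothetical names.

```python
import re

# Assumed pattern: "@user" or "@org/team", not preceded by a word character or a
# dot, so e-mail addresses (for example in Signed-off-by trailers) are skipped.
MENTION = re.compile(
    r"(?<![\w.])@[A-Za-z0-9][A-Za-z0-9-]*(?:/[A-Za-z0-9][A-Za-z0-9-]*)?"
)


def find_mentions(commit_message: str) -> list[str]:
    """Return every @mention found in a commit message, in order of appearance."""
    return MENTION.findall(commit_message)


if __name__ == "__main__":
    message = "Address review comments from @GiudGiud\n\nRefs #32497\n"
    print(find_mentions(message))
```

Git passes the path of the message file as the first argument to a `commit-msg` hook, so a hook could read that file, call `find_mentions`, and exit non-zero when the list is not empty.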

roystgnr · 2026-05-04T17:36:34Z

> Those are all good suggestions and I think I had the framing of some of those things earlier but moved to the simpler AGENTS.md

I put the most simplified version I could into 948c941 in https://github.com/roystgnr/moose/tree/agents-32497

I left out any reference to our coding standards; there's no way to get a simple but adequate summary there, and I need to read a bit about SKILL.md philosophy before trying to translate to that.


moosebuild · 2026-05-05T00:10:26Z

Job Test, step Results summary on 827e110 wanted to post the following:

Framework test summary

Compared against 5b391df in job civet.inl.gov/job/3787211.

No added tests

Run time changes

Test	Base (s)	Head (s)	+/-	Base (MB)	Head (MB)
`problems/reference_residual_problem.zero_tolerance_ref`	3.97	6.02	+51.47%	147.95	155.49

Modules test summary

Compared against 5b391df in job civet.inl.gov/job/3787211.

No added tests

Run time changes

Test	Base (s)	Head (s)	+/-	Base (MB)	Head (MB)
`solid_mechanics/test:smeared_cracking.rz_exponential`	11.03	18.49	+67.65%	126.14	130.92
`solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_isokinharden_symmetric_strain_controlled`	10.13	16.65	+64.37%	150.94	136.00
`solid_mechanics/test:temperature_dependent_hardening.test`	2.17	3.56	+63.72%	121.70	122.79
`solid_mechanics/test:rate_independent_cyclic_hardening.linear_kinharden_nonsymmetric_stress_controlled`	4.34	7.10	+63.62%	122.60	127.76
`solid_mechanics/test:combined_creep_plasticity.creepWithPlasticity`	5.24	8.56	+63.38%	121.92	124.49
`solid_mechanics/test:dynamics/acceleration_bc.acceleration_bc`	3.47	5.60	+61.25%	135.66	131.25
`solid_mechanics/test:combined_creep_plasticity.combined_start_time`	5.72	9.21	+61.05%	118.07	124.68
`solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_kinharden_nonsymmetric_strain_controlled`	5.80	9.32	+60.74%	130.95	129.55
`solid_mechanics/test:rate_independent_cyclic_hardening.1D_ratcheting_nonlin_kinharden_stress_controlled`	3.73	5.97	+60.25%	116.25	127.32
`stochastic_tools/test:transfers/sampler_reporter.transfer/normal`	6.76	10.83	+60.14%	158.47	163.54
`solid_mechanics/test:smeared_cracking.cracking_rotation_pres_dir_z`	3.87	6.19	+59.89%	132.21	139.39
`solid_mechanics/test:combined_creep_plasticity.stress_prescribed`	3.47	5.50	+58.56%	124.49	124.52
`solid_mechanics/test:smeared_cracking.cracking_rotation_pres_dir_x`	3.82	6.05	+58.53%	136.85	138.58
`solid_mechanics/test:rate_independent_cyclic_hardening.linear_kinharden_nonsymmetric_strain_controlled`	10.14	16.04	+58.21%	130.89	136.53
`solid_mechanics/test:rate_independent_cyclic_hardening.linear_kinharden_symmetric_strain_controlled`	4.83	7.63	+58.00%	144.16	130.20
`combined/test:adaptive_timestepping.test_function_change`	3.49	5.52	+57.89%	160.62	167.12
`solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_isoharden_symmetric_strain_controlled`	3.37	5.32	+57.73%	126.03	122.89
`solid_mechanics/test:beam/static.euler_finite_rot_y`	3.11	4.91	+57.50%	126.28	128.41
`solid_mechanics/test:beam/static.euler_finite_rot_z`	3.32	5.18	+56.21%	124.22	142.77
`solid_mechanics/test:combined_creep_plasticity.combined`	5.84	9.10	+55.90%	120.34	124.01
`solid_mechanics/test:torque_reaction.disp_about_axis_motion`	2.00	3.12	+55.89%	118.95	116.62
`combined/test:combined_plasticity_temperature.ad_temp_dep_yield-jac`	6.55	10.20	+55.61%	173.21	168.12
`solid_mechanics/test:rate_independent_cyclic_hardening.nonlin_kinharden_symmetric_strain_controlled`	4.70	7.29	+55.05%	126.33	129.35
`solid_mechanics/test:crystal_plasticity/cp_eigenstrains.thermal_eigenstrain_011orientation`	5.38	8.33	+54.80%	124.00	147.16
`solid_mechanics/test:dynamics/acceleration_bc.acceleration_bc_ti`	3.45	5.34	+54.69%	130.91	129.71
`solid_mechanics/test:torque_reaction.disp_about_axis_axial_motion_delayed`	2.02	3.12	+54.42%	114.45	124.77
`solid_mechanics/test:beam/static.euler_finite_y_with_action`	3.13	4.83	+54.36%	121.64	132.43
`solid_mechanics/test:smeared_cracking.exponential`	3.96	6.10	+54.19%	123.71	127.26
`solid_mechanics/test:smeared_cracking.cracking_rotation`	3.97	6.10	+53.85%	135.74	132.60
`solid_mechanics/test:smeared_cracking.xyz`	2.82	4.32	+53.06%	129.79	133.48
`solid_mechanics/test:crystal_plasticity/cp_eigenstrains.thermal_eigenstrain`	4.12	6.31	+53.02%	125.07	133.09
`solid_mechanics/test:lagrangian/cartesian/total/rates.green_naghdi_shear`	2.07	3.16	+52.67%	117.22	115.79
`solid_mechanics/test:dynamics/wave_1D.rayleigh_hht_ad_jac`	3.86	5.88	+52.63%	119.75	125.04
`combined/test:combined_plasticity_temperature.temp_dep_yield`	2.06	3.15	+52.58%	148.95	153.15
`solid_mechanics/test:smeared_cracking.cracking_rotation_pres_dir_xz`	3.95	6.01	+52.19%	132.19	132.60
`stochastic_tools/test:transfers/sampler_reporter.transfer/distributed`	3.27	4.96	+51.68%	472.19	539.42
`solid_mechanics/test:crystal_plasticity/stress_update_material_based.rotation_matrix_update_euler_angle_011_orientation`	2.57	3.88	+51.18%	120.92	127.53
`solid_mechanics/test:crystal_plasticity/cp_eigenstrains.multiple_eigenstrains`	4.14	6.23	+50.40%	134.89	122.45

roystgnr · 2026-05-05T15:36:23Z

One more thing Rochi noticed: although Codex would readily add abort statements to failure cases (e.g. default in switch statements), Claude preferred to `return 0;`

IMHO we obviously want the former, in devel/dbg modes at the very least even for device code, because at best "return 0" gives you a much later and harder to diagnose failure and at worst you silently get wrong answers, but apparently this is something agents might need instruction for.

hugary1995 · 2026-05-05T17:44:54Z

My two cents: I think part of this shouldn't be built into AGENTS.md or similar instructions. Why don't we enforce 100% code coverage, now that it's so easy for agents to do TDD? Accounting for every potential pitfall in agent instructions is just too much prompt engineering and hard to maintain in the long run.

Now, the real problem is we don't have truly flexible unit testing capability in MOOSE.

hugary1995 · 2026-05-05T17:47:26Z

> One more thing Rochi noticed: although Codex would readily add abort statements to failure cases (e.g. default in switch statements), Claude preferred to return 0;

Things like this keep evolving as LLMs iterate. I agree with you that we prefer the former, but on the other hand I'm not sure it's wise to build this into the agent instructions.

GiudGiud · 2026-05-05T17:52:52Z

> Now, the real problem is we don't have truly flexible unit testing capability in MOOSE.

Do you have an issue where we could build a wish list / reflection?
I'm already adding unit testing for actions in a PR. This will cover a lot of the parameter exception testing needs for example

hugary1995 · 2026-05-05T17:53:12Z

Another way of viewing this problem is that the agent instructions are a harness around the LLM, not the LLM itself. With that view, we should also be aware that agent instructions are one specific type of harness. We should ask the question "Is AGENTS.md the best type of harness for this?". My personal experience is that in many cases the answer is no. A simple example is our license header enforcement -- the best harness is apparently a pre-commit hook + CI enforcement.
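
As an illustration of what such a non-prompt harness can look like, here is a minimal sketch of a header check that a pre-commit hook or CI step could call. It is not MOOSE's actual enforcement script: the function names are hypothetical, and `HEADER_FIRST_LINE` is an assumed example value to be replaced with the project's real header text.

```python
from pathlib import Path

# Assumed first line of the required header; substitute the project's real text.
HEADER_FIRST_LINE = "//* This file is part of the MOOSE framework"


def files_missing_header(paths, first_line: str = HEADER_FIRST_LINE) -> list[Path]:
    """Return the files whose first line is not the expected header line."""
    missing = []
    for path in map(Path, paths):
        with path.open(encoding="utf-8", errors="replace") as handle:
            if handle.readline().rstrip("\r\n") != first_line:
                missing.append(path)
    return missing


def main(argv: list[str]) -> int:
    """Print offending files and return a shell-style exit code (0 means OK)."""
    missing = files_missing_header(argv)
    for path in missing:
        print(f"missing license header: {path}")
    return 1 if missing else 0
```

A hook would call `sys.exit(main(sys.argv[1:]))` with the staged file names. The check behaves the same whether a human or an agent wrote the file, which is the point being made about choosing the right layer.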

lindsayad · 2026-05-06T00:12:18Z

What do you think the best type of harness is for an agent that would like to write a 200-line workaround for an issue in a submodule instead of a one-line fix in the submodule? In my experience the agent instructions are an effective guard against this, and preventing it at the time of code writing is effective. It shouldn't be the job of the reviewer to point this out. I'm a little frustrated at the lack of progress on this pull request. MOOSE maintainers are going to be more and more bombarded with agent-assisted PRs over time. If we can limit the number of times a maintainer has to write "there's a lot of code duplication here; why don't you use this method that already does that" or "why'd you delete all these comments", I think that's a very good thing for preventing maintainer burnout and frustration.

I don't really care if the initial PR is perfect

hugary1995 · 2026-05-06T00:24:59Z

I think this is a very fair example of something that belongs in agent instructions. Where I’m still hesitant is calling AGENTS.md the best harness rather than one layer of the harness. For this particular case, we could probably pair AGENTS.md with other mechanisms such as static-analysis and maybe code review agents for detecting unusually large workarounds near existing APIs.

I'm not opposed to including this particular guidance. I’m more trying to avoid treating AGENTS.md as the universal solution.

You see I didn't reject the PR and request changes. We can move forward and keep improving.

lindsayad · 2026-05-06T17:41:10Z

Just waiting on an approval 😄

MengnanLi91

Looks good to me. Thanks for leading this effort! I wonder if we should add MOOSE coding style information in a future version.

lindsayad marked this pull request as ready for review April 10, 2026 19:33

moosebuild added the PR: Failed but allowed label Apr 10, 2026

moosebuild removed the PR: Failed but allowed label Apr 27, 2026

zachmprince reviewed Apr 27, 2026

View reviewed changes

Comment thread AGENTS.md Outdated

lindsayad force-pushed the agents-32497 branch from cb7578f to 2863aaa Compare April 27, 2026 23:39

moosebuild added the PR: Failed but allowed label Apr 28, 2026

Use simple AGENTS.md file provided by Andrej Karpathy

df4edab

Refs idaholab#32497

lindsayad force-pushed the agents-32497 branch from 2863aaa to df4edab Compare April 28, 2026 22:05

moosebuild removed the PR: Failed but allowed label Apr 28, 2026

GiudGiud assigned lindsayad Apr 30, 2026

More instructions aiming at simplicity

948c941

lindsayad added 3 commits May 4, 2026 15:58

Pare down

88889bf

- Remove changeset section as this is a better instruction for humans at this time
- Remove the first few lines which don't really tell the agent any helpful information

Talk about submodules

b1c2d29

Add short code comments section

827e110

moosebuild added the PR: Failed but allowed label May 5, 2026

MengnanLi91 approved these changes May 6, 2026

View reviewed changes

lindsayad merged commit a0caaa3 into idaholab:next May 6, 2026
68 checks passed

lindsayad deleted the agents-32497 branch May 6, 2026 17:50
