Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions nemo_rl/models/generation/vllm/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,10 @@ class VllmSpecificArgs(TypedDict):
# Miscellaneous top level vLLM HTTP server arguments.
# A filepath that can be imported to register a vLLM tool parser
tool_parser_plugin: NotRequired[str]
# Ensures message history over multiple turns remains constant. Useful for on-policy training, but sometimes
# we may want to drop previous reasoning (like Qwen3), agent context management,
# sliding window, or drop past environment state observations, etc
enforce_monotonicity: NotRequired[bool]
Comment on lines 42 to 45
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for enforce_monotonicity in exemplar YAMLs
rg -n 'enforce_monotonicity' examples/ -g '*.yaml' -g '*.yml'

Repository: NVIDIA-NeMo/RL

Length of output: 40


🏁 Script executed:

#!/bin/bash
# Find exemplar YAML files in the repo to understand structure
find examples/configs -type f \( -name "*.yaml" -o -name "*.yml" \) 2>/dev/null | head -20

Repository: NVIDIA-NeMo/RL

Length of output: 856


🏁 Script executed:

#!/bin/bash
# Check the config file to see how other NotRequired config keys are documented
cat -n nemo_rl/models/generation/vllm/config.py | head -80

Repository: NVIDIA-NeMo/RL

Length of output: 3068


Add enforce_monotonicity to exemplar YAMLs and document valid values and recommended default.

The new config key is missing from exemplar YAML files under examples/configs/ and its documentation is incomplete. Per coding guidelines, new TypedDict config keys must document purpose, valid values/types, and recommended default, and this default must be reflected in exemplar YAMLs.

Update the field comment to specify valid values (True/False) and recommended default, then add the key to relevant exemplar YAML files (e.g., grpo_math_*.yaml, dpo.yaml) with the recommended value.

🤖 Prompt for AI Agents
In `@nemo_rl/models/generation/vllm/config.py` around lines 42 - 45, Update the
TypedDict field comment for enforce_monotonicity in
nemo_rl/models/generation/vllm/config.py to state that it's a boolean
(True/False), describe its purpose succinctly, and specify the recommended
default (e.g., False); then add the enforce_monotonicity: false entry to the
exemplar YAMLs in examples/configs (notably the grpo_math_*.yaml variants and
dpo.yaml) so the exemplar configs reflect the documented default. Ensure the
comment mentions valid values (True/False) and the recommended default, and that
the YAML keys use the same name enforce_monotonicity with the recommended value.



class VllmConfig(GenerationConfig):
Expand Down
5 changes: 5 additions & 0 deletions nemo_rl/models/generation/vllm/vllm_worker_async.py
Original file line number Diff line number Diff line change
Expand Up @@ -330,6 +330,8 @@ def _setup_vllm_openai_api_server(self, app: FastAPI) -> FastAPI:
openai_serving_models_kwargs["model_config"] = model_config
openai_serving_models = OpenAIServingModels(**openai_serving_models_kwargs)

enforce_monotonicity = self.cfg["vllm_cfg"].get("enforce_monotonicity", True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid code-side default; rely on YAML for enforce_monotonicity.

Using .get(..., True) sets a non-None default in code. Please set the default in YAML and read the value directly (or assert presence) here.

✅ Suggested change
-        enforce_monotonicity = self.cfg["vllm_cfg"].get("enforce_monotonicity", True)
+        enforce_monotonicity = self.cfg["vllm_cfg"]["enforce_monotonicity"]

As per coding guidelines, "YAML is the single source of truth for configuration defaults. Do not set non-None defaults in code for configuration values".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
enforce_monotonicity = self.cfg["vllm_cfg"].get("enforce_monotonicity", True)
enforce_monotonicity = self.cfg["vllm_cfg"]["enforce_monotonicity"]
🤖 Prompt for AI Agents
In `@nemo_rl/models/generation/vllm/vllm_worker_async.py` at line 333, The line
that reads enforce_monotonicity =
self.cfg["vllm_cfg"].get("enforce_monotonicity", True) should not provide a
code-side default; instead read the value directly from the config and fail fast
if missing. Replace the .get call with direct indexing (e.g.,
self.cfg["vllm_cfg"]["enforce_monotonicity"]) or add an explicit assertion that
"enforce_monotonicity" exists in self.cfg["vllm_cfg"], and ensure the YAML
config defines the default there.


class NeMoRLOpenAIChatRequestMixin:
def model_post_init(self, context):
# NeMo-Gym specific processing. This is just how NeMo-Gym returns the extra token information.
Expand Down Expand Up @@ -384,6 +386,9 @@ async def _preprocess_chat(
add_special_tokens,
)

if not enforce_monotonicity:
return res

if request.required_prefix_token_ids is None:
return res

Expand Down
Loading