Conversation
Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com>
```
site/
site/
site/
site/
site/
```
site/ is duplicated 5 times
```
site/
site/
site/
site/
site/
site/
```
```
@@ -0,0 +1,377 @@
---
date: 2025-09-05
```
The date 2025-09-05 is in the future. Should be a valid past date
```python
skip = True
if skip and os.path.exists(output_path):
    os.remove(output_path)
```
The variable `skip` is hardcoded to `True`, which unconditionally deletes the output file on every run. This destroys the resume functionality mentioned on line 136. Either remove this code block or make `skip` a CLI argument (see the sketch below).
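A minimal sketch of the CLI-flag approach, assuming the script already builds an `argparse` parser; the flag name `--overwrite` is illustrative, not from the PR:

```python
import argparse
import os

parser = argparse.ArgumentParser()
# Hypothetical flag; defaults to False so the resume logic keeps working.
parser.add_argument(
    "--overwrite",
    action="store_true",
    help="Delete any existing output file instead of resuming from it.",
)
args = parser.parse_args()

output_path = "outputs/agent_responses.jsonl"  # placeholder path for the sketch
if args.overwrite and os.path.exists(output_path):
    os.remove(output_path)
# With --overwrite unset, load_processed_indices(output_path) can resume as intended.
```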
```python
from src.tools import (
    ALL_TOOLS,
    Check_Alarm_Status,
    Check_Apply_Configuration,
    Check_Element_Health,
    Check_Element_Neighbors,
    Check_External_Issues,
    Check_Performance,
    Check_remote_files,
    Create_Ticket,
    Execute_Remote_Action,
    Orchestration_tool,
    Triage_Toolkit_Tool,
)
```
These imports reference `src.tools`, but the script is located in `recipes/noc-reasoning-agent/scripts/`. Check that `PYTHONPATH` is set correctly before running; otherwise the imports will fail (see the sketch below).
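One hedged way to make the `src.tools` import resolve regardless of the working directory is a small `sys.path` bootstrap at the top of the script. The relative layout below is an assumption about where `src/` lives, not confirmed by the PR:

```python
import os
import sys

# Assumes src/ sits one directory above scripts/; adjust if the layout differs.
REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
if REPO_ROOT not in sys.path:
    sys.path.insert(0, REPO_ROOT)

from src.tools import ALL_TOOLS  # now resolvable without exporting PYTHONPATH
```

Alternatively, exporting `PYTHONPATH` to the recipe root before running has the same effect.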
```python
react_agent = create_react_agent(chat_llm, tools=[], checkpointer=checkpointer)
for index, row in tqdm(unprocessed_df.iterrows(), total=len(unprocessed_df), desc="Processing Rows"):
    # print(row)
    # exit()

    # --- Step 2: Get the row-specific tools using the helper function. ---
    # row_tools = get_bound_tools(row)

    config = {"configurable": {"thread_id": str(index)}}
    input_question = row["input"]
    system_message = row["system"]
    input_messages = [{"role": "system", "content": system_message}, {"role": "user", "content": input_question}]

    try:
        # We still invoke one-by-one, but the tool creation is now efficient.

        ## WE need to fix this. Add system message and tool calling messages
        # separate user input messages
        TOOLS = {
            "Check_Alarm_Status": lambda args: Check_Alarm_Status(**args),
            "Check_Element_Neighbors": lambda args: Check_Element_Neighbors(**args),
            "Check_Element_Health": lambda args: Check_Element_Health(**args),
            "Execute_Remote_Action": lambda args: Execute_Remote_Action(**args),
            "Check_External_Issues": lambda args: Check_External_Issues(**args),
            "Check_Apply_Configuration": lambda args: Check_Apply_Configuration(**args),
            "Check_Performance": lambda args: Check_Performance(**args),
            "Create_Ticket": lambda args: Create_Ticket(**args),
            "Orchestration_tool": lambda args: Orchestration_tool(**args),
            "Triage_Toolkit_Tool": lambda args: Triage_Toolkit_Tool(**args),
            "Check_remote_files": lambda args: Check_remote_files(**args),
        }
```
Created ReAct agent with empty `tools=[]` on line 174, then manually maps tool calls on lines 192-204. Consider using LangGraph's built-in tool binding (as shown in `create_agent_with_tools.py`) to simplify this logic; a sketch follows below.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
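A rough sketch of the built-in binding the comment refers to. The stub tool and the `chat_llm` variable are assumptions standing in for this script's real objects; treat this as an outline, not the file's actual code:

```python
from langchain_core.tools import tool
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent


@tool
def Check_Alarm_Status(site_or_element_id: str) -> str:
    """Return the alarm status for a site or element (stub for illustration)."""
    return "NotApplicable"


# chat_llm is assumed to be the LangChain chat model already built in this script.
tools = [Check_Alarm_Status]
checkpointer = MemorySaver()
# Passing the tools here lets LangGraph route tool calls itself,
# replacing the manual TOOLS-dict dispatch shown above.
react_agent = create_react_agent(chat_llm, tools=tools, checkpointer=checkpointer)
```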
Note: Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings, and resume or manage the review with the usual CodeRabbit commands and checkboxes.
📝 Walkthrough

Adds a complete NOC Reasoning Agent recipe: docs, prompts, configs, SFT data-prep, synthetic-data pipelines, domain tool stubs, ReAct agent runners (single & batch), numerous data-processing/evaluation/visualization scripts, schema/constants, and a recipe-level .gitignore; the top-level `.gitignore` also gains `site/` entries.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant DataPrep as Data Prep Pipeline
    participant SFT as SFT Config
    participant Model as Qwen Model
    participant Trainer as Training Engine
    User->>DataPrep: provide CSV incidents
    DataPrep->>DataPrep: filter, categorize, convert to JSONL
    DataPrep->>SFT: prepared JSONL manifest
    SFT->>SFT: apply filters, dedupe, shuffle, write manifest
    SFT->>Model: deliver formatted SFT manifest
    Model->>Trainer: tokenize & fine-tune (Megatron/vLLM)
    Trainer->>User: trained model artifact
```
```mermaid
sequenceDiagram
    participant Runner as Agent Runner
    participant Memory as Checkpointer
    participant LM as Language Model
    participant ToolExec as Tool Executor
    participant Tools as Domain Tools
    participant Output as JSONL Output
    Runner->>Memory: load processed indices / resume
    Runner->>LM: send system+user prompt
    LM->>Runner: return generation (tool_call or conclusion)
    alt tool_call
        Runner->>ToolExec: parse tool_call
        ToolExec->>Tools: execute bound tool(s)
        Tools->>ToolExec: return tool responses
        ToolExec->>Runner: provide observation
        Runner->>LM: append observation and continue
    else conclusion
        Runner->>Memory: mark processed
        Runner->>Output: append response (row_index, input, result)
    end
```
```mermaid
sequenceDiagram
    participant Eval as Evaluation Pipeline
    participant Metrics as Metric Calculators
    participant LLMJudge as LLM Judge
    participant Out as Output JSONL
    Eval->>Eval: read agent responses + expected
    Eval->>Metrics: compute ROUGE & BERTScore
    Metrics->>Eval: return metric scores
    Eval->>LLMJudge: request conclusion & reasoning score
    LLMJudge->>Eval: return score + rationale
    Eval->>Out: write augmented JSONL with metrics and judges
```
Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

Suggested reviewers

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

✅ Passed checks (2 passed)
Actionable comments posted: 7
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py`:
- Around line 132-135: The code currently hardcodes skip = True which causes
deletion of output_path every run and breaks resume logic; change this so skip
is not hardcoded—make it a CLI flag, environment variable, or default False—and
only call os.remove(output_path) when skip is explicitly set by the user; ensure
the block around skip, os.path.exists(output_path), and os.remove(output_path)
is gated by the configurable skip variable and leave
load_processed_indices(output_path) unchanged so resume logic
(load_processed_indices / processed_indices) can work when skip is False.
- Around line 156-157: The call to AutoModelForCausalLM.from_pretrained in
create_agent_with_tools_batch.py is using the wrong parameter name (dtype=) so
the intended precision is ignored; update the call to pass the dtype via
torch_dtype (e.g., torch_dtype=dtype) so the model loads in the specified
bfloat16/float16 precision, ensuring you modify the
AutoModelForCausalLM.from_pretrained(...) invocation accordingly.
In `@recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py`:
- Line 12: The parser.add_argument("--output_file", help="Path to output") can
leave args.output_file as None and cause pd.DataFrame.to_json(args.output_file,
...) to raise a ValueError; update the argument declaration
(parser.add_argument) to either set required=True or provide a sensible default
(e.g., default="evaluation_results.json"), and/or add a guard before writing
that validates args.output_file is not None and raises a clear error message;
ensure references to args.output_file and the DataFrame.to_json call are updated
accordingly.
In `@recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py`:
- Around line 153-166: The code currently uses two regex matches (m and
m_no_args) and then unconditionally calls m.groups(), which creates dead code
and a latent crash; instead, after performing the initial match m =
re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", ... , re.DOTALL), check if m is
truthy and if not immediately return the original string (or handle the error),
and only then call m.groups() to assign tool_name and raw_args; remove the
separate m_no_args branch entirely so empty brackets are handled by the first
regex and m.groups() is always called on a valid match.
In `@recipes/noc-reasoning-agent/scripts/utils/token_usage.py`:
- Around line 207-276: The __main__ block contains leftover debug logic that
hardcodes csv_file/model, calls AutoTokenizer directly (unsafe if transformers
missing), and calls exit() before main(), making main() unreachable; remove or
replace the entire if __name__ == "__main__": block so it simply parses
CLI/default args and invokes main(), ensure any tokenizer creation uses the
existing guarded import/creation logic (e.g., reuse the module's
AutoTokenizer-safe factory or wrap AutoTokenizer.from_pretrained in the same
try/except used elsewhere), eliminate hardcoded paths and stray exit() calls,
and wire csv_file, column_name, model, output_csv, output_plot into main()
parameters or argparse so the script runs via main() correctly.
In `@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`:
- Around line 26-71: The function parse_final_reasoning contains an early return
and debug prints that make the intended parsing logic unreachable; remove the
premature return and the print statements and restructure the function to (1)
first find the final "Finish" block using trace_block_match or last_index as a
fallback, (2) set finish_text deterministically (return a clear error string if
neither is found), (3) run the step_pattern regex against that finish_text to
extract Thought/Action/Observation entries, format them into parsed_steps, and
finally return the joined string; refer to parse_final_reasoning,
trace_block_match, last_index, finish_text, step_pattern, and parsed_steps when
making these changes.
- Around line 74-88: The function find_finish_action assigns finish_text only
when last_index != -1 but always returns finish_text, causing UnboundLocalError
when "Finish" isn't found; update find_finish_action to handle the not-found
case by returning a safe default (e.g., None or empty string) or raising a clear
exception, and ensure finish_text is defined in both branches (use last_index,
generation_text, and finish_text to locate/construct the return value or the
fallback).
🟠 Major comments (22)
recipes/noc-reasoning-agent/scripts/ns_pipelines/generate_synthetic_data.py-33-34 (1)
33-34: ⚠️ Potential issue | 🟠 Major

Stub silently does nothing — user gets no feedback when using a non-default `--llm`.

If a user passes `--llm` with any value other than `"qwen2.5-32b-instruct"`, `generate_synthetic_data_oss_gpt` is called and silently completes without action. This violates the principle that errors should not pass silently. Either raise `NotImplementedError` or restrict the `--llm` choices.

♻️ Suggested fix (option A: raise error)

```diff
 def generate_synthetic_data_oss_gpt(args, cluster, num_gpus):
-    pass
+    raise NotImplementedError("OSS GPT generation path is not yet implemented")
```

♻️ Suggested fix (option B: restrict choices)

```diff
-    parser.add_argument("--llm", type=str, default="qwen2.5-32b-instruct", help="The LLM to use for generation")
+    parser.add_argument(
+        "--llm", type=str, default="qwen2.5-32b-instruct",
+        choices=["qwen2.5-32b-instruct"],
+        help="The LLM to use for generation",
+    )
```

As per coding guidelines, "Follow the Zen of Python principles: … ensure errors never pass silently unless explicitly silenced".
Also applies to: 57-62
recipes/noc-reasoning-agent/scripts/ns_pipelines/generate_synthetic_data.py-43-55 (1)
43-55: ⚠️ Potential issue | 🟠 Major

`--gpus` argument is defined but never used; `--num_gpus` duplicates its purpose.

A user setting `--gpus 4` would have no effect — only `--num_gpus` is read (line 54). Additionally, line 55 misleadingly prints `"specified via --gpus"`. Remove the unused `--gpus` argument.

♻️ Suggested fix

```diff
-    # parser.add_argument('--gpus', type=int, default=detect_num_gpus(),
-    #                     help='Number of GPUs to use for generation')
-    parser.add_argument("--gpus", type=int, default=8, help="Number of GPUs to use for generation")
     parser.add_argument("--llm", type=str, default="qwen2.5-32b-instruct", help="The LLM to use for generation")
     parser.add_argument(
         "--num_gpus", type=int, default=8, help="Number of GPUs to use (auto-detected if not specified)"
     )
     args = parser.parse_args()

     cluster = "local"
     num_gpus = args.num_gpus
-    print(f"Using {num_gpus} GPUs (specified via --gpus)")
+    print(f"Using {num_gpus} GPUs (specified via --num_gpus)")
```

As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-47-56 (1)
47-56: 🛠️ Refactor suggestion | 🟠 Major

Compute all enrichment before writing the output file.
The current code reads, enriches, and writes line-by-line in a single pass. If an error occurs mid-processing (e.g., malformed JSON on line N), a partial output file is left behind. Separate the read/compute phase from the write phase.
Proposed fix
```diff
-enriched_rows = []
-with open(jsonl_path, "r", encoding="utf-8") as fin, open(output_jsonl_path, "w", encoding="utf-8") as fout:
+enriched_rows = []
+with open(jsonl_path, "r", encoding="utf-8") as fin:
     for line in fin:
         if not line.strip():
             continue
         row = json.loads(line)
         row = enrich_row(row, csv_idx)
         enriched_rows.append(row)
-        fout.write(json.dumps(row, ensure_ascii=False) + "\n")
+
+with open(output_jsonl_path, "w", encoding="utf-8") as fout:
+    for row in enriched_rows:
+        fout.write(json.dumps(row, ensure_ascii=False) + "\n")
```

As per coding guidelines, "perform all computations before re-opening files for writing to avoid accidental data loss if code fails during execution".
recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py-200-212 (1)
200-212: 🛠️ Refactor suggestion | 🟠 Major

Use direct dictionary access instead of `.get()` — key is guaranteed to exist.

`"Service-off"` is defined in `PROBLEM_CODE_REASONING_PROCESS` at line 2, so `.get("Service-off", "")` masks a potential future error if the key were removed. Direct access `["Service-off"]` will fail fast with a clear `KeyError`.

♻️ Proposed fix

```diff
-DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS.get("Service-off", "")
+DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS["Service-off"]

 SYNTHETIC_FAULT_CATEGORY_REASONING = {
-    "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
+    "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
 }
```

As per coding guidelines, "Do not use `.get()` for accessing dictionary keys if the code expects them to be present; use direct dictionary access `dict[key]` instead to allow proper error handling and fail fast with clear errors".

recipes/noc-reasoning-agent/scripts/visualization/extract_scores.py-35-42 (1)
35-42: 🛠️ Refactor suggestion | 🟠 Major

File is read and parsed twice; reuse the existing DataFrame.

The JSONL file is loaded identically at lines 9–12 and 36–39, and `df` from line 15 already holds the same data. Remove the second read and reuse the existing `df`.

♻️ Proposed fix

```diff
-# Load JSONL file into a list
-data = []
-with open(file_path, "r", encoding="utf-8") as f:
-    for line in f:
-        data.append(json.loads(line))
-
-# Convert to DataFrame
-df = pd.DataFrame(data)
-
 # Select only the relevant columns
 columns_to_display = ["expected_answer", "agent_response", "llm_judge_reason"]
```

recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py-121-125 (1)
121-125: ⚠️ Potential issue | 🟠 Major

BERTScore is computed per-row — extremely slow due to repeated model loading.

`bert_score()` is called once per row (line 122), which reloads the model each time. For a dataset of any meaningful size, this will be orders of magnitude slower than a single batched call. Collect all pairs first and call `bert_score()` once.

⚡ Sketch of batched approach

```python
# Collect all pairs first
references = []
candidates = []
for index, row in df.iterrows():
    # ... extract generated_reasoning_part and expected content ...
    candidates.append(generated_reasoning_part)
    references.append(conclusion_expected + reasoning_expected)

# Single batched BERTScore call
P, R, F1 = bert_score(candidates, references, lang="en", verbose=True)

# Then iterate again to assemble output rows with F1[i].item()
```

recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py-221-293 (1)
221-293: 🛠️ Refactor suggestion | 🟠 Major

File is read twice and contains ~35 lines of commented-out code.

The input file is read manually (lines 224–226) and then again via `pd.read_json` (line 233). The first read is entirely superseded by the DataFrame path (line 293 overwrites `data`). Remove the first read. Also, the large commented-out `selection_criteria` block (lines 246–280) and scattered debug comments should be cleaned up before merging.

recipes/noc-reasoning-agent/scripts/utils/split_incident_data.py-28-29 (1)
28-29: ⚠️ Potential issue | 🟠 Major

O(n) membership checks on lists — use sets for train/test partitions.

`train_set` and `test_set` are lists, so `number in train_set` (line 42) and `number in test_set` (line 57) are O(n) per lookup. With potentially thousands of incidents across multiple iteration files, this becomes a significant bottleneck.

⚡ Proposed fix

```diff
-    train_set = incidents[:n_train]
-    test_set = incidents[n_train:]
+    train_set = set(incidents[:n_train])
+    test_set = set(incidents[n_train:])
```
recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py-192-218 (1)
192-218:⚠️ Potential issue | 🟠 MajorUnescaped values injected into HTML summary.
Lines 198–199 insert
incident_data.get(...)values directly into the HTML without callingesc(), unlike the rest of the template. This is an XSS vector if incident data contains HTML-special characters.🛡️ Proposed fix
- <li>Category: {incident_data.get("incident_classification", incident_data.get("category"))}</li> - <li>Problem Code: {incident_data.get("fault_category", incident_data.get("u_problem_code"))}</li> + <li>Category: {esc(incident_data.get("incident_classification", incident_data.get("category")))}</li> + <li>Problem Code: {esc(incident_data.get("fault_category", incident_data.get("u_problem_code")))}</li>recipes/noc-reasoning-agent/prompts/prompt_incident.yaml-50-51 (1)
50-51:⚠️ Potential issue | 🟠 Major
\thinkshould be<think>to match the pipeline's expected tag format.The YAML prompt files use
\think, butsplit_incident_data.py(line 67) injectsbackground = "\n<think>\n"with angle brackets. The training configuration (noc_reasoning_sft.yaml,noc_reasoning_sft_6.yaml) also filters for and expects XML-style<think>and</think>tags (e.g.,"<think>thinking</think>summary"). In YAML block scalars,\thinkis treated as literal text, not an escape sequence, so it won't match the<think>tags injected by the pipeline. This issue also appears inprompt_reasoning.yaml(line 166) andshortened_prompt_reasoning.yaml(line 147).recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py-9-10 (1)
9-10:⚠️ Potential issue | 🟠 MajorRelative imports require package initialization; this will fail with ImportError.
from .reasoning_processes import ...andfrom .schema_columns import ...(lines 9-10) perform relative imports, which require the parent directory to be a Python package (with__init__.py). There is no__init__.pyinrecipes/noc-reasoning-agent/scripts/utils/,scripts/, or any parent directory. Running the script directly withpython create_input_jsonl_from_incidents.pywill raiseImportError: attempted relative import with no known parent package.All other scripts in this repository use absolute imports; switch to those or add
__init__.pyfiles to makeutils/andscripts/proper packages.recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py-93-98 (1)
93-98: 🛠️ Refactor suggestion | 🟠 MajorDuplicated CSV loading with overly broad
except Exception— extract a helper.The
try: read_csv(encoding="latin1") except Exception: read_csv()pattern is repeated four times. The broadexcept Exceptioncatches everything (e.g.,FileNotFoundError,MemoryError) when only encoding errors are expected.Extract a helper and narrow the exception:
Suggested refactor
+def _load_csv(path: str) -> pd.DataFrame: + """Load CSV, falling back from latin1 to default encoding.""" + try: + return pd.read_csv(path, encoding="latin1") + except UnicodeDecodeError: + return pd.read_csv(path) + + def filter_auto(input_csv, output_csv): """Filter incident data and save results to a new CSV.""" - try: - df = pd.read_csv(input_csv, encoding="latin1") - except Exception: - df = pd.read_csv(input_csv) + df = _load_csv(input_csv)As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."
Also applies to: 116-121, 131-135, 147-151
recipes/noc-reasoning-agent/scripts/tools.py-13-20 (1)
13-20:⚠️ Potential issue | 🟠 Major
Check_Alarm_Statusis missing the_safe_tool_valwrapper, unlike every other tool function.All other tool functions pass through
_safe_tool_val()to sanitize NaN/None values, butCheck_Alarm_Statusreturns the raw.get()result directly. If the row containsNaNfor the"Check_Alarm_Status"key,.get()returns theNaNvalue (since the key exists), which will not be JSON-safe.def Check_Alarm_Status(row: pd.Series, site_or_element_id: str) -> str: ... - return row.get("Check_Alarm_Status", "NotApplicable") + return _safe_tool_val(row.get("Check_Alarm_Status", "NotApplicable"))recipes/noc-reasoning-agent/scripts/ns_pipelines/prepare_react_agent.py-30-30 (1)
30-30: ⚠️ Potential issue | 🟠 Major

Replace `assert` with a proper check for runtime validation.

`assert` statements are stripped in optimized mode (`python -O`), so this validation would silently disappear. Use a conditional raise instead.

```diff
-    assert len(tool_calls) == len(tool_response)
+    if len(tool_calls) != len(tool_response):
+        raise ValueError(
+            f"Mismatched tool_call/tool_response counts: {len(tool_calls)} vs {len(tool_response)}"
+        )
```

As per coding guidelines, "ensure errors never pass silently unless explicitly silenced."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-28-35 (1)
28-35: ⚠️ Potential issue | 🟠 Major

Silent `except Exception: continue` swallows all errors during resume loading.

This catches and silently ignores any exception (including unexpected ones like `PermissionError`, `MemoryError`, etc.) while parsing the output file. Per the coding guidelines, don't catch exceptions not normally expected. The batch variant (`create_agent_with_tools_batch.py` line 69) correctly catches only `json.JSONDecodeError`.

```diff
-        except Exception:
-            # ignore malformed lines so we can keep going
-            continue
+        except json.JSONDecodeError:
+            print(f"[WARNING] Skipping malformed line in output file: {line}")
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-92-211 (1)
92-211: ⚠️ Potential issue | 🟠 Major

Heavy objects recreated on every loop iteration — significant performance waste.

`llm_with_tools`, `MemorySaver()`, and `create_react_agent(...)` are all created inside the per-row loop (lines 209-211). The tool closures are also redefined on every iteration (lines 102-205). Since the model and LLM wrappers don't change between rows, the agent and tool binding should be constructed once before the loop. The tool functions can capture `row` via a factory function to avoid the B023 closure issue flagged by static analysis, while still being defined once.

Suggested restructuring (sketch)

```diff
+    # Build agent once outside the loop
+    checkpointer = MemorySaver()
+
     for index, row in tqdm(df.iterrows()):
         if index in processed_indices:
             continue
         input_question = row.get("input", "")
         print(f"Processing input {index + 1}/{len(df)}")
-        @tool
-        def Check_Alarm_Status(site_or_element_id: str) -> str:
-            ...
-        # ... all tool definitions ...
-
-        tools = [...]
-        llm_with_tools = chat_llm.bind_tools(tools)
-        checkpointer = MemorySaver()
-        react_agent = create_react_agent(llm_with_tools, tools, checkpointer=checkpointer)
+        # Define row-bound tools using a factory or partial
+        tools = _make_tools(row)
+        llm_with_tools = chat_llm.bind_tools(tools)
+        react_agent = create_react_agent(llm_with_tools, tools, checkpointer=checkpointer)
```

Note: Reusing the `MemorySaver` outside the loop is safe; the `thread_id` in `config` already isolates per-row state.

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-10-12 (1)
10-12: ⚠️ Potential issue | 🟠 Major

`_incident_id` can silently return `None`, corrupting downstream data.

If neither `incident_identifier` nor `number` is present, this returns `None`. That `None` is then used as a dictionary key throughout (`responses[number]`, `formatted_steps_taken[number]`, `full_steps[number]`), silently merging unrelated incidents under a single `None` key. This is a data integrity issue.

Proposed fix

```diff
 def _incident_id(data):
     """Synthetic schema uses incident_identifier; legacy uses number."""
-    return data.get("incident_identifier") or data.get("number")
+    incident_id = data.get("incident_identifier") or data.get("number")
+    if incident_id is None:
+        raise ValueError(f"Missing incident identifier in data: {list(data.keys())}")
+    return incident_id
```

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-63-69 (1)
63-69: 🛠️ Refactor suggestion | 🟠 Major

Bare `Exception` catch silently swallows unexpected errors.

Line 66 catches all exceptions and just prints them, masking bugs in the parsing logic. Per coding guidelines, exceptions that are not normally expected should not be caught — let them propagate so issues surface clearly. The outer `JSONDecodeError` catch on line 68 is fine.

Proposed fix — remove the bare catch

```diff
     except json.JSONDecodeError as e:
         print(text)
         print(f"Error decoding JSON: {e}")
-    except Exception as e:
-        print(f"An unexpected error occurred: {e}")
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
except json.JSONDecodeError as e: print(text) print(f"Error decoding JSON: {e}") - except Exception as e: - print(f"An unexpected error occurred: {e}")As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-314-323 (1)
314-323: 🛠️ Refactor suggestion | 🟠 MajorBroad
except Exceptionsilently discards incidents on any error.This catches and suppresses all errors — including programming bugs in
token_convertingor data access issues — and quietly removes the affected incidents from the output. This makes debugging very difficult and violates the fail-fast principle.Consider catching only the specific expected exception (e.g.,
KeyError,ValueError) or at minimum, logging the incident data that caused the failure.Proposed fix
try: for i in range(len(steps_taken[number])): if steps_taken[number][i]["tool_call"]: steps_taken[number][i]["tool_call"] = token_converting( steps_taken[number][i]["tool_call"], model ) steps_taken[number][i]["thinking"] = reasoning_steps[number][steps_taken[number][i]["step_number"]] - except Exception as e: - print(e) + except (KeyError, ValueError) as e: + print(f"Error merging steps for incident {number}: {e}") broken_numbers.append(number)As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-445-448 (1)
445-448: ⚠️ Potential issue | 🟠 Major

Case C silently drops intermediate non-tool-call steps — likely a bug.

When a step has no `tool_call` and is not the final step, `step_text` is computed but never appended to `current_assistant_content`. The comment on line 447 states "We already added to current_assistant_content at the top of loop" but no such addition exists. This means intermediate reasoning steps are lost from the conversation context, and their thinking/actions won't appear in subsequent `background` fields.

If these steps should accumulate context, the content needs to be appended:

Proposed fix (if accumulation is intended)

```diff
             # --- CASE C: Intermediate Step (just accumulation) ---
             else:
-                # We already added to current_assistant_content at the top of loop
-                pass
+                # Accumulate intermediate thinking into the conversation context
+                current_assistant_content.append(
+                    {"role": "assistant", "content": step_text}
+                )
```

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-85-87 (1)
85-87: ⚠️ Potential issue | 🟠 Major

Silent data corruption when the `"final<|message|>"` delimiter is missing.

If `text` does not contain `"final<|message|>"`, `rfind` returns `-1`, and the slice becomes `text[15:]` (i.e., `-1 + 16`), silently producing garbage data for the thinking content with no warning.

Proposed fix

```diff
-            thinking = text[text.rfind("final<|message|>") + len("final<|message|>") :]
+            delimiter = "final<|message|>"
+            idx = text.rfind(delimiter)
+            if idx == -1:
+                print(f"Warning: delimiter '{delimiter}' not found for incident {number}, step {step_number}")
+                continue
+            thinking = text[idx + len(delimiter) :]
```

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-476-489 (1)
476-489: 🛠️ Refactor suggestion | 🟠 Major

Same broad `except Exception` pattern — masks bugs during compilation.

Same issue as in `merge_reasoning_steps`. This silently counts programming errors as "incorrect incidents" and continues, making it hard to distinguish data issues from code bugs.

Proposed fix

```diff
-        except Exception as e:
-            print(f"Error for incident {number}: {e}")
+        except (KeyError, ValueError) as e:
+            print(f"Error for incident {number}: {e}", flush=True)
             incorrect_incidents += 1
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
- except Exception as e: - print(f"Error for incident {number}: {e}") + except (KeyError, ValueError) as e: + print(f"Error for incident {number}: {e}", flush=True) incorrect_incidents += 1As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
🟡 Minor comments (29)
recipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-94-94 (1)
94-94: ⚠️ Potential issue | 🟡 Minor

Output key `"problem code"` contains a space, inconsistent with all other keys.

All other output keys use underscores (e.g., `u_region2`, `short_description`). This looks like an accidental deviation. If intentional (matching downstream expectations), please confirm; otherwise rename to `"problem_code"`.

```diff
-            "problem code": d1.get("u_problem_code", ""),
+            "problem_code": d1.get("u_problem_code", ""),
```

recipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-60-71 (1)
60-71: ⚠️ Potential issue | 🟡 Minor

Same `.get("number")` and silent-skip concern in file2 loading.

Apply the same fix as suggested for the file1 block above.

Suggested fix

```diff
                 d = json.loads(line)
-                num = d.get("number")
-                if num:
-                    data2[num] = d.get("generation", "")
-            except json.JSONDecodeError as e:
-                print(f"Error decoding JSON in file2 on line {line_num}: '{line}'. Error: {e}")
+                num = d["number"]
+                data2[num] = d.get("generation", "")
+            except (json.JSONDecodeError, KeyError) as e:
+                raise ValueError(f"Error processing file2 on line {line_num}: '{line}'. Error: {e}") from e
```

recipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-46-57 (1)
.get().Lines 53 and 67 use
d.get("number")then silently skip records where the key is missing. If"number"is a required field in these JSONL files, direct access (d["number"]) would fail fast with a clearKeyError, which is preferable. Additionally, theJSONDecodeErrorcatch-and-print on lines 56–57 (and 70–71) silently drops malformed lines, which can mask data corruption.Suggested fix for file1 loading
d = json.loads(line) - num = d.get("number") - if num: - data1[num] = d - except json.JSONDecodeError as e: - print(f"Error decoding JSON in file1 on line {line_num}: '{line}'. Error: {e}") + num = d["number"] + data1[num] = d + except (json.JSONDecodeError, KeyError) as e: + raise ValueError(f"Error processing file1 on line {line_num}: '{line}'. Error: {e}") from eAs per coding guidelines, "Do not use
.get()for accessing dictionary keys if the code expects them to be present" and "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-60-64 (1)
60-64:⚠️ Potential issue | 🟡 MinorMisleading section comment: says
solved_categorybut code usesresolution_method/close_code.Line 60 says "Bucket by solved_category" but line 63 derives the category from
resolution_methodorclose_code, notsolved_category. This will confuse future readers.Proposed fix
-# ---------- Bucket by solved_category ---------- +# ---------- Bucket by resolution_method / close_code ----------recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-32-44 (1)
32-44:⚠️ Potential issue | 🟡 MinorUse direct dictionary access for guaranteed keys; validate
numis not None.Two issues in
enrich_row:
Line 33: If the row contains neither
"incident_identifier"nor"number",numsilently becomesNone, and the lookup silently fails — producing records withNonefields and no indication of a data problem. This should raise an error.Lines 41-43:
infois sourced fromcsv_idx(built on line 28 from the validated columnstime_to_resolve,solved_category,solved_reason), so these keys are guaranteed to exist. Per coding guidelines, use direct dictionary access instead of.get().Proposed fix
def enrich_row(row, lookup): - num = row.get("incident_identifier", row.get("number")) + if "incident_identifier" in row: + num = row["incident_identifier"] + elif "number" in row: + num = row["number"] + else: + raise KeyError(f"Row missing both 'incident_identifier' and 'number': {row}") info = lookup.get(num) if info is None: - # No match: set to None (or choose sensible defaults) row["time_to_resolve"] = None - row["solved_category"] = row.get("solved_category") # preserve if already present - row["solved_reason"] = row.get("solved_reason") + row["solved_category"] = row.get("solved_category") + row["solved_reason"] = row.get("solved_reason") else: - row["time_to_resolve"] = info.get("time_to_resolve") - row["solved_category"] = info.get("solved_category") - row["solved_reason"] = info.get("solved_reason") + row["time_to_resolve"] = info["time_to_resolve"] + row["solved_category"] = info["solved_category"] + row["solved_reason"] = info["solved_reason"] return rowAs per coding guidelines, "Do not use
.get()for accessing dictionary keys if the code expects them to be present; use direct dictionary accessdict[key]instead to allow proper error handling and fail fast with clear errors" and "ensure errors never pass silently unless explicitly silenced"..gitignore-52-56 (1)
52-56: ⚠️ Potential issue | 🟡 Minor

Remove duplicate `site/` entries.

Five identical `site/` ignore patterns were added. Git only needs one entry per pattern, so the duplicates serve no purpose and suggest an accidental copy-paste or merge error.

🧹 Proposed fix to remove duplicates

```diff
 .idea
 site/
-site/
-site/
-site/
-site/
```

recipes/noc-reasoning-agent/configs/config.ini-1-10 (1)
1-10: ⚠️ Potential issue | 🟡 Minor

Config references appear stale relative to the tutorial.

The `[download]` section lists `qwen2.5-14`, `qwen2.5-32`, `gpt-oss-120b`, and `nemotron-49b-1.5`, but the tutorial (noc-reasoning-agent.md) trains and evaluates with Qwen3-32B — which is absent here. Similarly, `original_data_path` points to `data/anonymized-Incidents_Last_6_Months.csv` while the tutorial instructs users to place data as `data/synthetic_incidents.csv`. If this config is meant to be consumed by scripts in this recipe, the mismatches could confuse users or break automation.
37-37:⚠️ Potential issue | 🟡 MinorMalformed sentence in
Create_Ticketdescription.
"Logs and routes tickets to departments (e.g., Field Ops) via the ticketing system. It (e.g., Jira/ServiceNow)."— the second sentence is a fragment with a dangling "It" and double space. This is part of the prompt the model will see.docs/tutorials/posts/noc-reasoning-agent.md-1-4 (1)
1-4: ⚠️ Potential issue | 🟡 Minor

Incorrect or backdated publication date.

`date: 2025-09-05` predates the PR creation (Feb 2026). If this is intentional for ordering purposes, a comment would help; otherwise, update to the actual publication date.
51-51: ⚠️ Potential issue | 🟡 Minor

Typo in tool name: `Check_Performancel()` → `Check_Performance()`.
docs/tutorials/posts/noc-reasoning-agent.md-352-365 (1)
352-365:⚠️ Potential issue | 🟡 MinorMissing expected evaluation results for tested models.
The tutorial says "You should see a meaningful improvement" but doesn't provide concrete accuracy numbers or expected ranges. Including representative results (e.g., baseline vs. fine-tuned close-code accuracy on a reference dataset) would make the tutorial more actionable and help users validate their setup. As per coding guidelines, "When adding new benchmarks, add documentation with example commands for how to run evaluation, expected results for tested models, and any dataset-specific details".
recipes/noc-reasoning-agent/prompts/formatting_prompt.yaml-17-17 (1)
17-17:⚠️ Potential issue | 🟡 MinorStray character in tool description.
Line 17 ends with a dangling "I" after the period:
"...cell status). I". This will be seen by the model and may degrade prompt quality.✏️ Proposed fix
- - Purpose: Polls to monitor the system, EMS/NM/Telemetry, and retrieve device health metrics (e.g., RU radiation, cell status). I + - Purpose: Polls to monitor the system, EMS/NM/Telemetry, and retrieve device health metrics (e.g., RU radiation, cell status).docs/tutorials/posts/noc-reasoning-agent.md-229-244 (1)
229-244: ⚠️ Potential issue | 🟡 Minor

Typo: "indicence" → "incidence".

`--log_dir=/workspace/prepare-sft-data-indicence` (line 231) and the equivalent at line 296 both contain the same typo.

✏️ Proposed fix

```diff
-    --log_dir=/workspace/prepare-sft-data-indicence \
+    --log_dir=/workspace/prepare-sft-data-incidence \
```
96-118:⚠️ Potential issue | 🟡 MinorIn-place file overwriting risks data loss.
The second, third, and fourth
filter_rows.pycalls use the same path for both--input_csvand--output_csv(data/filtered_file.csv). If the script reads the CSV lazily or fails mid-write, data is lost with no backup. Consider using distinct intermediate filenames or documenting that the script fully reads before writing.recipes/noc-reasoning-agent/scripts/visualization/extract_scores.py-44-53 (1)
44-53:⚠️ Potential issue | 🟡 MinorMissing guard for display columns — will crash with
KeyErrorif columns are absent.The metrics section (lines 20–24) defensively checks which columns exist, but
columns_to_displayis accessed directly without validation. If the JSONL doesn't contain one of["expected_answer", "agent_response", "llm_judge_reason"], this will raise an unguardedKeyError.🛡️ Proposed fix
columns_to_display = ["expected_answer", "agent_response", "llm_judge_reason"] +available_display = [c for c in columns_to_display if c in df.columns] +if not available_display: + print("No display columns found; skipping sample output.") +else: -df_subset = df[columns_to_display].head(10) + df_subset = df[available_display].head(10)recipes/noc-reasoning-agent/scripts/utils/split_incident_data.py-58-62 (1)
58-62:⚠️ Potential issue | 🟡 MinorDebug prints left in production code.
Lines 59 (
print(row["response"])), 60 (# exit()), and 62 (print("in")) are debug artifacts that will produce excessive output and should be removed.recipes/noc-reasoning-agent/prompts/prompt_incident.yaml-48-48 (1)
48-48:⚠️ Potential issue | 🟡 MinorClose code list contains both "Reset RU" and "RU Reset".
These appear to be duplicates with reversed word order. This could confuse the model during inference. Consider keeping only one canonical form.
recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py-144-144 (1)
144-144:⚠️ Potential issue | 🟡 MinorDebug
print(required_incidents)left in.This prints the full list of required incident IDs to stdout, which is noisy and potentially leaks data in production pipelines.
recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py-148-149 (1)
148-149: ⚠️ Potential issue | 🟡 Minor

Misleading success message.

Line 149 prints `"Evaluations saved to evaluations.jsonl"` regardless of the actual output path. Use the actual path:

🐛 Fix

```diff
-print("Evaluations saved to evaluations.jsonl")
+print(f"Evaluations saved to {args.output_file}")
```
96-99:⚠️ Potential issue | 🟡 Minor
--previewargument is parsed but never used.The
previewargument (line 97) is accepted but has no effect. Either implement the preview functionality or remove the argument.As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/filtering/match_keywords.py-254-261 (1)
254-261:⚠️ Potential issue | 🟡 MinorUnused
patternparameter – compiled regex patterns are dead code.
find_keyword_matchesaccepts apatternargument (the compiled regex) but never uses it. The regex patterns compiled on lines 281-282 are passed intofind_keyword_matcheson lines 290 and 296 but silently discarded. The function only performs an exactincheck against thekeywordslist.Either remove the
patternparameter and the regex compilation, or refactor the function to actually use the regex for substring matching (which would change semantics — matching keywords within a field vs. exact match).Option A: Remove unused pattern (keeps current exact-match behavior)
-def find_keyword_matches(row, pattern, keywords): +def find_keyword_matches(row, keywords): """Finds which specific keywords from a list match within a DataFrame row.""" # Synthetic schema uses resolution_method (same meaning as close_code) resolution_method = row.get("resolution_method", row.get("close_code", "")) if resolution_method and str(resolution_method).lower().replace(" ", "") in keywords: return True return FalseAnd update the call sites:
- hw_pattern = re.compile("|".join(re.escape(k) for k in hw_keywords), flags=re.IGNORECASE) - sw_pattern = re.compile("|".join(re.escape(k) for k in sw_keywords), flags=re.IGNORECASE) - ... - hw_matches = find_keyword_matches(row, hw_pattern, hw_keywords) + hw_matches = find_keyword_matches(row, hw_keywords) ... - sw_matches = find_keyword_matches(row, sw_pattern, sw_keywords) + sw_matches = find_keyword_matches(row, sw_keywords)Also applies to: 280-296
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-46-46 (1)
46-46:⚠️ Potential issue | 🟡 MinorMissing
required=Trueon--inputargument.
--inputhas no default and norequired=True, soargs.inputwill beNoneif omitted, causing aTypeErroronPath(args.input)at line 59.- parser.add_argument("--input", help="Path to the JSONL file containing inputs and answers") + parser.add_argument("--input", required=True, help="Path to the JSONL file containing inputs and answers")As per coding guidelines, "the code should fail if a required argument is not specified."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py-126-126 (1)
126-126:⚠️ Potential issue | 🟡 Minor
--limitargument lackstype=int, will fail with a confusing error on non-numeric input.
args.limitis a string by default. Line 141 doesint(args.limit), which raisesValueErrorfor non-numeric input. Declare the type on the argument:- parser.add_argument("--limit", required=False) + parser.add_argument("--limit", type=int, required=False, default=None)recipes/noc-reasoning-agent/configs/noc_reasoning_sft.yaml-58-58 (1)
58-58: ⚠️ Potential issue | 🟡 Minor

Comment contradicts the value: `remove_no_think_tags: false` but comment says "Enabled, as requested".

The value is `false` (disabled), but the comment reads "Enabled, as requested". This is confusing and likely a copy-paste error. If the filter should be enabled, change to `true`; otherwise, fix the comment.

```diff
-  remove_no_think_tags: false  # Enabled, as requested
+  remove_no_think_tags: true  # Enabled, as requested
```
180-188:⚠️ Potential issue | 🟡 MinorInvalid
filter_typesilently does nothing.If the user passes
--filter_type=foo, none of theelifbranches match and the script exits silently with no output. Per coding guidelines, unsupported arguments should cause a failure.Suggested fix
elif args.filter_type == "close_codes": filter_close_codes(args.input_csv, args.output_csv) + else: + parser.error(f"Unknown filter_type: {args.filter_type!r}. Choose from: auto, soft_solve, problem_codes, close_codes")Or better, use
choiceson the argument itself:- parser.add_argument("--filter_type", type=str, default="auto") + parser.add_argument("--filter_type", type=str, default="auto", + choices=["auto", "soft_solve", "problem_codes", "close_codes"])As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/ns_pipelines/prepare_react_agent.py-22-52 (1)
22-52:⚠️ Potential issue | 🟡 Minor
get_toolshas an inconsistent return type: tuple(None, None)vs. dict.On the error path (line 50), the function returns
(None, None)(a 2-tuple), but on the success path (line 52), it returnsmatches(a dict). The caller on line 73 assigns the result to a single variable and compares with== (None, None), which works incidentally but is fragile and confusing.Return
Noneon the empty case and checkif matches is Noneat the call site.Suggested fix
if not matches: - # print("No tools!") - return None, None - # print(matches) + return None return matchesAnd update the caller:
- matches = get_tools(d["initial_background"]) - if matches == (None, None): + matches = get_tools(d["initial_background"]) + if matches is None:recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-130-130 (1)
130-130: ⚠️ Potential issue | 🟡 Minor

Typo in user-facing message: "disgarded" → "discarded".

```diff
-    print(f"{incorrect_incidents} incidents were not parsed correctly and disgarded.")
+    print(f"{incorrect_incidents} incidents were not parsed correctly and discarded.")
```
46-52:⚠️ Potential issue | 🟡 MinorOff-by-one in
end_indexcheck:]-not-found case passes the guard.When
"]"is absent,rfindreturns-1, soend_indexbecomes0after the+ 1. The guardend_index != -1on line 52 evaluates toTrue, allowing a nonsensical slicetext[start_index:0](empty string) that falls through to aJSONDecodeErrorinstead of the intended error message on line 61.Proposed fix
- end_index = text.rfind("]") + 1 + end_index_raw = text.rfind("]") + end_index = end_index_raw + 1 - if start_index != -1 and end_index != -1: + if start_index != -1 and end_index_raw != -1:recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-505-511 (1)
505-511:⚠️ Potential issue | 🟡 MinorMissing validation:
output_dirandoutput_filecan beNonefor their respective modes.When
parse_types == "compile_reasoning",output_diris passed tocompile_reasoningwhich callsos.makedirs(output_dir, ...)— this crashes withTypeErrorifNone. Similarly,output_filecan beNoneforsteps_extraction. Add upfront validation like the existing check forreasoning_jsonl.Proposed fix
def main(jsonl_file, input_file, output_file, parse_types, reasoning_jsonl=None, output_dir=None): if parse_types == "steps_extraction": + if not output_file: + raise ValueError("Please specify an output file by specifying --output") prepare_data_for_reasoning_traces(jsonl_file, input_file, output_file) elif parse_types == "compile_reasoning": if not reasoning_jsonl: raise ValueError("Please specify a reasoning jsonl file by specifying --reasoning_jsonl") + if not output_dir: + raise ValueError("Please specify an output directory by specifying --output_dir") compile_reasoning(jsonl_file, input_file, output_dir, reasoning_jsonl)
```python
skip = True
if skip and os.path.exists(output_path):
    os.remove(output_path)
processed_indices = load_processed_indices(output_path)
```
Hardcoded skip = True deletes the output file on every run, completely breaking resume logic.
Lines 132-134 unconditionally delete the output file before load_processed_indices is called on line 135. This means the resume logic (lines 57-71, 144-148) will never find any previously processed rows — the file is always gone. All the resume infrastructure becomes dead code.
This looks like a debug artifact that was accidentally left in.
```diff
-    skip = True
-    if skip and os.path.exists(output_path):
-        os.remove(output_path)
```
processed_indices = load_processed_indices(output_path)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
    processed_indices = load_processed_indices(output_path)
```
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py` around
lines 132 - 135, The code currently hardcodes skip = True which causes deletion
of output_path every run and breaks resume logic; change this so skip is not
hardcoded—make it a CLI flag, environment variable, or default False—and only
call os.remove(output_path) when skip is explicitly set by the user; ensure the
block around skip, os.path.exists(output_path), and os.remove(output_path) is
gated by the configurable skip variable and leave
load_processed_indices(output_path) unchanged so resume logic
(load_processed_indices / processed_indices) can work when skip is False.
```python
model = AutoModelForCausalLM.from_pretrained(
    args.weights_dir, dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
```
🧩 Analysis chain
🏁 Script executed:
```bash
# Find and read the batch variant file
fd "create_agent_with_tools_batch.py" --type f
```

Repository: NVIDIA-NeMo/Skills
Length of output: 130
🏁 Script executed:
```bash
# Find the non-batch variant for comparison
fd "create_agent_with_tools.py" --type f
```

Repository: NVIDIA-NeMo/Skills
Length of output: 124
🌐 Web query:
AutoModelForCausalLM.from_pretrained torch_dtype parameter transformers library
💡 Result:
AutoModelForCausalLM.from_pretrained(..., torch_dtype=...) (Hugging Face Transformers) controls the PyTorch floating-point dtype the model is instantiated/loaded into, which can significantly reduce memory usage for large LMs.
Common values:
- A specific dtype (e.g. `torch.float16`, `torch.bfloat16`, `torch.float32`) to force loading in that dtype. Example: `AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)`.[1]
- `"auto"` to automatically choose a dtype based on the checkpoint/model metadata (the docs describe it as deriving dtype automatically, commonly from the weights / config where available). Example: `AutoModelForCausalLM.from_pretrained(..., torch_dtype="auto", device_map="auto")`.[2]
Notes:
- This is only available for floating dtypes (per the model docs’ “Model Instantiation dtype” section).[1]
- It's often paired with `device_map="auto"` when loading big models to fit across available devices.[2]
Sources:
[1] Transformers “Models” / “Model Instantiation dtype” documentation (examples using torch_dtype=torch.float16 and "auto", and noting floating dtypes only). (huggingface.co)
[2] Transformers “Loading models” guide showing AutoModelForCausalLM.from_pretrained(..., torch_dtype="auto", device_map="auto"). (huggingface.co)
Citations:
- 1: https://huggingface.co/docs/transformers/v4.15.0/en/main_classes/model?utm_source=openai
- 2: https://huggingface.co/docs/transformers/v4.53.1/models?utm_source=openai
🏁 Script executed:
```bash
cat -n recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py | sed -n '150,165p'
```

Repository: NVIDIA-NeMo/Skills
Length of output: 921
🏁 Script executed:
```bash
cat -n recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py | sed -n '70,80p'
```

Repository: NVIDIA-NeMo/Skills
Length of output: 499
Incorrect parameter name: `dtype=` should be `torch_dtype=`.

`AutoModelForCausalLM.from_pretrained()` uses `torch_dtype` as the parameter name, not `dtype`. Using `dtype` will silently ignore the parameter, and the model will load in the default precision (float32) instead of the intended bfloat16/float16, causing significant memory and performance degradation.

The non-batch variant (`create_agent_with_tools.py` line 74) correctly uses `torch_dtype=torch.float16`.
Fix

```diff
 model = AutoModelForCausalLM.from_pretrained(
-    args.weights_dir, dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
+    args.weights_dir, torch_dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
 )
```

📝 Committable suggestion
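As a quick sanity check after applying the fix, the loaded parameters should report the requested precision. This is an illustrative snippet, assuming `model` is the object loaded above:

```python
# If torch_dtype was honored, this prints e.g. torch.bfloat16 rather than torch.float32.
print(next(model.parameters()).dtype)
```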
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
    model = AutoModelForCausalLM.from_pretrained(
        args.weights_dir, torch_dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
    )
```
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py` around
lines 156 - 157, The call to AutoModelForCausalLM.from_pretrained in
create_agent_with_tools_batch.py is using the wrong parameter name (dtype=) so
the intended precision is ignored; update the call to pass the dtype via
torch_dtype (e.g., torch_dtype=dtype) so the model loads in the specified
bfloat16/float16 precision, ensuring you modify the
AutoModelForCausalLM.from_pretrained(...) invocation accordingly.
```python
# Parse arguments for input JSONL path
parser = argparse.ArgumentParser(description="Evaluation Pipeline for Agent Responses")
parser.add_argument("input_jsonl", help="Path to agent_responses.jsonl containing expected_answer and agent_response")
parser.add_argument("--output_file", help="Path to output")
```
`--output_file` is not required and has no default — the script will crash at line 148.

If `--output_file` is omitted, `args.output_file` is `None`, and `pd.DataFrame.to_json(None, ...)` will raise a `ValueError`. Add `required=True` or a sensible default.
🐛 Proposed fix
```diff
-    parser.add_argument("--output_file", help="Path to output")
+    parser.add_argument("--output_file", required=True, help="Path to output JSONL")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| parser.add_argument("--output_file", help="Path to output") | |
| parser.add_argument("--output_file", required=True, help="Path to output JSONL") |
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py` at
line 12, The parser.add_argument("--output_file", help="Path to output") can
leave args.output_file as None and cause pd.DataFrame.to_json(args.output_file,
...) to raise a ValueError; update the argument declaration
(parser.add_argument) to either set required=True or provide a sensible default
(e.g., default="evaluation_results.json"), and/or add a guard before writing
that validates args.output_file is not None and raises a clear error message;
ensure references to args.output_file and the DataFrame.to_json call are updated
accordingly.
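If required=True feels too strict, a minimal sketch of the default-plus-guard alternative (the derived filename convention here is just an illustration, not the script's current behavior):
```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser(description="Evaluation Pipeline for Agent Responses")
parser.add_argument("input_jsonl", help="Path to agent_responses.jsonl")
parser.add_argument("--output_file", default=None, help="Path to output JSONL")
args = parser.parse_args()

if args.output_file is None:
    # Fall back to <input stem>.eval.jsonl next to the input instead of crashing on None.
    args.output_file = str(Path(args.input_jsonl).with_suffix(".eval.jsonl"))
```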
| m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", str(string), re.DOTALL) | ||
|
|
||
| if not m: | ||
| # Handle case with no arguments, e.g., Check_Time[] | ||
| m_no_args = re.match(r"^\s*([A-Za-z_]\w*)\s*\[\s*\]\s*$", str(string)) | ||
| if m_no_args: | ||
| tool_name = m_no_args.group(1) | ||
| raw_args = "" | ||
| else: | ||
| # If it doesn't match the syntax, return original string or raise error | ||
| # returning string allows the LLM to fail gracefully or retry | ||
| return string | ||
|
|
||
| tool_name, raw_args = m.groups() |
Unreachable branch + latent crash: m.groups() on None if the dead-code path were taken.
The first regex \[(.*)\] with re.DOTALL already matches empty brackets Tool[], making the m_no_args branch (lines 157-160) dead code. More critically, if it were reached (e.g., due to a future regex change), execution falls through to line 166 which calls m.groups() on None — an AttributeError.
Restructure to eliminate the dead path and the latent crash:
Proposed fix
m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", str(string), re.DOTALL)
if not m:
- # Handle case with no arguments, e.g., Check_Time[]
- m_no_args = re.match(r"^\s*([A-Za-z_]\w*)\s*\[\s*\]\s*$", str(string))
- if m_no_args:
- tool_name = m_no_args.group(1)
- raw_args = ""
- else:
- # If it doesn't match the syntax, return original string or raise error
- # returning string allows the LLM to fail gracefully or retry
- return string
-
- tool_name, raw_args = m.groups()
+ return string
+
+ tool_name, raw_args = m.groups()
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py` around
lines 153 - 166, The code currently uses two regex matches (m and m_no_args) and
then unconditionally calls m.groups(), which creates dead code and a latent
crash; instead, after performing the initial match m =
re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", ... , re.DOTALL), check if m is
truthy and if not immediately return the original string (or handle the error),
and only then call m.groups() to assign tool_name and raw_args; remove the
separate m_no_args branch entirely so empty brackets are handled by the first
regex and m.groups() is always called on a valid match.
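To see that the single pattern already handles the empty-argument case the dead branch was written for, a quick demonstration (the argument string is made up; tool calls follow the recipe's Tool[args] syntax):
```python
import re

pattern = re.compile(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", re.DOTALL)

for call in ("Check_Time[]", "Check_Alarm_Status[element_id=NE-42]", "not a tool call"):
    m = pattern.match(call)
    if not m:
        print(f"unparsed, returned as-is: {call!r}")
        continue
    tool_name, raw_args = m.groups()
    print(tool_name, repr(raw_args))
# Check_Time ''
# Check_Alarm_Status 'element_id=NE-42'
# unparsed, returned as-is: 'not a tool call'
```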
| if __name__ == "__main__": | ||
| # === Configuration === | ||
| # csv_file = "data/top_10_categorized_incidents.csv" # Path to your CSV | ||
| csv_file = "data/prompt_incident.yaml" | ||
| column_name = "action_chronicle" # default; overridden after loading data | ||
| model = "openai/gpt-oss-120b" # Tokenizer model | ||
| output_csv = "outputs/work_notes_token_lengths.csv" | ||
| output_plot = "outputs/work_notes_token_hist.png" | ||
|
|
||
| # === Load Tokenizer === | ||
| tokenizer = AutoTokenizer.from_pretrained(model) | ||
| print(f"[info] Tokenizer loaded for model: {model}") | ||
|
|
||
| if csv_file.endswith((".yaml", ".yml")): | ||
| with open(csv_file, "r", encoding="utf-8") as f: | ||
| yaml_data = yaml.safe_load(f) | ||
|
|
||
| # You might want to join everything into one big string, or tokenize each value | ||
| if isinstance(yaml_data, dict): | ||
| prompt_text = yaml.dump(yaml_data) | ||
| else: | ||
| prompt_text = str(yaml_data) | ||
|
|
||
| token_count = len(tokenizer.encode(prompt_text, add_special_tokens=False)) | ||
| print(f"[info] YAML file {csv_file} contains {token_count} tokens.") | ||
| exit(0) | ||
|
|
||
| # === Load CSV === | ||
| df = pd.read_csv(csv_file, encoding="utf-8") | ||
| if column_name not in df.columns: | ||
| raise ValueError(f"Column '{column_name}' not found in CSV. Available columns: {df.columns.tolist()}") | ||
|
|
||
| # === Count Tokens === | ||
| token_lengths = [] | ||
| zero_tokens = 0 | ||
| for idx, text in tqdm(enumerate(df[column_name]), total=len(df), desc="Tokenizing work_notes"): | ||
| if pd.isna(text): # Handle NaN values | ||
| token_lengths.append(0) | ||
| zero_tokens += 1 | ||
| continue | ||
| # Count tokens for this row | ||
| token_count = len(tokenizer.encode(str(text), add_special_tokens=False)) | ||
| token_lengths.append(token_count) | ||
|
|
||
| # Count rows with 0 tokens | ||
| print(f"\n[info] Number of rows with 0 tokens: {zero_tokens}") | ||
|
|
||
| # Add results back to DataFrame | ||
| df["work_notes_token_length"] = token_lengths | ||
|
|
||
| # === Save CSV with token counts === | ||
| df.to_csv(output_csv, index=False) | ||
| print(f"[info] Token lengths saved to {output_csv}") | ||
|
|
||
| # === Summary Statistics === | ||
| print("\n=== Token Length Summary ===") | ||
| print(df["work_notes_token_length"].describe()) | ||
|
|
||
| # === Plot Histogram === | ||
| plt.figure(figsize=(10, 6)) | ||
| plt.hist(df["work_notes_token_length"], bins=50) | ||
| plt.title(f"Distribution of Token Lengths for {column_name}") | ||
| plt.xlabel("Token Length") | ||
| plt.ylabel("Frequency") | ||
| plt.tight_layout() | ||
| plt.savefig(output_plot, dpi=150) | ||
| plt.close() | ||
| print(f"[info] Histogram saved to {output_plot}") | ||
| exit() | ||
| main() |
main() is unreachable — debug code left in __main__ block.
exit() on line 275 terminates the process before main() on line 276 is ever called. The entire __main__ block (lines 207–275) appears to be leftover development code with hardcoded paths (data/prompt_incident.yaml, openai/gpt-oss-120b) and duplicated logic. It also bypasses the defensive AutoTokenizer import guard — line 217 will crash with TypeError if transformers is not installed.
Replace the block with a clean entry point:
🐛 Proposed fix
if __name__ == "__main__":
- # === Configuration ===
- # csv_file = "data/top_10_categorized_incidents.csv" # Path to your CSV
- csv_file = "data/prompt_incident.yaml"
- column_name = "action_chronicle" # default; overridden after loading data
- model = "openai/gpt-oss-120b" # Tokenizer model
- output_csv = "outputs/work_notes_token_lengths.csv"
- output_plot = "outputs/work_notes_token_hist.png"
-
- # === Load Tokenizer ===
- tokenizer = AutoTokenizer.from_pretrained(model)
- print(f"[info] Tokenizer loaded for model: {model}")
-
- if csv_file.endswith((".yaml", ".yml")):
- with open(csv_file, "r", encoding="utf-8") as f:
- yaml_data = yaml.safe_load(f)
-
- # You might want to join everything into one big string, or tokenize each value
- if isinstance(yaml_data, dict):
- prompt_text = yaml.dump(yaml_data)
- else:
- prompt_text = str(yaml_data)
-
- token_count = len(tokenizer.encode(prompt_text, add_special_tokens=False))
- print(f"[info] YAML file {csv_file} contains {token_count} tokens.")
- exit(0)
-
- # === Load CSV ===
- df = pd.read_csv(csv_file, encoding="utf-8")
- if column_name not in df.columns:
- raise ValueError(f"Column '{column_name}' not found in CSV. Available columns: {df.columns.tolist()}")
-
- # === Count Tokens ===
- token_lengths = []
- zero_tokens = 0
- for idx, text in tqdm(enumerate(df[column_name]), total=len(df), desc="Tokenizing work_notes"):
- if pd.isna(text): # Handle NaN values
- token_lengths.append(0)
- zero_tokens += 1
- continue
- # Count tokens for this row
- token_count = len(tokenizer.encode(str(text), add_special_tokens=False))
- token_lengths.append(token_count)
-
- # Count rows with 0 tokens
- print(f"\n[info] Number of rows with 0 tokens: {zero_tokens}")
-
- # Add results back to DataFrame
- df["work_notes_token_length"] = token_lengths
-
- # === Save CSV with token counts ===
- df.to_csv(output_csv, index=False)
- print(f"[info] Token lengths saved to {output_csv}")
-
- # === Summary Statistics ===
- print("\n=== Token Length Summary ===")
- print(df["work_notes_token_length"].describe())
-
- # === Plot Histogram ===
- plt.figure(figsize=(10, 6))
- plt.hist(df["work_notes_token_length"], bins=50)
- plt.title(f"Distribution of Token Lengths for {column_name}")
- plt.xlabel("Token Length")
- plt.ylabel("Frequency")
- plt.tight_layout()
- plt.savefig(output_plot, dpi=150)
- plt.close()
- print(f"[info] Histogram saved to {output_plot}")
- exit()
- main()
+ main()🧰 Tools
🪛 Ruff (0.15.0)
[warning] 237-237: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 242-242: Loop control variable idx not used within loop body
Rename unused idx to _idx
(B007)
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/utils/token_usage.py` around lines 207 -
276, The __main__ block contains leftover debug logic that hardcodes
csv_file/model, calls AutoTokenizer directly (unsafe if transformers missing),
and calls exit() before main(), making main() unreachable; remove or replace the
entire if __name__ == "__main__": block so it simply parses CLI/default args and
invokes main(), ensure any tokenizer creation uses the existing guarded
import/creation logic (e.g., reuse the module's AutoTokenizer-safe factory or
wrap AutoTokenizer.from_pretrained in the same try/except used elsewhere),
eliminate hardcoded paths and stray exit() calls, and wire csv_file,
column_name, model, output_csv, output_plot into main() parameters or argparse
so the script runs via main() correctly.
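A sketch of what the cleaned-up entry point could look like, assuming main() is refactored to accept the parsed arguments (flag names and defaults are illustrative):
```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Count tokens per row of a CSV column")
    parser.add_argument("--csv_file", required=True, help="Input CSV (or YAML prompt file)")
    parser.add_argument("--column_name", default="action_chronicle")
    parser.add_argument("--model", default="openai/gpt-oss-120b", help="Tokenizer model id")
    parser.add_argument("--output_csv", default="outputs/work_notes_token_lengths.csv")
    parser.add_argument("--output_plot", default="outputs/work_notes_token_hist.png")
    return parser

if __name__ == "__main__":
    main(build_parser().parse_args())  # assumes main() takes the parsed args
```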
| def parse_final_reasoning(generation_text): | ||
| """ | ||
| Parses the 'generation' field to extract only the final | ||
| Thought, Action, and Observation steps. | ||
| """ | ||
| # 1. Isolate the final, clean reasoning trace | ||
| # This regex looks for the block starting with "Question:" and ending with "Finish[...]" | ||
| trace_block_match = re.search( | ||
| r"Question:.*?(Finish\[.*?\])", | ||
| generation_text, | ||
| re.DOTALL, # Allows '.' to match newlines | ||
| ) | ||
|
|
||
| # Find the last occurrence of "Finish" | ||
| last_index = generation_text.rfind("Finish") | ||
|
|
||
| if last_index != -1: | ||
| finish_text = generation_text[last_index:].strip() | ||
| print(finish_text) | ||
| else: | ||
| print("No 'Finish' found!") | ||
|
|
||
| return finish_text | ||
|
|
||
| if not trace_block_match: | ||
| return "Final reasoning trace not found." | ||
|
|
||
| reasoning_trace = trace_block_match.group(0) | ||
|
|
||
| # 2. Extract each individual Thought, Action, and Observation | ||
| # This regex finds all lines starting with the keywords | ||
| step_pattern = re.compile( | ||
| r"^(Thought|Action|Observation)\s+\d+:\s*(.*)$", | ||
| re.MULTILINE, # Allows '^' to match the start of each line | ||
| ) | ||
|
|
||
| steps = step_pattern.findall(reasoning_trace) | ||
|
|
||
| # 3. Format the results for clarity | ||
| parsed_steps = [] | ||
| for step in steps: | ||
| kind = step[0] # "Thought", "Action", or "Observation" | ||
| content = step[1] # The text of the step | ||
| parsed_steps.append(f"**{kind}:** {content}") | ||
|
|
||
| return "\n".join(parsed_steps) |
Dead code: lines 50–71 are unreachable after return on line 48.
parse_final_reasoning always returns at line 48 (or crashes if last_index == -1 due to the same unbound-variable pattern). Everything from line 50 onward is dead code. Additionally, lines 44 and 46 contain debug print statements that should be removed.
Either remove the dead code or restructure the function so the intended logic is reachable.
🤖 Prompt for AI Agents
In
`@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`
around lines 26 - 71, The function parse_final_reasoning contains an early
return and debug prints that make the intended parsing logic unreachable; remove
the premature return and the print statements and restructure the function to
(1) first find the final "Finish" block using trace_block_match or last_index as
a fallback, (2) set finish_text deterministically (return a clear error string
if neither is found), (3) run the step_pattern regex against that finish_text to
extract Thought/Action/Observation entries, format them into parsed_steps, and
finally return the joined string; refer to parse_final_reasoning,
trace_block_match, last_index, finish_text, step_pattern, and parsed_steps when
making these changes.
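Putting the intended flow together, a sketch of the restructured function (same regexes as the original, with the early return and debug prints dropped):
```python
import re

def parse_final_reasoning(generation_text: str) -> str:
    """Extract only the final Thought/Action/Observation steps from a generation."""
    trace_block_match = re.search(
        r"Question:.*?(Finish\[.*?\])", generation_text, re.DOTALL
    )
    if not trace_block_match:
        return "Final reasoning trace not found."
    reasoning_trace = trace_block_match.group(0)
    step_pattern = re.compile(r"^(Thought|Action|Observation)\s+\d+:\s*(.*)$", re.MULTILINE)
    return "\n".join(
        f"**{kind}:** {content}" for kind, content in step_pattern.findall(reasoning_trace)
    )
```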
| def find_finish_action(generation_text): | ||
| # Regex to capture everything inside Finish[ ... ] | ||
| # match = re.search(r"Finish\[(.*?)\]", generation_text, re.DOTALL) | ||
|
|
||
| # finish_action = match.group(1).strip() | ||
| # return finish_action | ||
| last_index = generation_text.rfind("Finish") | ||
|
|
||
| if last_index != -1: | ||
| finish_text = generation_text[last_index:].strip() | ||
| # print(finish_text) | ||
| else: | ||
| print("No 'Finish' found!") | ||
|
|
||
| return finish_text |
UnboundLocalError when "Finish" is not found.
When generation_text doesn't contain "Finish", finish_text is never assigned, but line 88 returns it unconditionally. This will crash at runtime.
🐛 Proposed fix
def find_finish_action(generation_text):
- # Regex to capture everything inside Finish[ ... ]
- # match = re.search(r"Finish\[(.*?)\]", generation_text, re.DOTALL)
-
- # finish_action = match.group(1).strip()
- # return finish_action
last_index = generation_text.rfind("Finish")
-
if last_index != -1:
- finish_text = generation_text[last_index:].strip()
- # print(finish_text)
+ return generation_text[last_index:].strip()
else:
- print("No 'Finish' found!")
-
- return finish_text
+ return ""🤖 Prompt for AI Agents
In
`@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`
around lines 74 - 88, The function find_finish_action assigns finish_text only
when last_index != -1 but always returns finish_text, causing UnboundLocalError
when "Finish" isn't found; update find_finish_action to handle the not-found
case by returning a safe default (e.g., None or empty string) or raising a clear
exception, and ensure finish_text is defined in both branches (use last_index,
generation_text, and finish_text to locate/construct the return value or the
fallback).
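With the guard in place, both branches return a string; a quick check of the fixed helper:
```python
def find_finish_action(generation_text: str) -> str:
    last_index = generation_text.rfind("Finish")
    if last_index != -1:
        return generation_text[last_index:].strip()
    return ""

print(find_finish_action("Thought 3: resolved\nFinish[Resolved]"))  # Finish[Resolved]
print(find_finish_action("no terminal marker here"))                # prints empty string
```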
gwarmstrong
left a comment
Thanks for putting this together! A handful of things to fix to match our existing tutorial conventions.
| .codex | ||
|
|
||
| .idea | ||
| site/ |
Looks like site/ got added 5 times here — just needs the one entry.
| --- | ||
| date: 2025-09-05 | ||
| readtime: 30 | ||
| --- |
Our other tutorials all hide the table of contents. This is a pretty long tutorial, so it may make sense to keep it anyway. However, double-check that all of your heading levels are correct. For example, "Phase 1:" and "Parse and Format Steps" are currently the same heading level, which reads as disjoint, since "Phase 1" and "Phase 2" are more than likely meant to sit at the same level as each other, one level above their sub-steps.
If you decide to remove the table of contents altogether, you can add:
hide:
  - toc
|
||
| ```shell | ||
| pip install git+https://github.com/NVIDIA-NeMo/Skills.git | ||
| ns setup |
Nit: our convention is ```bash for shell commands (not ```shell). Applies to all ~16 code blocks in the tutorial.
| ``` | ||
|
|
||
| When prompted during `ns setup`, define a working folder as `/workspace` (e.g. mount `/your/project/dir:/workspace`). This folder will be used in all subsequent commands. For more details, see the [Nemo-Skills configs](https://nvidia-nemo.github.io/Skills/basics/cluster-configs/) documentation. | ||
|
|
Links to our own docs should use relative paths so they work with mkdocs serve and don't break if the site URL changes. Applies in a few places:
- [Nemo-Skills configs](../../basics/cluster-configs.md)
- [generation](../../pipelines/generation.md)
- [Nemo-Skills training](../../pipelines/training.md)
- [start-server pipeline](../../pipelines/start-server.md)
- [Python API](../../pipelines/index.md#python-interface)
| # Teaching a Model to Reason Over Telecom Network Incidents | ||
|
|
||
| This tutorial walks you through a complete pipeline for fine-tuning a reasoning model that can autonomously diagnose and resolve telecom network incidents. Using [Nemo-Skills](https://nvidia-nemo.github.io/Skills/) together with a [NoC Reasoning Agent](https://github.com/aiden200/NoC_Reasoning_Agent), we will take [Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) and teach it to perform step-by-step root-cause analysis with tool-calling — the same workflow a human NOC (Network Operations Center) engineer follows today. | ||
|
|
External links (HuggingFace, GitHub repos, etc.) should open in a new tab. Just append {target="_blank"} — e.g.:
[Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B){target="_blank"}
Applies to all external links: Qwen3-32B, gpt-oss-120b, NeMo-RL, NoC Reasoning Agent repo, etc.
| --log_dir=/workspace/prepare-sft-data-indicence \ | ||
| --expname=prep-sft-data-inci \ | ||
| --run_after=solution-generation \ | ||
| --cluster=local \ |
The ns run_cmd blocks for preparing SFT data and test data both use --run_after=solution-generation, but there's no prior command with --expname=solution-generation. The ns generate commands use --expname=gpt-oss-sdg-with-python.
|
|
||
| ```shell | ||
| ns run_cmd \ | ||
| --log_dir=/workspace/prepare-sft-data-indicence \ |
Small typo — "indicence", I assume it should be "incidence" or "incidents". Shows up in both the SFT data and test data --log_dir values (lines 231, 296).
| @@ -0,0 +1,214 @@ | |||
| # Byte-compiled / optimized / DLL files | |||
This is 214 lines and looks like the full GitHub Python .gitignore template — includes entries for Django, Flask, Scrapy, SageMath, Celery, Marimo, etc. The root repo already has a comprehensive .gitignore.
Was this intentionally included for the recipe, or did it get auto-generated? I would think this could get entirely removed.
|
|
||
| ```shell | ||
| ns run_cmd --expname=prepare-noc --log_dir=/workspace/prepare-noc --cluster=local \ | ||
| 'cd /workspace && \ |
The setup downloads 16 files via wget and copies them into /workspace. But since these are all read-only (scripts, prompts, configs), do they actually need to be copied? The recipe files already live in the repo under recipes/noc-reasoning-agent/, so the tutorial commands could just reference them directly from there.
If the user has nemo-skills cloned, these files are already at /nemo_run/code/recipes/noc-reasoning-agent/ inside the container (see code packaging docs). So the ns generate commands could point to e.g. ++prompt_config=/nemo_run/code/recipes/noc-reasoning-agent/prompts/formatting_prompt.yaml without any setup step at all.
For the case where someone wants to customize a prompt, the prompt format docs already explain how ++prompt_config= works with custom YAML files — maybe just link to that instead of having the user copy everything upfront.
This would eliminate the entire ns run_cmd setup block and make the tutorial much easier to follow.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 286-365: Update the "Evaluation" section to include concrete
expected metrics and any dataset-specific notes: add a short table or bullet
stating observed baseline vs fine-tuned close-code accuracy (e.g., "baseline X%
vs fine-tuned Y% on outputs/final_agent_input.jsonl"), mention which evaluation
script is used (problem_code_evaluation.py) and which output files correspond to
each run (outputs/agent_responses.jsonl and
outputs/baseline_agent_responses.jsonl), and note any special preparation or
non-standard inference flags required (e.g., tokenizer, filters.trim_solutions,
or custom weights_dir) so readers can reproduce and validate the reported
numbers.
| ## Evaluation | ||
|
|
||
| To evaluate the fine-tuned model, we use a ReAct (Reasoning + Acting) agent that calls NOC tools at each step, then compare its incident resolution accuracy against the baseline model. | ||
|
|
||
| ### Prepare Test Data | ||
|
|
||
| Prepare the test set in the same format as training: | ||
|
|
||
| ```shell | ||
| ns run_cmd \ | ||
| --log_dir=/workspace/prepare-test-data-indicence \ | ||
| --expname=prep-test-data-inci \ | ||
| --run_after=solution-generation \ | ||
| --cluster=local \ | ||
| 'python -m nemo_skills.training.prepare_data \ | ||
| --config-path /workspace/data \ | ||
| --config-name noc_reasoning_sft \ | ||
| input_files=/workspace/outputs/testing_data_split.jsonl \ | ||
| output_path=/workspace/outputs/sft-test-incidence.jsonl \ | ||
| prompt_config=/workspace/data/prompts/prompt_incident.yaml \ | ||
| tokenizer=Qwen/Qwen3-32B \ | ||
| filters.remove_contaminated=false \ | ||
| add_unlabeled=true \ | ||
| filters.trim_solutions=false' | ||
| ``` | ||
|
|
||
| ### Build Agent Input | ||
|
|
||
| Create the ReAct agent input file containing incident prompts with tool response data: | ||
|
|
||
| ```shell | ||
| python src/ns_pipelines/prepare_react_agent.py \ | ||
| outputs/testing_data_split.jsonl \ | ||
| outputs/sft-test-incidence.jsonl \ | ||
| --output outputs/final_agent_input.jsonl \ | ||
| --prompt_config data/prompts/prompt_incident.yaml | ||
| ``` | ||
|
|
||
| ### Install Agent Dependencies | ||
|
|
||
| Install the additional libraries needed for the ReAct agent: | ||
|
|
||
| ```shell | ||
| pip install --upgrade langgraph langchain langchain-huggingface transformers torch accelerate pandas | ||
| ``` | ||
|
|
||
| ### Run the Fine-Tuned Agent | ||
|
|
||
| ```shell | ||
| PYTHONPATH=$PWD python src/create_agent_with_tools_batch.py \ | ||
| --input outputs/final_agent_input.jsonl \ | ||
| --output outputs/agent_responses.jsonl \ | ||
| --weights_dir training/qwen3-32b-improved-hf | ||
| ``` | ||
|
|
||
| ### Run the Baseline Agent | ||
|
|
||
| For comparison, run the same evaluation using the original (non-fine-tuned) model: | ||
|
|
||
| ```shell | ||
| PYTHONPATH=$PWD python src/create_agent_with_tools_batch.py \ | ||
| --input outputs/final_agent_input.jsonl \ | ||
| --output outputs/baseline_agent_responses.jsonl \ | ||
| --weights_dir Qwen/Qwen3-32B | ||
| ``` | ||
|
|
||
| ### Compare Results | ||
|
|
||
| Evaluate both models by computing close-code accuracy (how often the model selects the correct resolution method): | ||
|
|
||
| ```shell | ||
| # Fine-tuned model | ||
| python src/evaluation/problem_code_evaluation.py outputs/agent_responses.jsonl | ||
|
|
||
| # Baseline model | ||
| python src/evaluation/problem_code_evaluation.py outputs/baseline_agent_responses.jsonl | ||
| ``` | ||
|
|
||
| The evaluation script matches the model's predicted close code against the expected answer using synonym-aware matching (e.g. "Resolved" and "Issue Corrected" are both recognized). You should see a meaningful improvement in the fine-tuned model's accuracy compared to the baseline. | ||
|
|
Add concrete expected results for the evaluation section.
The guideline requires expected results for tested models. Please include target/observed metrics (e.g., baseline close‑code accuracy vs. fine‑tuned accuracy on the provided dataset) so readers can validate runs. As per coding guidelines, “When adding new benchmarks, add documentation with example commands for how to run evaluation, expected results for tested models, and any dataset-specific details like special preparation arguments or non-standard inference arguments.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 286 - 365, Update
the "Evaluation" section to include concrete expected metrics and any
dataset-specific notes: add a short table or bullet stating observed baseline vs
fine-tuned close-code accuracy (e.g., "baseline X% vs fine-tuned Y% on
outputs/final_agent_input.jsonl"), mention which evaluation script is used
(problem_code_evaluation.py) and which output files correspond to each run
(outputs/agent_responses.jsonl and outputs/baseline_agent_responses.jsonl), and
note any special preparation or non-standard inference flags required (e.g.,
tokenizer, filters.trim_solutions, or custom weights_dir) so readers can
reproduce and validate the reported numbers.
…ripts Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
91a3fe6 to d11100c (Compare)
Actionable comments posted: 5
🧹 Nitpick comments (4)
recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py (4)
82-82: --limit should use type=int in argparse.
--limit is declared without a type, so it arrives as a string. Line 150 manually casts with int(args.limit), which will raise an unhelpful ValueError on non-numeric input. Using type=int in argparse gives the user a proper error message at parse time.
Proposed fix
- parser.add_argument("--limit", required=False, help="Max number of rows to export") + parser.add_argument("--limit", required=False, type=int, help="Max number of rows to export")Then simplify the usage:
- if args.limit is not None and int(args.limit) > 0:
-     df = df.head(int(args.limit))
+ if args.limit is not None and args.limit > 0:
+     df = df.head(args.limit)
Also applies to: 149-151
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` at line 82, Change the argparse declaration for --limit to accept integers by adding type=int (and optionally default=None) so argparse validates numeric input; then remove the manual cast int(args.limit) and use args.limit directly where used (e.g., in the code that slices or limits rows), updating references in this module (create_input_jsonl_from_incidents.py) to use the parsed integer value instead of re-casting a string.
112-118: Broad except Exception on datetime parsing masks real errors.
While datetime parsing on messy data can legitimately fail, catching all Exception types is too broad (e.g., it would hide MemoryError or KeyboardInterrupt). Consider catching (ValueError, TypeError) specifically, and logging the exception type. Ruff also flags this (BLE001).
Proposed fix
Exceptiontypes is too broad (e.g., it would hideMemoryErrororKeyboardInterrupt). Consider catching(ValueError, TypeError)specifically, and logging the exception type. Ruff also flags this (BLE001).Proposed fix
- except Exception as e: - print(f"Failed to parse datetime columns {start_col}, {end_col}: {e}") + except (ValueError, TypeError, pd.errors.ParserError) as e: + print(f"WARNING: Failed to parse datetime columns {start_col}, {end_col}: {e}", file=sys.stderr)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` around lines 112 - 118, The try/except around the datetime conversion is too broad and should not catch all Exception types; update the block that converts df[start_col] and df[end_col] and computes df["time_to_resolve"] (in create_input_jsonl_from_incidents.py) to catch only ValueError and TypeError, and ensure the logged message includes the exception type and message (instead of a bare except Exception). Keep the fallback that sets df["time_to_resolve"] = 0 on those parse errors and preserve the existing log text/context while adding the exception type information.
69-75: Missing type hints on function parameters.Per coding guidelines, use type hints for simple types. Both
dfandnumhave clear types.Proposed fix
-def extract_examples_by_problem_code(df, num): +def extract_examples_by_problem_code(df: pd.DataFrame, num: int) -> pd.DataFrame:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` around lines 69 - 75, The function extract_examples_by_problem_code should declare types: change its signature to accept df: pandas.DataFrame (pd.DataFrame) and num: int and return pd.DataFrame; update the signature of extract_examples_by_problem_code to include these type hints and add the return annotation -> pd.DataFrame so linters and type checkers know the expected types (ensure pd is imported from pandas where the file already uses it).
9-14:try/except ImportErrorfor sibling module resolution is fragile.This pattern silently falls back to
src.utils.*when a relative import fails, coupling the script to a specific directory layout. Consider documenting the expectedPYTHONPATHsetup or using a single import style.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` around lines 9 - 14, The try/except ImportError fall-back between relative and absolute imports is fragile; replace the dual import block with a single, explicit import style (e.g., use absolute imports or a consistent relative import) for get_reasoning_process_for_fault_category, INCIDENT_ID_COLUMN, and REQUIRED_COLUMNS so the module resolution is deterministic, and add a short comment documenting the expected PYTHONPATH/packaging if you choose absolute imports or update to a proper package-relative import if this file is intended to be part of the package.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 96-119: The tutorial currently runs filter_rows.py with the same
path for --input_csv and --output_csv (data/filtered_file.csv) for filter_type
values soft_solve and problem_codes, which risks corrupting the input if the
script fails mid-write; update the example commands to use distinct intermediate
output filenames (e.g., data/filtered_soft_solve.csv,
data/filtered_problem_codes.csv) or explicitly document the risk and recommend
using temporary files and atomic moves, referencing the filter_rows.py
invocation and the --input_csv/--output_csv parameters so readers can spot and
change the commands.
- Around line 233-234: Fix the typo in the log directory flags used in the
preparation commands: replace occurrences of
"--log_dir=/workspace/prepare-sft-data-indicence" and
"--log_dir=/workspace/prepare-test-data-indicence" with
"--log_dir=/workspace/prepare-sft-data-incidence" and
"--log_dir=/workspace/prepare-test-data-incidence" respectively so both commands
use "incidence" instead of "indicence".
- Around line 232-246: The flag --run_after=solution-generation references a
non-existent experiment and should be removed or replaced with a valid
experiment name; update the ns run_cmd invocation that contains
--run_after=solution-generation to either delete the --run_after argument or set
it to an existing experiment such as --run_after=gpt-oss-sdg-with-python or
--run_after=prepare-noc so the pipeline will correctly sequence (look for the ns
run_cmd block and the --run_after flag in the noc-reasoning-agent tutorial).
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Around line 59-66: The broad try/except in coerce_to_str is unnecessary and
hides real errors; remove the try/except and directly return str(value) after
the pd.isna check so any genuine exceptions (e.g., a broken __str__) will
propagate; update the function coerce_to_str to simply check pd.isna(value) then
return str(value) without catching Exception.
- Around line 142-147: Remove the stray debug print that dumps
required_incidents to stdout: in the block that reads args.required_incidents
(the code that opens args.required_incidents, populates required_incidents, and
then filters df by id_col), delete the print(required_incidents) call (or
replace it with a debug-level logger call if you need it for troubleshooting) so
large lists are not printed to stdout before filtering.
---
Duplicate comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 354-366: Update the "Compare Results" section to include concrete
expected accuracy numbers and dataset details: mention approximate baseline vs
fine-tuned close-code accuracy (e.g., "Baseline ~X% vs Fine-tuned ~Y% on the
sample dataset"), note any synonym matching behavior of
src/evaluation/problem_code_evaluation.py, and provide any special dataset prep
or inference args needed to reproduce the numbers; reference the example
commands that run src/evaluation/problem_code_evaluation.py on
outputs/agent_responses.jsonl and outputs/baseline_agent_responses.jsonl so
readers can validate their runs against the stated expected metrics.
---
Nitpick comments:
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Line 82: Change the argparse declaration for --limit to accept integers by
adding type=int (and optionally default=None) so argparse validates numeric
input; then remove the manual cast int(args.limit) and use args.limit directly
where used (e.g., in the code that slices or limits rows), updating references
in this module (create_input_jsonl_from_incidents.py) to use the parsed integer
value instead of re-casting a string.
- Around line 112-118: The try/except around the datetime conversion is too
broad and should not catch all Exception types; update the block that converts
df[start_col] and df[end_col] and computes df["time_to_resolve"] (in
create_input_jsonl_from_incidents.py) to catch only ValueError and TypeError,
and ensure the logged message includes the exception type and message (instead
of a bare except Exception). Keep the fallback that sets df["time_to_resolve"] =
0 on those parse errors and preserve the existing log text/context while adding
the exception type information.
- Around line 69-75: The function extract_examples_by_problem_code should
declare types: change its signature to accept df: pandas.DataFrame
(pd.DataFrame) and num: int and return pd.DataFrame; update the signature of
extract_examples_by_problem_code to include these type hints and add the return
annotation -> pd.DataFrame so linters and type checkers know the expected types
(ensure pd is imported from pandas where the file already uses it).
- Around line 9-14: The try/except ImportError fall-back between relative and
absolute imports is fragile; replace the dual import block with a single,
explicit import style (e.g., use absolute imports or a consistent relative
import) for get_reasoning_process_for_fault_category, INCIDENT_ID_COLUMN, and
REQUIRED_COLUMNS so the module resolution is deterministic, and add a short
comment documenting the expected PYTHONPATH/packaging if you choose absolute
imports or update to a proper package-relative import if this file is intended
to be part of the package.
| ```shell | ||
| # Remove auto-recovered incidents | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/categorized_incidents.csv \ | ||
| --output_csv data/filtered_file.csv \ | ||
| --filter_type auto | ||
|
|
||
| # Keep only remotely-solvable incidents | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/filtered_file.csv \ | ||
| --output_csv data/filtered_file.csv \ | ||
| --filter_type soft_solve | ||
|
|
||
| # Keep top 16 fault categories | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/filtered_file.csv \ | ||
| --output_csv data/filtered_file.csv \ | ||
| --filter_type problem_codes | ||
|
|
||
| # Keep top 10 resolution methods | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/filtered_file.csv \ | ||
| --output_csv data/finalized_dataset.csv \ | ||
| --filter_type close_codes |
In-place CSV overwrite may lose data on failure.
Lines 104–107 and 110–113 use the same file (data/filtered_file.csv) as both --input_csv and --output_csv. If the script fails mid-write, the input data is lost. Consider using distinct intermediate filenames or documenting this risk.
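If keeping a single working filename is preferred, a temp-file-plus-atomic-rename sketch avoids the mid-write corruption (the helper name is made up for illustration):
```python
import os
import tempfile
import pandas as pd

def write_csv_atomically(df: pd.DataFrame, output_csv: str) -> None:
    """Write to a temp file in the target directory, then atomically swap it in."""
    directory = os.path.dirname(output_csv) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as f:
            df.to_csv(f, index=False)
        os.replace(tmp_path, output_csv)  # atomic, so input_csv == output_csv is safe
    except BaseException:
        os.remove(tmp_path)
        raise
```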
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 96 - 119, The
tutorial currently runs filter_rows.py with the same path for --input_csv and
--output_csv (data/filtered_file.csv) for filter_type values soft_solve and
problem_codes, which risks corrupting the input if the script fails mid-write;
update the example commands to use distinct intermediate output filenames (e.g.,
data/filtered_soft_solve.csv, data/filtered_problem_codes.csv) or explicitly
document the risk and recommend using temporary files and atomic moves,
referencing the filter_rows.py invocation and the --input_csv/--output_csv
parameters so readers can spot and change the commands.
| ns run_cmd \ | ||
| --log_dir=/workspace/prepare-sft-data-indicence \ | ||
| --expname=prep-sft-data-inci \ | ||
| --run_after=solution-generation \ | ||
| --cluster=local \ | ||
| 'python -m nemo_skills.training.prepare_data \ | ||
| --config-path /workspace/data \ | ||
| --config-name noc_reasoning_sft \ | ||
| input_files=/workspace/outputs/training_data_split.jsonl \ | ||
| output_path=/workspace/outputs/sft-data-incidence.jsonl \ | ||
| prompt_config=/workspace/data/prompts/prompt_incident.yaml \ | ||
| tokenizer=Qwen/Qwen3-32B \ | ||
| filters.remove_contaminated=false \ | ||
| add_unlabeled=true \ | ||
| filters.trim_solutions=false' |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's search for any references to "solution-generation"
echo "=== Searching for 'solution-generation' references ==="
rg "solution-generation" --type=md --type=yaml --type=py --type=txt -nRepository: NVIDIA-NeMo/Skills
Length of output: 774
🏁 Script executed:
#!/bin/bash
# Let's examine the file to see the context around the specified lines
echo "=== Content around lines 232-246 (SFT data prep) ==="
sed -n '225,250p' docs/tutorials/posts/noc-reasoning-agent.md
echo ""
echo "=== Content around lines 296-311 (test data prep) ==="
sed -n '290,315p' docs/tutorials/posts/noc-reasoning-agent.md
Repository: NVIDIA-NeMo/Skills
Length of output: 2205
🏁 Script executed:
#!/bin/bash
# Let's search for all --run_after references and experiment definitions
echo "=== All --run_after references in the file ==="
rg "run_after" docs/tutorials/posts/noc-reasoning-agent.md -n
echo ""
echo "=== All --expname references to understand defined experiments ==="
rg "expname" docs/tutorials/posts/noc-reasoning-agent.md -n -A 1 -B 1Repository: NVIDIA-NeMo/Skills
Length of output: 1003
Remove or update --run_after=solution-generation to reference a valid experiment.
The --run_after=solution-generation flag at lines 235 and 300 references an experiment that is never defined in this tutorial. The experiments actually defined are: prepare-noc, gpt-oss-sdg-with-python, prep-sft-data-inci, and training. Remove this flag or update it to reference a valid prior step (e.g., --run_after=gpt-oss-sdg-with-python or --run_after=prepare-noc), otherwise the workflow will fail or behave unexpectedly when users follow the steps sequentially.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 232 - 246, The flag
--run_after=solution-generation references a non-existent experiment and should
be removed or replaced with a valid experiment name; update the ns run_cmd
invocation that contains --run_after=solution-generation to either delete the
--run_after argument or set it to an existing experiment such as
--run_after=gpt-oss-sdg-with-python or --run_after=prepare-noc so the pipeline
will correctly sequence (look for the ns run_cmd block and the --run_after flag
in the noc-reasoning-agent tutorial).
| --log_dir=/workspace/prepare-sft-data-indicence \ | ||
| --expname=prep-sft-data-inci \ |
Typo: "indicence" → "incidence" in log directory paths.
Both the training and test data preparation commands use indicence instead of incidence:
- Line 233:
--log_dir=/workspace/prepare-sft-data-indicence - Line 298:
--log_dir=/workspace/prepare-test-data-indicence
While these are just log directory names, users following the tutorial will create oddly named directories.
Proposed fix
- --log_dir=/workspace/prepare-sft-data-indicence \
+ --log_dir=/workspace/prepare-sft-data-incidence \
- --log_dir=/workspace/prepare-test-data-indicence \
+ --log_dir=/workspace/prepare-test-data-incidence \
Also applies to: 298-299
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 233 - 234, Fix the
typo in the log directory flags used in the preparation commands: replace
occurrences of "--log_dir=/workspace/prepare-sft-data-indicence" and
"--log_dir=/workspace/prepare-test-data-indicence" with
"--log_dir=/workspace/prepare-sft-data-incidence" and
"--log_dir=/workspace/prepare-test-data-incidence" respectively so both commands
use "incidence" instead of "indicence".
| def coerce_to_str(value) -> str: | ||
| if pd.isna(value): | ||
| return "" | ||
| # Ensure any non-string is converted to string safely | ||
| try: | ||
| return str(value) | ||
| except Exception: | ||
| return "" |
🛠️ Refactor suggestion | 🟠 Major
Broad except Exception on str() is unnecessary.
str() will not raise for any standard Python type. This silently swallows real bugs (e.g., a custom __str__ that raises). Per the coding guideline: "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving." Ruff also flags this (BLE001).
Proposed fix
def coerce_to_str(value) -> str:
if pd.isna(value):
return ""
- # Ensure any non-string is converted to string safely
- try:
- return str(value)
- except Exception:
- return ""
+ return str(value)
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 65-65: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`
around lines 59 - 66, The broad try/except in coerce_to_str is unnecessary and
hides real errors; remove the try/except and directly return str(value) after
the pd.isna check so any genuine exceptions (e.g., a broken __str__) will
propagate; update the function coerce_to_str to simply check pd.isna(value) then
return str(value) without catching Exception.
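A quick check of the simplified helper on the cases that matter:
```python
import pandas as pd

def coerce_to_str(value) -> str:
    if pd.isna(value):
        return ""
    return str(value)

print(repr(coerce_to_str(float("nan"))))  # ''
print(repr(coerce_to_str(None)))          # ''
print(repr(coerce_to_str(3.5)))           # '3.5'
```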
| if args.required_incidents: | ||
| with open(args.required_incidents, "r") as f: | ||
| required_incidents = f.read().splitlines() | ||
| print(required_incidents) | ||
| id_col = INCIDENT_ID_COLUMN if INCIDENT_ID_COLUMN in df.columns else "number" | ||
| df = df[df[id_col].astype(str).isin(required_incidents)] |
Debug print(required_incidents) left in — will dump a potentially large list to stdout.
Line 145 prints the entire contents of the required incidents file to stdout. This looks like a leftover debug statement and will pollute output for large files.
Proposed fix
if args.required_incidents:
with open(args.required_incidents, "r") as f:
required_incidents = f.read().splitlines()
- print(required_incidents)
id_col = INCIDENT_ID_COLUMN if INCIDENT_ID_COLUMN in df.columns else "number"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`
around lines 142 - 147, Remove the stray debug print that dumps
required_incidents to stdout: in the block that reads args.required_incidents
(the code that opens args.required_incidents, populates required_incidents, and
then filters df by id_col), delete the print(required_incidents) call (or
replace it with a debug-level logger call if you need it for troubleshooting) so
large lists are not printed to stdout before filtering.
- Move __init__.py creation and PYTHONPATH export into a dedicated "Configure the environment" subsection in the tutorial - Remove per-command PYTHONPATH=$PWD prefixes (now set once at top) - Add all 40 synthetic fault_category values to ALLOWED_PROBLEM_CODES in filter_rows.py and create_input_jsonl_from_incidents.py - Add matching reasoning process mappings in reasoning_processes.py Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Python 3.3+ namespace packages make __init__.py unnecessary. The try/except fallback in create_input_jsonl handles relative imports gracefully. Only PYTHONPATH is needed for all script imports. Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Actionable comments posted: 5
🧹 Nitpick comments (1)
recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py (1)
6-78: ALLOWED_PROBLEM_CODES is duplicated across files — extract to a shared module to eliminate drift risk.
The exact combined list of legacy and synthetic problem codes is maintained independently in both filter_rows.py and create_input_jsonl_from_incidents.py (lines 17–89). The latter file explicitly notes "Keep in sync with filter_rows.py" (line 16), indicating manual synchronization is expected. Define once in a shared constants module (e.g., recipes/noc-reasoning-agent/scripts/constants.py) and import from both locations to eliminate the drift risk.
🤖 Prompt for AI Agents
filter_rows.pyandcreate_input_jsonl_from_incidents.py(lines 17–89). The latter file explicitly notes "Keep in sync with filter_rows.py" (line 16), indicating manual synchronization is expected. Define once in a shared constants module (e.g.,recipes/noc-reasoning-agent/scripts/constants.py) and import from both locations to eliminate the drift risk.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py` around lines 6 - 78, The combined allowed-problem-codes list is duplicated (ALLOWED_PROBLEM_CODES, ALLOWED_PROBLEM_CODES_LEGACY, ALLOWED_PROBLEM_CODES_SYNTHETIC); extract these constants into a single shared module (e.g., a new constants.py) and export ALLOWED_PROBLEM_CODES_LEGACY, ALLOWED_PROBLEM_CODES_SYNTHETIC and ALLOWED_PROBLEM_CODES from there, then replace the local lists in both filter_rows.py and create_input_jsonl_from_incidents.py with imports from the new module so both files reference the single source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py`:
- Around line 106-111: Replace the broad except Exception handlers in the
CSV-reading blocks with except UnicodeDecodeError to only fallback to a
different encoding on decoding errors; update the try/except in filter_auto (and
the three other filter functions referenced) so they attempt
pd.read_csv(input_csv, encoding="latin1") and on UnicodeDecodeError retry
pd.read_csv(input_csv), letting FileNotFoundError/PermissionError and other
unexpected exceptions propagate. Ensure you modify the except clause in the
functions named filter_auto and the three other filter_… functions shown in the
diff.
- Around line 193-202: The current dispatch on args.filter_type silently ignores
unsupported values; add an else branch after the existing if/elif chain that
raises a ValueError (or argparse.ArgumentTypeError) with a clear message
including the invalid args.filter_type and the allowed options ("auto",
"soft_solve", "problem_codes", "close_codes"); update the code around the
filter_auto, filter_soft_solve, filter_problem_codes, and filter_close_codes
dispatch so unsupported values immediately raise this error instead of exiting
silently.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Around line 123-125: The conditional currently uses truthiness for
args.examples_by_problem_code which treats 0 as falsy and skips extraction;
update the check in the block that calls extract_examples_by_problem_code so it
tests explicitly for None (e.g., if args.examples_by_problem_code is not None)
to allow zero to be a valid value, then call
extract_examples_by_problem_code(df, args.examples_by_problem_code) and print
the extracted count into df as before.
In `@recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py`:
- Around line 200-251: Replace permissive .get() lookups with direct key access
for entries that must exist: change uses of
PROBLEM_CODE_REASONING_PROCESS.get("...","") inside DEFAULT_REASONING_PROCESS
and the SYNTHETIC_FAULT_CATEGORY_REASONING mapping to
PROBLEM_CODE_REASONING_PROCESS["..."] so typos raise KeyError and fail fast;
update every mapping entry (e.g., "power_ac_failure_recovery",
"ran_cell_site_down_recovery", "compute_vm_failure_recovery",
"transport_routing_flap_resolution", "signaling_routing_failure_recovery", etc.)
and DEFAULT_REASONING_PROCESS to use bracket indexing against
PROBLEM_CODE_REASONING_PROCESS.
- Around line 165-197: Several legacy SOP entries are empty (e.g., "VM is in not
ready state", "Link Down", "N2 Link Down" and others) causing
SYNTHETIC_FAULT_CATEGORY_REASONING to produce blank
problem_code_reasoning_process; update the mapping logic so that any synthetic
category that currently maps to one of those empty keys falls back to
DEFAULT_REASONING_PROCESS (the "Service-off" SOP) instead of the empty string,
and/or populate those specific keys with the DEFAULT_REASONING_PROCESS content;
modify references in SYNTHETIC_FAULT_CATEGORY_REASONING and ensure lookups for
"VM is in not ready state", "Link Down", "N2 Link Down" (and the other empty
keys listed) return DEFAULT_REASONING_PROCESS when the entry is empty or
missing.
---
Duplicate comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 114-131: The docs show running src/filtering/filter_rows.py with
--input_csv and --output_csv pointing to the same path (data/filtered_file.csv),
which risks corrupting the input on write failures; update the tutorial commands
to use distinct intermediate output filenames (e.g.,
data/filtered_file_step1.csv, data/filtered_file_step2.csv) or explicitly
document the risk and recommend using a temp file and atomic rename, referencing
the script name filter_rows.py and the flags --input_csv and --output_csv so
readers know where to change the invocation.
- Around line 243-257: Fix the typo in the ns run_cmd --log_dir value (change
prepare-sft-data-indicence to prepare-sft-data-incidence) and update the
--run_after argument in the ns run_cmd invocation (remove the nonexistent
solution-generation reference or replace it with a valid prior step such as
gpt-oss-sdg-with-python); apply the same two fixes to the duplicate test-data
preparation ns run_cmd block that repeats the same --log_dir and --run_after
mistakes.
- Around line 365-378: Add concrete expected results and any dataset-specific
eval notes to the "Compare Results" section: after the two example commands that
call src/evaluation/problem_code_evaluation.py on outputs/agent_responses.jsonl
and outputs/baseline_agent_responses.jsonl, include approximate accuracy numbers
(e.g., "baseline: ~X% close-code accuracy, fine-tuned: ~Y%") and any special
evaluation flags or dataset prep used when generating those outputs (e.g.,
tokenization, synonym mappings, non-standard inference args). Reference the
script name problem_code_evaluation.py and the output files
outputs/agent_responses.jsonl and outputs/baseline_agent_responses.jsonl so
readers can validate their runs against the documented expectations.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Around line 16-78: ALLOWED_PROBLEM_CODES is duplicated here and in
filter_rows.py; extract that list into a single shared constant (e.g., export
ALLOWED_PROBLEM_CODES from a new module like shared/problem_codes or
utils/problem_codes) and update both create_input_jsonl_from_incidents.py and
filter_rows.py to import the constant instead of defining it locally; ensure the
new module exports the same name (ALLOWED_PROBLEM_CODES) and update any
references to the variable to use the imported symbol to keep behavior
unchanged.
- Around line 81-88: The coerce_to_str function currently wraps str(value) in a
try/except which silently swallows errors; remove the unnecessary try/except in
coerce_to_str so the function simply returns "" when pd.isna(value) is True and
otherwise returns str(value), allowing any unexpected exceptions from str() to
propagate; update the function body for coerce_to_str accordingly and keep the
function signature unchanged.
- Around line 164-169: Remove the leftover debug print by deleting the
print(required_incidents) call in the args.required_incidents handling block;
locate the block that reads required_incidents (variable required_incidents) and
filters df using id_col (uses INCIDENT_ID_COLUMN or "number") and simply keep
the file read + df = df[df[id_col].astype(str).isin(required_incidents)] logic
without printing the list to stdout.
---
Nitpick comments:
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py`:
- Around line 6-78: The combined allowed-problem-codes list is duplicated
(ALLOWED_PROBLEM_CODES, ALLOWED_PROBLEM_CODES_LEGACY,
ALLOWED_PROBLEM_CODES_SYNTHETIC); extract these constants into a single shared
module (e.g., a new constants.py) and export ALLOWED_PROBLEM_CODES_LEGACY,
ALLOWED_PROBLEM_CODES_SYNTHETIC and ALLOWED_PROBLEM_CODES from there, then
replace the local lists in both filter_rows.py and
create_input_jsonl_from_incidents.py with imports from the new module so both
files reference the single source of truth.
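A minimal sketch of the shared module from the suggestion above (file name from that suggestion; only a couple of codes shown, the rest would follow the same pattern):
```python
# recipes/noc-reasoning-agent/scripts/constants.py (proposed)
ALLOWED_PROBLEM_CODES_LEGACY = [
    "Site Not Scrolling",
    "Sleepy Cell",
    # ... remaining legacy fault categories
]
ALLOWED_PROBLEM_CODES_SYNTHETIC = [
    "power_ac_failure_recovery",
    "ran_cell_site_down_recovery",
    # ... remaining synthetic fault categories
]
ALLOWED_PROBLEM_CODES = ALLOWED_PROBLEM_CODES_LEGACY + ALLOWED_PROBLEM_CODES_SYNTHETIC
```
Both scripts would then just do from constants import ALLOWED_PROBLEM_CODES (the exact import path depends on how PYTHONPATH is configured in the tutorial).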
| def filter_auto(input_csv, output_csv): | ||
| """Filter incident data and save results to a new CSV.""" | ||
| try: | ||
| df = pd.read_csv(input_csv, encoding="latin1") | ||
| except Exception: | ||
| df = pd.read_csv(input_csv) |
🛠️ Refactor suggestion | 🟠 Major
Catch UnicodeDecodeError instead of bare Exception.
All four filter functions use except Exception when only an encoding mismatch is expected. This masks real errors like FileNotFoundError or PermissionError, violating the guideline to let unexpected exceptions fail with clear errors. The same pattern repeats at lines 131-134, 145-148, and 161-164.
Proposed fix (apply to all four functions)
- try:
- df = pd.read_csv(input_csv, encoding="latin1")
- except Exception:
- df = pd.read_csv(input_csv)
+ try:
+ df = pd.read_csv(input_csv, encoding="latin1")
+ except UnicodeDecodeError:
+ df = pd.read_csv(input_csv)
As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 110-110: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py` around lines
106 - 111, Replace the broad except Exception handlers in the CSV-reading blocks
with except UnicodeDecodeError to only fallback to a different encoding on
decoding errors; update the try/except in filter_auto (and the three other
filter functions referenced) so they attempt pd.read_csv(input_csv,
encoding="latin1") and on UnicodeDecodeError retry pd.read_csv(input_csv),
letting FileNotFoundError/PermissionError and other unexpected exceptions
propagate. Ensure you modify the except clause in the functions named
filter_auto and the three other filter_… functions shown in the diff.
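If the fallback is needed in all four filter functions, a small helper keeps the exception handling in one place. A minimal sketch (the helper name read_csv_with_fallback is illustrative, not existing code in the script):

```python
import pandas as pd


def read_csv_with_fallback(path, primary_encoding="latin1"):
    """Try the primary encoding; retry with pandas' default only on decode errors.

    FileNotFoundError, PermissionError, and parser errors still propagate,
    so unexpected failures surface instead of being silently retried.
    """
    try:
        return pd.read_csv(path, encoding=primary_encoding)
    except UnicodeDecodeError:
        return pd.read_csv(path)
```

Each filter function would then reduce its read to df = read_csv_with_fallback(input_csv).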
| # Run the filtering process | ||
| if args.filter_type == "auto": | ||
| filter_auto(args.input_csv, args.output_csv) | ||
| elif args.filter_type == "soft_solve": | ||
| filter_soft_solve(args.input_csv, args.output_csv) | ||
| elif args.filter_type == "problem_codes": | ||
| filter_problem_codes(args.input_csv, args.output_csv) | ||
| elif args.filter_type == "close_codes": | ||
| filter_close_codes(args.input_csv, args.output_csv) | ||
Invalid --filter_type values are silently ignored.
If a user passes an unsupported filter_type, the script exits successfully having done nothing. Add an else branch that raises an error.
Proposed fix
elif args.filter_type == "close_codes":
filter_close_codes(args.input_csv, args.output_csv)
+ else:
+ parser.error(f"Unknown filter_type: {args.filter_type!r}. Choose from: auto, soft_solve, problem_codes, close_codes")As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py` around lines
193 - 202, The current dispatch on args.filter_type silently ignores unsupported
values; add an else branch after the existing if/elif chain that raises a
ValueError (or argparse.ArgumentTypeError) with a clear message including the
invalid args.filter_type and the allowed options ("auto", "soft_solve",
"problem_codes", "close_codes"); update the code around the filter_auto,
filter_soft_solve, filter_problem_codes, and filter_close_codes dispatch so
unsupported values immediately raise this error instead of exiting silently.
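Alternatively, declaring the allowed values on the argument itself rejects bad input at parse time, before the dispatch runs at all. A sketch, assuming --filter_type is currently defined with a plain add_argument call:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--filter_type",
    required=True,
    choices=["auto", "soft_solve", "problem_codes", "close_codes"],
    help="Which filter to apply.",
)
# argparse now exits with status 2 and a clear message on any other value,
# e.g.: error: argument --filter_type: invalid choice: 'typo'
args = parser.parse_args()
```

With choices in place the else branch becomes unreachable, but keeping it costs nothing and guards against the two lists drifting apart.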
| if args.examples_by_problem_code: | ||
| df = extract_examples_by_problem_code(df, args.examples_by_problem_code) | ||
| print(f"Extracted {len(df)} examples from the dataset.") |
The check if args.examples_by_problem_code: evaluates to false when the value is 0.
If a user passes --examples_by_problem_code 0, the truthiness check skips extraction silently instead of producing an empty dataset. Use is not None to distinguish "not provided" from "zero."
Proposed fix
- if args.examples_by_problem_code:
+ if args.examples_by_problem_code is not None:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`
around lines 123 - 125, The conditional currently uses truthiness for
args.examples_by_problem_code which treats 0 as falsy and skips extraction;
update the check in the block that calls extract_examples_by_problem_code so it
tests explicitly for None (e.g., if args.examples_by_problem_code is not None)
to allow zero to be a valid value, then call
extract_examples_by_problem_code(df, args.examples_by_problem_code), assign the
result back into df, and print the extracted count as before.
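The difference only shows up for valid-but-falsy values; a small sketch demonstrating the two checks (the argument name mirrors the script, the rest is illustrative):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--examples_by_problem_code", type=int, default=None)
args = parser.parse_args(["--examples_by_problem_code", "0"])

if args.examples_by_problem_code:
    print("truthiness check fires")  # skipped: 0 is falsy
if args.examples_by_problem_code is not None:
    print("is-not-None check fires")  # runs: 0 was explicitly provided
```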
| "Site Not Scrolling": """ | ||
|
|
||
| """, | ||
| "Sleepy Cell": """ | ||
|
|
||
| """, | ||
| "VM is in not ready state": """ | ||
|
|
||
| """, | ||
| "Prach 0": """ | ||
|
|
||
| """, | ||
| "N2 Link Down": """ | ||
|
|
||
| """, | ||
| "ueconmgr pod restarted": """ | ||
|
|
||
| """, | ||
| "CSR Not Reachable": """ | ||
|
|
||
| """, | ||
| "Circuit Down": """ | ||
|
|
||
| """, | ||
| "Link Down": """ | ||
|
|
||
| """, | ||
| "GPS Sync": """ | ||
|
|
||
| """, | ||
| "MTA Alert": """ | ||
|
|
||
| """, |
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure and locate the relevant dictionaries
cd recipes/noc-reasoning-agent/scripts/utils/ 2>/dev/null || cd .
find . -name "reasoning_processes.py" -type f
Repository: NVIDIA-NeMo/Skills
Length of output: 86
🏁 Script executed:
# Read the file to find PROBLEM_CODE_REASONING_PROCESS and SYNTHETIC_FAULT_CATEGORY_REASONING
cat -n recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py | head -250
Repository: NVIDIA-NeMo/Skills
Length of output: 21411
🏁 Script executed:
# Get total line count to understand file size
wc -l recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py
Repository: NVIDIA-NeMo/Skills
Length of output: 130
🏁 Script executed:
# Continue reading the file to see remaining SYNTHETIC_FAULT_CATEGORY_REASONING mappings
tail -n +250 recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py | head -30
Repository: NVIDIA-NeMo/Skills
Length of output: 823
Populate empty legacy SOPs or map synthetic categories to a non-empty fallback.
Lines 165–197 contain 11 empty reasoning processes (only whitespace): "Site Not Scrolling", "Sleepy Cell", "VM is in not ready state", "Prach 0", "N2 Link Down", "ueconmgr pod restarted", "CSR Not Reachable", "Circuit Down", "Link Down", "GPS Sync", and "MTA Alert".
SYNTHETIC_FAULT_CATEGORY_REASONING actively maps synthetic categories to these empty entries:
- 8 compute categories → "VM is in not ready state" (empty)
- 4 transport categories → "Link Down" (empty)
- 4 signaling categories → "N2 Link Down" (empty)
This produces blank problem_code_reasoning_process values in generated training data, degrading quality. Use DEFAULT_REASONING_PROCESS (which maps to the populated "Service-off" SOP) as a fallback for these mappings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py` around
lines 165 - 197, Several legacy SOP entries are empty (e.g., "VM is in not ready
state", "Link Down", "N2 Link Down" and others) causing
SYNTHETIC_FAULT_CATEGORY_REASONING to produce blank
problem_code_reasoning_process; update the mapping logic so that any synthetic
category that currently maps to one of those empty keys falls back to
DEFAULT_REASONING_PROCESS (the "Service-off" SOP) instead of the empty string,
and/or populate those specific keys with the DEFAULT_REASONING_PROCESS content;
modify references in SYNTHETIC_FAULT_CATEGORY_REASONING and ensure lookups for
"VM is in not ready state", "Link Down", "N2 Link Down" (and the other empty
keys listed) return DEFAULT_REASONING_PROCESS when the entry is empty or
missing.
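One way to apply the fallback without editing every mapping entry is a lookup helper that substitutes the default for missing or whitespace-only SOPs. A minimal sketch, assuming the module keeps its current names (the helper itself is hypothetical):

```python
def resolve_reasoning_process(fault_category):
    """Return a non-empty reasoning process for a synthetic fault category.

    Falls back to DEFAULT_REASONING_PROCESS (the Service-off SOP) when the
    category is unmapped or its mapped SOP is empty/whitespace-only. Using
    .get() is appropriate here because unknown categories are expected input.
    """
    sop = SYNTHETIC_FAULT_CATEGORY_REASONING.get(fault_category, "")
    return sop if sop.strip() else DEFAULT_REASONING_PROCESS
```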
| # Default reasoning process when fault_category (synthetic workflow ID) is not in PROBLEM_CODE_REASONING_PROCESS. | ||
| DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS.get("Service-off", "") | ||
|
|
||
| # Synthetic data uses fault_category = workflow IDs (e.g. power_ac_failure_recovery). Map them to a reasoning process. | ||
| SYNTHETIC_FAULT_CATEGORY_REASONING = { | ||
| # Power / Environment → mapped to Service-off SOP (power-related outage workflow) | ||
| "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_battery_temperature_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_cabinet_intrusion_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| # RAN → mapped to closest legacy SOP | ||
| "ran_software_upgrade_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "ran_cell_site_down_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Node Down", ""), | ||
| "ran_interference_mitigation": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_speed_complaint_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_voice_quality_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_sector_outage_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Disabled Cells", ""), | ||
| "ran_prb_availability_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_cell_overshooting_correction": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_rru_communication_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Offline / Unreachable", ""), | ||
| "ran_dropped_calls_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_parameter_correction": PROBLEM_CODE_REASONING_PROCESS.get("Disabled Cells", ""), | ||
| "ran_antenna_tilt_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_vswr_alarm_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_handover_failure_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_backhaul_degradation_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Offline / Unreachable", ""), | ||
| "ran_cell_congestion_management": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_device_issue_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Offline / Unreachable", ""), | ||
| # Compute → mapped to VM/container legacy SOPs | ||
| "compute_vm_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_container_crash_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_orchestrator_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_image_pull_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_k8s_node_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_storage_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_cnf_pod_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_resource_exhaustion_resolution": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| # Transport → mapped to Link Down / Circuit Down SOPs | ||
| "transport_routing_flap_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| "transport_microwave_degradation_response": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| "transport_interface_errors_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| "transport_packet_loss_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| # Signaling → mapped to N2 Link Down SOP | ||
| "signaling_routing_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), | ||
| "signaling_delay_resolution": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), | ||
| "signaling_s1_n2_recovery": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), | ||
| "signaling_sip_registration_recovery": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), |
🛠️ Refactor suggestion | 🟠 Major
Use direct dictionary access instead of .get() for keys that must exist.
Every key referenced in SYNTHETIC_FAULT_CATEGORY_REASONING (and line 201) is defined in PROBLEM_CODE_REASONING_PROCESS above. Using .get(key, "") silently returns an empty string if a key is mistyped, hiding bugs. Use PROBLEM_CODE_REASONING_PROCESS[key] to fail fast on typos.
Proposed fix (showing first few lines; apply to all)
-DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS.get("Service-off", "")
+DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS["Service-off"]
SYNTHETIC_FAULT_CATEGORY_REASONING = {
- "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
- "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
+ "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+ "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
# ... apply to all remaining entries
As per coding guidelines, "Do not use .get() for accessing dictionary keys if the code expects them to be present; use direct dictionary access dict[key] instead to allow proper error handling and fail fast with clear errors."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py` around
lines 200 - 251, Replace permissive .get() lookups with direct key access for
entries that must exist: change uses of
PROBLEM_CODE_REASONING_PROCESS.get("...","") inside DEFAULT_REASONING_PROCESS
and the SYNTHETIC_FAULT_CATEGORY_REASONING mapping to
PROBLEM_CODE_REASONING_PROCESS["..."] so typos raise KeyError and fail fast;
update every mapping entry (e.g., "power_ac_failure_recovery",
"ran_cell_site_down_recovery", "compute_vm_failure_recovery",
"transport_routing_flap_resolution", "signaling_routing_failure_recovery", etc.)
and DEFAULT_REASONING_PROCESS to use bracket indexing against
PROBLEM_CODE_REASONING_PROCESS.
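Complementing the bracket-indexing fix, a one-time guard after the mapping definitions would also catch the empty SOPs flagged in the previous comment. A sketch of such a check (a suggestion, not existing code):

```python
# Fail fast at import time: every synthetic category must resolve to a
# non-empty reasoning process. Direct indexing above already guarantees
# the legacy keys exist; this catches whitespace-only SOP bodies.
_empty = sorted(
    key for key, sop in SYNTHETIC_FAULT_CATEGORY_REASONING.items() if not sop.strip()
)
if _empty:
    raise ValueError(f"Synthetic categories mapped to empty SOPs: {_empty}")
```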
Actionable comments posted: 2
🧹 Nitpick comments (1)
docs/tutorials/posts/noc-reasoning-agent.md (1)
2-2: Stale date in front matter — should reflect 2026.
The post date 2025-09-05 predates the PR creation date (2026-02-13) by ~5 months. This will affect any ordering or display logic on the docs site.
✏️ Proposed fix
-date: 2025-09-05
+date: 2026-02-13
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/posts/noc-reasoning-agent.md` at line 2, Update the YAML front-matter `date:` field (currently "2025-09-05") in the document's top matter to a 2026 date consistent with the PR (e.g., "2026-02-13") so the post ordering/display logic reflects the correct publication time.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 344-348: The --weights_dir argument passed to the
create_agent_with_tools_batch.py invocation uses a relative path which will fail
if the current working directory isn't the project workspace; update the CLI in
noc-reasoning-agent.md so the --weights_dir points to the absolute model
location used when saving (or construct it from the workspace variable), e.g.
replace the relative "training/qwen3-32b-improved-hf" value with the absolute
path that was used when saving the model or a resolved ${WORKSPACE}/training/...
equivalent so create_agent_with_tools_batch.py can find the weights regardless
of caller CWD.
- Around line 154-167: The two run commands reuse the same --expname (currently
"--expname=gpt-oss-sdg-with-python"), which will cause experiment-tracking
collisions between Phase 1 and Phase 2; update each command's --expname so they
are unique (e.g., "--expname=gpt-oss-sdg-with-python-phase1" and
"--expname=gpt-oss-sdg-with-python-phase2") wherever the flag appears in the
tutorial commands to ensure separate job names, log dirs, and resume behavior.
---
Duplicate comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 240-241: Fix the typo "indicence" to "incidence" in the command
flag string shown in the diff; update the value passed to --log_dir (and any
matching occurrence, e.g., the --expname value "prep-sft-data-inci" if intended)
so the directory/experiment name uses "incidence" instead of "indicence" to
ensure consistent naming.
- Around line 239-253: The --run_after flag in the ns run_cmd invocation
references a non-existent experiment name "solution-generation"; update the
--run_after value (or remove the flag) to match an actual experiment ID defined
in your workflow/namespace so the job dependency is valid—locate the ns run_cmd
call in the docs (the block using ns run_cmd with --expname=prep-sft-data-inci)
and replace "solution-generation" with the correct experiment name or omit
--run_after if no dependency is needed.
- Around line 304-319: Fix the two issues in the ns run_cmd invocation: change
the --run_after value from "solution-generation" to the correct job name
"solution_generation" (underscore) so the pipeline will trigger at the right
step, and correct the typo in --log_dir and --expname where "indicence" appears
by renaming it to "incidence" (e.g.,
--log_dir=/workspace/prepare-test-data-incidence and
--expname=prep-test-data-inci -> ensure any matching identifiers use
"incidence"). Use the flags --run_after and --log_dir in the provided command to
locate and update the values.
- Around line 295-374: The Evaluation section is missing expected results and
dataset/benchmark details; update docs/tutorials/posts/noc-reasoning-agent.md
(under the "Evaluation" or after "Compare Results") to include: a short
"Expected results" subsection with example accuracy numbers for the fine-tuned
model (training/qwen3-32b-improved-hf) and baseline (Qwen/Qwen3-32B), exact
commands already shown (e.g. create_agent_with_tools_batch.py and
problem_code_evaluation.py) plus the metric names reported (close-code accuracy)
and example output lines, and a "Dataset notes" subsection describing any
special preparation flags used (prepare_data flags like
filters.remove_contaminated, add_unlabeled, filters.trim_solutions) and
non-standard inference args so readers can reproduce the benchmark.
- Around line 103-126: The commands in the tutorial overwrite the same CSV file
in-place which risks data loss; update the examples that call
src/filtering/filter_rows.py to avoid using the same path for --input_csv and
--output_csv (use distinct intermediate filenames like data/filtered_step1.csv,
data/filtered_step2.csv, etc.), or modify the script (functionality in
src/filtering/filter_rows.py that parses --input_csv and --output_csv) to
support a safe --inplace flag that writes to a temp file and atomically replaces
the original; ensure all example invocations reference unique output filenames
or document the new --inplace behavior.
---
Nitpick comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Line 2: Update the YAML front-matter `date:` field (currently "2025-09-05") in
the document's top matter to a 2026 date consistent with the PR (e.g.,
"2026-02-13") so the post ordering/display logic reflects the correct
publication time.
| --expname=gpt-oss-sdg-with-python \ | ||
| --model=openai/gpt-oss-120b \ | ||
| --server_gpus=8 \ | ||
| --output_dir=/workspace/outputs/sdg/ \ | ||
| --input_file=/workspace/outputs/input_incident.jsonl \ | ||
| ++prompt_config=/workspace/data/prompts/formatting_prompt.yaml \ | ||
| ++inference.tokens_to_generate=8192 \ | ||
| ++inference.temperature=0.6 \ | ||
| ++chat_template_kwargs.reasoning_effort=medium \ | ||
| ++inference.endpoint_type=text \ | ||
| ++code_execution=false \ | ||
| ++server.enable_soft_fail=True \ | ||
| ++skip_filled=False --rerun_done | ||
| ``` |
Both generation phases share the same --expname, which can cause experiment-tracking conflicts.
Phase 1 (Line 154) and Phase 2 (Line 191) both use --expname=gpt-oss-sdg-with-python. Nemo-Skills uses the experiment name for job naming, log directories, and resume logic. Reusing the same name for a separate phase risks collisions in the scheduler, ambiguous log attribution, or the second run being silently treated as a resume of the first. Use distinct names for each phase.
✏️ Proposed fix
- --expname=gpt-oss-sdg-with-python \ # Phase 1 (line 154)
+ --expname=gpt-oss-sdg-phase1 \
- --expname=gpt-oss-sdg-with-python \  # Phase 2 (line 191)
+ --expname=gpt-oss-sdg-phase2 \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 154 - 167, The two
run commands reuse the same --expname (currently
"--expname=gpt-oss-sdg-with-python"), which will cause experiment-tracking
collisions between Phase 1 and Phase 2; update each command's --expname so they
are unique (e.g., "--expname=gpt-oss-sdg-with-python-phase1" and
"--expname=gpt-oss-sdg-with-python-phase2") wherever the flag appears in the
tutorial commands to ensure separate job names, log dirs, and resume behavior.
| python src/create_agent_with_tools_batch.py \ | ||
| --input outputs/final_agent_input.jsonl \ | ||
| --output outputs/agent_responses.jsonl \ | ||
| --weights_dir training/qwen3-32b-improved-hf | ||
| ``` |
--weights_dir uses a relative path — will break if CWD ≠ /workspace.
Line 347 passes training/qwen3-32b-improved-hf as a relative path, but the model is saved to the absolute path /workspace/training/qwen3-32b-improved-hf (line 272). The create_agent_with_tools_batch.py script is invoked as a plain python command (not via ns run_cmd), so the effective CWD is the caller's shell, not /workspace. If the user is not cd'd into /workspace, the path will not resolve.
✏️ Proposed fix
python src/create_agent_with_tools_batch.py \
--input outputs/final_agent_input.jsonl \
--output outputs/agent_responses.jsonl \
- --weights_dir training/qwen3-32b-improved-hf
+ --weights_dir /workspace/training/qwen3-32b-improved-hf
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| python src/create_agent_with_tools_batch.py \ | |
| --input outputs/final_agent_input.jsonl \ | |
| --output outputs/agent_responses.jsonl \ | |
| --weights_dir training/qwen3-32b-improved-hf | |
| ``` | |
| python src/create_agent_with_tools_batch.py \ | |
| --input outputs/final_agent_input.jsonl \ | |
| --output outputs/agent_responses.jsonl \ | |
| --weights_dir /workspace/training/qwen3-32b-improved-hf |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 344 - 348, The
--weights_dir argument passed to the create_agent_with_tools_batch.py invocation
uses a relative path which will fail if the current working directory isn't the
project workspace; update the CLI in noc-reasoning-agent.md so the --weights_dir
points to the absolute model location used when saving (or construct it from the
workspace variable), e.g. replace the relative "training/qwen3-32b-improved-hf"
value with the absolute path that was used when saving the model or a resolved
${WORKSPACE}/training/... equivalent so create_agent_with_tools_batch.py can
find the weights regardless of caller CWD.
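If hard-coding the absolute path in the tutorial is undesirable, the script could instead resolve a relative --weights_dir against a workspace root. A sketch of that resolution logic (the WORKSPACE environment variable and its /workspace default are assumptions, not existing behavior):

```python
import os
from pathlib import Path


def resolve_weights_dir(weights_dir):
    """Resolve a possibly relative --weights_dir against the workspace root."""
    path = Path(weights_dir)
    if not path.is_absolute():
        # Assumed convention: fall back to /workspace when WORKSPACE is unset.
        path = Path(os.environ.get("WORKSPACE", "/workspace")) / path
    if not path.is_dir():
        raise FileNotFoundError(f"weights_dir not found: {path}")
    return path
```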
@gwarmstrong This is the Telco Team. Per our conversation two weeks ago, we would like to add this tutorial to create a reasoning model for Telco NOC incidents. Thanks