Conversation
Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com>
```
site/
site/
site/
site/
site/
```
site/ is duplicated 5 times
```
site/
site/
site/
site/
site/
site/
```
```
@@ -0,0 +1,377 @@
---
date: 2025-09-05
```
The date 2025-09-05 is in the future. Should be a valid past date
```python
skip = True
if skip and os.path.exists(output_path):
    os.remove(output_path)
```
The variable `skip` is hardcoded to `True`, which unconditionally deletes the output file on every run. This destroys the resume functionality mentioned on line 136. Either remove this code block or make `skip` a CLI argument (see the sketch below).
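A minimal sketch of the CLI-flag approach, assuming the script already builds an `argparse` parser; the flag name `--overwrite` is illustrative, not from the PR:

```python
import argparse
import os

parser = argparse.ArgumentParser()
# Hypothetical flag; defaults to False so the resume logic keeps working.
parser.add_argument(
    "--overwrite",
    action="store_true",
    help="Delete any existing output file instead of resuming from it.",
)
args = parser.parse_args()

output_path = "outputs/agent_responses.jsonl"  # placeholder path for the sketch
if args.overwrite and os.path.exists(output_path):
    os.remove(output_path)
# With --overwrite unset, load_processed_indices(output_path) can resume as intended.
```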
```python
from src.tools import (
    ALL_TOOLS,
    Check_Alarm_Status,
    Check_Apply_Configuration,
    Check_Element_Health,
    Check_Element_Neighbors,
    Check_External_Issues,
    Check_Performance,
    Check_remote_files,
    Create_Ticket,
    Execute_Remote_Action,
    Orchestration_tool,
    Triage_Toolkit_Tool,
)
```
These imports reference `src.tools`, but the script is located in `recipes/noc-reasoning-agent/scripts/`. Check that `PYTHONPATH` is set correctly before running; otherwise the imports will fail (see the sketch below).
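One hedged way to make the `src.tools` import resolve regardless of the working directory is a small `sys.path` bootstrap at the top of the script. The relative layout below is an assumption about where `src/` lives, not confirmed by the PR:

```python
import os
import sys

# Assumes src/ sits one directory above scripts/; adjust if the layout differs.
REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
if REPO_ROOT not in sys.path:
    sys.path.insert(0, REPO_ROOT)

from src.tools import ALL_TOOLS  # now resolvable without exporting PYTHONPATH
```

Alternatively, exporting `PYTHONPATH` to the recipe root before running has the same effect.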
```python
react_agent = create_react_agent(chat_llm, tools=[], checkpointer=checkpointer)
for index, row in tqdm(unprocessed_df.iterrows(), total=len(unprocessed_df), desc="Processing Rows"):
    # print(row)
    # exit()

    # --- Step 2: Get the row-specific tools using the helper function. ---
    # row_tools = get_bound_tools(row)

    config = {"configurable": {"thread_id": str(index)}}
    input_question = row["input"]
    system_message = row["system"]
    input_messages = [{"role": "system", "content": system_message}, {"role": "user", "content": input_question}]

    try:
        # We still invoke one-by-one, but the tool creation is now efficient.

        ## WE need to fix this. Add system message and tool calling messages
        # separate user input messages
        TOOLS = {
            "Check_Alarm_Status": lambda args: Check_Alarm_Status(**args),
            "Check_Element_Neighbors": lambda args: Check_Element_Neighbors(**args),
            "Check_Element_Health": lambda args: Check_Element_Health(**args),
            "Execute_Remote_Action": lambda args: Execute_Remote_Action(**args),
            "Check_External_Issues": lambda args: Check_External_Issues(**args),
            "Check_Apply_Configuration": lambda args: Check_Apply_Configuration(**args),
            "Check_Performance": lambda args: Check_Performance(**args),
            "Create_Ticket": lambda args: Create_Ticket(**args),
            "Orchestration_tool": lambda args: Orchestration_tool(**args),
            "Triage_Toolkit_Tool": lambda args: Triage_Toolkit_Tool(**args),
            "Check_remote_files": lambda args: Check_remote_files(**args),
        }
```
Created ReAct agent with empty `tools=[]` on line 174, then manually maps tool calls on lines 192-204. Consider using LangGraph's built-in tool binding (as shown in `create_agent_with_tools.py`) to simplify this logic; a sketch follows below.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
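A rough sketch of the built-in binding the comment refers to. The stub tool and the `chat_llm` variable are assumptions standing in for this script's real objects; treat this as an outline, not the file's actual code:

```python
from langchain_core.tools import tool
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent


@tool
def Check_Alarm_Status(site_or_element_id: str) -> str:
    """Return the alarm status for a site or element (stub for illustration)."""
    return "NotApplicable"


# chat_llm is assumed to be the LangChain chat model already built in this script.
tools = [Check_Alarm_Status]
checkpointer = MemorySaver()
# Passing the tools here lets LangGraph route tool calls itself,
# replacing the manual TOOLS-dict dispatch shown above.
react_agent = create_react_agent(chat_llm, tools=tools, checkpointer=checkpointer)
```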
Note: Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings, and resume or manage the review with the usual CodeRabbit commands and checkboxes.
📝 Walkthrough

Adds a complete NOC Reasoning Agent recipe: docs, prompts, configs, SFT data-prep, synthetic-data pipelines, domain tool stubs, ReAct agent runners (single & batch), numerous data-processing/evaluation/visualization scripts, schema/constants, and a recipe-level .gitignore; the top-level `.gitignore` also gains `site/` entries.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant DataPrep as Data Prep Pipeline
    participant SFT as SFT Config
    participant Model as Qwen Model
    participant Trainer as Training Engine
    User->>DataPrep: provide CSV incidents
    DataPrep->>DataPrep: filter, categorize, convert to JSONL
    DataPrep->>SFT: prepared JSONL manifest
    SFT->>SFT: apply filters, dedupe, shuffle, write manifest
    SFT->>Model: deliver formatted SFT manifest
    Model->>Trainer: tokenize & fine-tune (Megatron/vLLM)
    Trainer->>User: trained model artifact
```
```mermaid
sequenceDiagram
    participant Runner as Agent Runner
    participant Memory as Checkpointer
    participant LM as Language Model
    participant ToolExec as Tool Executor
    participant Tools as Domain Tools
    participant Output as JSONL Output
    Runner->>Memory: load processed indices / resume
    Runner->>LM: send system+user prompt
    LM->>Runner: return generation (tool_call or conclusion)
    alt tool_call
        Runner->>ToolExec: parse tool_call
        ToolExec->>Tools: execute bound tool(s)
        Tools->>ToolExec: return tool responses
        ToolExec->>Runner: provide observation
        Runner->>LM: append observation and continue
    else conclusion
        Runner->>Memory: mark processed
        Runner->>Output: append response (row_index, input, result)
    end
```
```mermaid
sequenceDiagram
    participant Eval as Evaluation Pipeline
    participant Metrics as Metric Calculators
    participant LLMJudge as LLM Judge
    participant Out as Output JSONL
    Eval->>Eval: read agent responses + expected
    Eval->>Metrics: compute ROUGE & BERTScore
    Metrics->>Eval: return metric scores
    Eval->>LLMJudge: request conclusion & reasoning score
    LLMJudge->>Eval: return score + rationale
    Eval->>Out: write augmented JSONL with metrics and judges
```
Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

Suggested reviewers

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

✅ Passed checks (2 passed)
Actionable comments posted: 7
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py`:
- Around line 132-135: The code currently hardcodes skip = True which causes
deletion of output_path every run and breaks resume logic; change this so skip
is not hardcoded—make it a CLI flag, environment variable, or default False—and
only call os.remove(output_path) when skip is explicitly set by the user; ensure
the block around skip, os.path.exists(output_path), and os.remove(output_path)
is gated by the configurable skip variable and leave
load_processed_indices(output_path) unchanged so resume logic
(load_processed_indices / processed_indices) can work when skip is False.
- Around line 156-157: The call to AutoModelForCausalLM.from_pretrained in
create_agent_with_tools_batch.py is using the wrong parameter name (dtype=) so
the intended precision is ignored; update the call to pass the dtype via
torch_dtype (e.g., torch_dtype=dtype) so the model loads in the specified
bfloat16/float16 precision, ensuring you modify the
AutoModelForCausalLM.from_pretrained(...) invocation accordingly.
In `@recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py`:
- Line 12: The parser.add_argument("--output_file", help="Path to output") can
leave args.output_file as None and cause pd.DataFrame.to_json(args.output_file,
...) to raise a ValueError; update the argument declaration
(parser.add_argument) to either set required=True or provide a sensible default
(e.g., default="evaluation_results.json"), and/or add a guard before writing
that validates args.output_file is not None and raises a clear error message;
ensure references to args.output_file and the DataFrame.to_json call are updated
accordingly.
In `@recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py`:
- Around line 153-166: The code currently uses two regex matches (m and
m_no_args) and then unconditionally calls m.groups(), which creates dead code
and a latent crash; instead, after performing the initial match m =
re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", ... , re.DOTALL), check if m is
truthy and if not immediately return the original string (or handle the error),
and only then call m.groups() to assign tool_name and raw_args; remove the
separate m_no_args branch entirely so empty brackets are handled by the first
regex and m.groups() is always called on a valid match.
In `@recipes/noc-reasoning-agent/scripts/utils/token_usage.py`:
- Around line 207-276: The __main__ block contains leftover debug logic that
hardcodes csv_file/model, calls AutoTokenizer directly (unsafe if transformers
missing), and calls exit() before main(), making main() unreachable; remove or
replace the entire if __name__ == "__main__": block so it simply parses
CLI/default args and invokes main(), ensure any tokenizer creation uses the
existing guarded import/creation logic (e.g., reuse the module's
AutoTokenizer-safe factory or wrap AutoTokenizer.from_pretrained in the same
try/except used elsewhere), eliminate hardcoded paths and stray exit() calls,
and wire csv_file, column_name, model, output_csv, output_plot into main()
parameters or argparse so the script runs via main() correctly.
In `@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`:
- Around line 26-71: The function parse_final_reasoning contains an early return
and debug prints that make the intended parsing logic unreachable; remove the
premature return and the print statements and restructure the function to (1)
first find the final "Finish" block using trace_block_match or last_index as a
fallback, (2) set finish_text deterministically (return a clear error string if
neither is found), (3) run the step_pattern regex against that finish_text to
extract Thought/Action/Observation entries, format them into parsed_steps, and
finally return the joined string; refer to parse_final_reasoning,
trace_block_match, last_index, finish_text, step_pattern, and parsed_steps when
making these changes.
- Around line 74-88: The function find_finish_action assigns finish_text only
when last_index != -1 but always returns finish_text, causing UnboundLocalError
when "Finish" isn't found; update find_finish_action to handle the not-found
case by returning a safe default (e.g., None or empty string) or raising a clear
exception, and ensure finish_text is defined in both branches (use last_index,
generation_text, and finish_text to locate/construct the return value or the
fallback).
🟠 Major comments (22)
recipes/noc-reasoning-agent/scripts/ns_pipelines/generate_synthetic_data.py-33-34 (1)
33-34: ⚠️ Potential issue | 🟠 Major

Stub silently does nothing — user gets no feedback when using a non-default `--llm`.

If a user passes `--llm` with any value other than `"qwen2.5-32b-instruct"`, `generate_synthetic_data_oss_gpt` is called and silently completes without action. This violates the principle that errors should not pass silently. Either raise `NotImplementedError` or restrict the `--llm` choices.

♻️ Suggested fix (option A: raise error)

```diff
 def generate_synthetic_data_oss_gpt(args, cluster, num_gpus):
-    pass
+    raise NotImplementedError("OSS GPT generation path is not yet implemented")
```

♻️ Suggested fix (option B: restrict choices)

```diff
-    parser.add_argument("--llm", type=str, default="qwen2.5-32b-instruct", help="The LLM to use for generation")
+    parser.add_argument(
+        "--llm", type=str, default="qwen2.5-32b-instruct",
+        choices=["qwen2.5-32b-instruct"],
+        help="The LLM to use for generation",
+    )
```

As per coding guidelines, "Follow the Zen of Python principles: … ensure errors never pass silently unless explicitly silenced".
Also applies to: 57-62
recipes/noc-reasoning-agent/scripts/ns_pipelines/generate_synthetic_data.py-43-55 (1)
43-55: ⚠️ Potential issue | 🟠 Major

`--gpus` argument is defined but never used; `--num_gpus` duplicates its purpose.

A user setting `--gpus 4` would have no effect — only `--num_gpus` is read (line 54). Additionally, line 55 misleadingly prints `"specified via --gpus"`. Remove the unused `--gpus` argument.

♻️ Suggested fix

```diff
-    # parser.add_argument('--gpus', type=int, default=detect_num_gpus(),
-    #                     help='Number of GPUs to use for generation')
-    parser.add_argument("--gpus", type=int, default=8, help="Number of GPUs to use for generation")
     parser.add_argument("--llm", type=str, default="qwen2.5-32b-instruct", help="The LLM to use for generation")
     parser.add_argument(
         "--num_gpus", type=int, default=8, help="Number of GPUs to use (auto-detected if not specified)"
     )
     args = parser.parse_args()

     cluster = "local"
     num_gpus = args.num_gpus
-    print(f"Using {num_gpus} GPUs (specified via --gpus)")
+    print(f"Using {num_gpus} GPUs (specified via --num_gpus)")
```

As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-47-56 (1)
47-56: 🛠️ Refactor suggestion | 🟠 Major

Compute all enrichment before writing the output file.
The current code reads, enriches, and writes line-by-line in a single pass. If an error occurs mid-processing (e.g., malformed JSON on line N), a partial output file is left behind. Separate the read/compute phase from the write phase.
Proposed fix
```diff
-enriched_rows = []
-with open(jsonl_path, "r", encoding="utf-8") as fin, open(output_jsonl_path, "w", encoding="utf-8") as fout:
+enriched_rows = []
+with open(jsonl_path, "r", encoding="utf-8") as fin:
     for line in fin:
         if not line.strip():
             continue
         row = json.loads(line)
         row = enrich_row(row, csv_idx)
         enriched_rows.append(row)
-        fout.write(json.dumps(row, ensure_ascii=False) + "\n")
+
+with open(output_jsonl_path, "w", encoding="utf-8") as fout:
+    for row in enriched_rows:
+        fout.write(json.dumps(row, ensure_ascii=False) + "\n")
```

As per coding guidelines, "perform all computations before re-opening files for writing to avoid accidental data loss if code fails during execution".
recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py-200-212 (1)
200-212: 🛠️ Refactor suggestion | 🟠 Major

Use direct dictionary access instead of `.get()` — key is guaranteed to exist.

`"Service-off"` is defined in `PROBLEM_CODE_REASONING_PROCESS` at line 2, so `.get("Service-off", "")` masks a potential future error if the key were removed. Direct access `["Service-off"]` will fail fast with a clear `KeyError`.

♻️ Proposed fix

```diff
-DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS.get("Service-off", "")
+DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS["Service-off"]

 SYNTHETIC_FAULT_CATEGORY_REASONING = {
-    "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
+    "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
 }
```

As per coding guidelines, "Do not use `.get()` for accessing dictionary keys if the code expects them to be present; use direct dictionary access `dict[key]` instead to allow proper error handling and fail fast with clear errors".

recipes/noc-reasoning-agent/scripts/visualization/extract_scores.py-35-42 (1)
35-42: 🛠️ Refactor suggestion | 🟠 Major

File is read and parsed twice; reuse the existing DataFrame.

The JSONL file is loaded identically at lines 9–12 and 36–39, and `df` from line 15 already holds the same data. Remove the second read and reuse the existing `df`.

♻️ Proposed fix

```diff
-# Load JSONL file into a list
-data = []
-with open(file_path, "r", encoding="utf-8") as f:
-    for line in f:
-        data.append(json.loads(line))
-
-# Convert to DataFrame
-df = pd.DataFrame(data)
-
 # Select only the relevant columns
 columns_to_display = ["expected_answer", "agent_response", "llm_judge_reason"]
```

recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py-121-125 (1)
121-125: ⚠️ Potential issue | 🟠 Major

BERTScore is computed per-row — extremely slow due to repeated model loading.

`bert_score()` is called once per row (line 122), which reloads the model each time. For a dataset of any meaningful size, this will be orders of magnitude slower than a single batched call. Collect all pairs first and call `bert_score()` once.

⚡ Sketch of batched approach

```python
# Collect all pairs first
references = []
candidates = []
for index, row in df.iterrows():
    # ... extract generated_reasoning_part and expected content ...
    candidates.append(generated_reasoning_part)
    references.append(conclusion_expected + reasoning_expected)

# Single batched BERTScore call
P, R, F1 = bert_score(candidates, references, lang="en", verbose=True)

# Then iterate again to assemble output rows with F1[i].item()
```

recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py-221-293 (1)
221-293: 🛠️ Refactor suggestion | 🟠 Major

File is read twice and contains ~35 lines of commented-out code.

The input file is read manually (lines 224–226) and then again via `pd.read_json` (line 233). The first read is entirely superseded by the DataFrame path (line 293 overwrites `data`). Remove the first read. Also, the large commented-out `selection_criteria` block (lines 246–280) and scattered debug comments should be cleaned up before merging.

recipes/noc-reasoning-agent/scripts/utils/split_incident_data.py-28-29 (1)
28-29: ⚠️ Potential issue | 🟠 Major

O(n) membership checks on lists — use sets for train/test partitions.

`train_set` and `test_set` are lists, so `number in train_set` (line 42) and `number in test_set` (line 57) are O(n) per lookup. With potentially thousands of incidents across multiple iteration files, this becomes a significant bottleneck.

⚡ Proposed fix

```diff
-    train_set = incidents[:n_train]
-    test_set = incidents[n_train:]
+    train_set = set(incidents[:n_train])
+    test_set = set(incidents[n_train:])
```
recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py-192-218 (1)
192-218:⚠️ Potential issue | 🟠 MajorUnescaped values injected into HTML summary.
Lines 198–199 insert
incident_data.get(...)values directly into the HTML without callingesc(), unlike the rest of the template. This is an XSS vector if incident data contains HTML-special characters.🛡️ Proposed fix
- <li>Category: {incident_data.get("incident_classification", incident_data.get("category"))}</li> - <li>Problem Code: {incident_data.get("fault_category", incident_data.get("u_problem_code"))}</li> + <li>Category: {esc(incident_data.get("incident_classification", incident_data.get("category")))}</li> + <li>Problem Code: {esc(incident_data.get("fault_category", incident_data.get("u_problem_code")))}</li>recipes/noc-reasoning-agent/prompts/prompt_incident.yaml-50-51 (1)
50-51:⚠️ Potential issue | 🟠 Major
\thinkshould be<think>to match the pipeline's expected tag format.The YAML prompt files use
\think, butsplit_incident_data.py(line 67) injectsbackground = "\n<think>\n"with angle brackets. The training configuration (noc_reasoning_sft.yaml,noc_reasoning_sft_6.yaml) also filters for and expects XML-style<think>and</think>tags (e.g.,"<think>thinking</think>summary"). In YAML block scalars,\thinkis treated as literal text, not an escape sequence, so it won't match the<think>tags injected by the pipeline. This issue also appears inprompt_reasoning.yaml(line 166) andshortened_prompt_reasoning.yaml(line 147).recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py-9-10 (1)
9-10:⚠️ Potential issue | 🟠 MajorRelative imports require package initialization; this will fail with ImportError.
from .reasoning_processes import ...andfrom .schema_columns import ...(lines 9-10) perform relative imports, which require the parent directory to be a Python package (with__init__.py). There is no__init__.pyinrecipes/noc-reasoning-agent/scripts/utils/,scripts/, or any parent directory. Running the script directly withpython create_input_jsonl_from_incidents.pywill raiseImportError: attempted relative import with no known parent package.All other scripts in this repository use absolute imports; switch to those or add
__init__.pyfiles to makeutils/andscripts/proper packages.recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py-93-98 (1)
93-98: 🛠️ Refactor suggestion | 🟠 MajorDuplicated CSV loading with overly broad
except Exception— extract a helper.The
try: read_csv(encoding="latin1") except Exception: read_csv()pattern is repeated four times. The broadexcept Exceptioncatches everything (e.g.,FileNotFoundError,MemoryError) when only encoding errors are expected.Extract a helper and narrow the exception:
Suggested refactor
+def _load_csv(path: str) -> pd.DataFrame: + """Load CSV, falling back from latin1 to default encoding.""" + try: + return pd.read_csv(path, encoding="latin1") + except UnicodeDecodeError: + return pd.read_csv(path) + + def filter_auto(input_csv, output_csv): """Filter incident data and save results to a new CSV.""" - try: - df = pd.read_csv(input_csv, encoding="latin1") - except Exception: - df = pd.read_csv(input_csv) + df = _load_csv(input_csv)As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."
Also applies to: 116-121, 131-135, 147-151
recipes/noc-reasoning-agent/scripts/tools.py-13-20 (1)
13-20:⚠️ Potential issue | 🟠 Major
Check_Alarm_Statusis missing the_safe_tool_valwrapper, unlike every other tool function.All other tool functions pass through
_safe_tool_val()to sanitize NaN/None values, butCheck_Alarm_Statusreturns the raw.get()result directly. If the row containsNaNfor the"Check_Alarm_Status"key,.get()returns theNaNvalue (since the key exists), which will not be JSON-safe.def Check_Alarm_Status(row: pd.Series, site_or_element_id: str) -> str: ... - return row.get("Check_Alarm_Status", "NotApplicable") + return _safe_tool_val(row.get("Check_Alarm_Status", "NotApplicable"))recipes/noc-reasoning-agent/scripts/ns_pipelines/prepare_react_agent.py-30-30 (1)
30-30: ⚠️ Potential issue | 🟠 Major

Replace `assert` with a proper check for runtime validation.

`assert` statements are stripped in optimized mode (`python -O`), so this validation would silently disappear. Use a conditional raise instead.

```diff
-    assert len(tool_calls) == len(tool_response)
+    if len(tool_calls) != len(tool_response):
+        raise ValueError(
+            f"Mismatched tool_call/tool_response counts: {len(tool_calls)} vs {len(tool_response)}"
+        )
```

As per coding guidelines, "ensure errors never pass silently unless explicitly silenced."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-28-35 (1)
28-35: ⚠️ Potential issue | 🟠 Major

Silent `except Exception: continue` swallows all errors during resume loading.

This catches and silently ignores any exception (including unexpected ones like `PermissionError`, `MemoryError`, etc.) while parsing the output file. Per the coding guidelines, don't catch exceptions not normally expected. The batch variant (`create_agent_with_tools_batch.py` line 69) correctly catches only `json.JSONDecodeError`.

```diff
-        except Exception:
-            # ignore malformed lines so we can keep going
-            continue
+        except json.JSONDecodeError:
+            print(f"[WARNING] Skipping malformed line in output file: {line}")
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-92-211 (1)
92-211: ⚠️ Potential issue | 🟠 Major

Heavy objects recreated on every loop iteration — significant performance waste.

`llm_with_tools`, `MemorySaver()`, and `create_react_agent(...)` are all created inside the per-row loop (lines 209-211). The tool closures are also redefined on every iteration (lines 102-205). Since the model and LLM wrappers don't change between rows, the agent and tool binding should be constructed once before the loop. The tool functions can capture `row` via a factory function to avoid the B023 closure issue flagged by static analysis, while still being defined once.

Suggested restructuring (sketch)

```diff
+    # Build agent once outside the loop
+    checkpointer = MemorySaver()
+
     for index, row in tqdm(df.iterrows()):
         if index in processed_indices:
             continue
         input_question = row.get("input", "")
         print(f"Processing input {index + 1}/{len(df)}")
-        @tool
-        def Check_Alarm_Status(site_or_element_id: str) -> str:
-            ...
-        # ... all tool definitions ...
-
-        tools = [...]
-        llm_with_tools = chat_llm.bind_tools(tools)
-        checkpointer = MemorySaver()
-        react_agent = create_react_agent(llm_with_tools, tools, checkpointer=checkpointer)
+        # Define row-bound tools using a factory or partial
+        tools = _make_tools(row)
+        llm_with_tools = chat_llm.bind_tools(tools)
+        react_agent = create_react_agent(llm_with_tools, tools, checkpointer=checkpointer)
```

Note: Reusing the `MemorySaver` outside the loop is safe; the `thread_id` in `config` already isolates per-row state.

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-10-12 (1)
10-12: ⚠️ Potential issue | 🟠 Major

`_incident_id` can silently return `None`, corrupting downstream data.

If neither `incident_identifier` nor `number` is present, this returns `None`. That `None` is then used as a dictionary key throughout (`responses[number]`, `formatted_steps_taken[number]`, `full_steps[number]`), silently merging unrelated incidents under a single `None` key. This is a data integrity issue.

Proposed fix

```diff
 def _incident_id(data):
     """Synthetic schema uses incident_identifier; legacy uses number."""
-    return data.get("incident_identifier") or data.get("number")
+    incident_id = data.get("incident_identifier") or data.get("number")
+    if incident_id is None:
+        raise ValueError(f"Missing incident identifier in data: {list(data.keys())}")
+    return incident_id
```

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-63-69 (1)
63-69: 🛠️ Refactor suggestion | 🟠 Major

Bare `Exception` catch silently swallows unexpected errors.

Line 66 catches all exceptions and just prints them, masking bugs in the parsing logic. Per coding guidelines, exceptions that are not normally expected should not be caught — let them propagate so issues surface clearly. The outer `JSONDecodeError` catch on line 68 is fine.

Proposed fix — remove the bare catch

```diff
     except json.JSONDecodeError as e:
         print(text)
         print(f"Error decoding JSON: {e}")
-    except Exception as e:
-        print(f"An unexpected error occurred: {e}")
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
except json.JSONDecodeError as e: print(text) print(f"Error decoding JSON: {e}") - except Exception as e: - print(f"An unexpected error occurred: {e}")As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-314-323 (1)
314-323: 🛠️ Refactor suggestion | 🟠 MajorBroad
except Exceptionsilently discards incidents on any error.This catches and suppresses all errors — including programming bugs in
token_convertingor data access issues — and quietly removes the affected incidents from the output. This makes debugging very difficult and violates the fail-fast principle.Consider catching only the specific expected exception (e.g.,
KeyError,ValueError) or at minimum, logging the incident data that caused the failure.Proposed fix
try: for i in range(len(steps_taken[number])): if steps_taken[number][i]["tool_call"]: steps_taken[number][i]["tool_call"] = token_converting( steps_taken[number][i]["tool_call"], model ) steps_taken[number][i]["thinking"] = reasoning_steps[number][steps_taken[number][i]["step_number"]] - except Exception as e: - print(e) + except (KeyError, ValueError) as e: + print(f"Error merging steps for incident {number}: {e}") broken_numbers.append(number)As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-445-448 (1)
445-448: ⚠️ Potential issue | 🟠 Major

Case C silently drops intermediate non-tool-call steps — likely a bug.

When a step has no `tool_call` and is not the final step, `step_text` is computed but never appended to `current_assistant_content`. The comment on line 447 states "We already added to current_assistant_content at the top of loop" but no such addition exists. This means intermediate reasoning steps are lost from the conversation context, and their thinking/actions won't appear in subsequent `background` fields.

If these steps should accumulate context, the content needs to be appended:

Proposed fix (if accumulation is intended)

```diff
             # --- CASE C: Intermediate Step (just accumulation) ---
             else:
-                # We already added to current_assistant_content at the top of loop
-                pass
+                # Accumulate intermediate thinking into the conversation context
+                current_assistant_content.append(
+                    {"role": "assistant", "content": step_text}
+                )
```

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-85-87 (1)
85-87: ⚠️ Potential issue | 🟠 Major

Silent data corruption when the `"final<|message|>"` delimiter is missing.

If `text` does not contain `"final<|message|>"`, `rfind` returns `-1`, and the slice becomes `text[15:]` (i.e., `-1 + 16`), silently producing garbage data for the thinking content with no warning.

Proposed fix

```diff
-            thinking = text[text.rfind("final<|message|>") + len("final<|message|>") :]
+            delimiter = "final<|message|>"
+            idx = text.rfind(delimiter)
+            if idx == -1:
+                print(f"Warning: delimiter '{delimiter}' not found for incident {number}, step {step_number}")
+                continue
+            thinking = text[idx + len(delimiter) :]
```

recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-476-489 (1)
476-489: 🛠️ Refactor suggestion | 🟠 Major

Same broad `except Exception` pattern — masks bugs during compilation.

Same issue as in `merge_reasoning_steps`. This silently counts programming errors as "incorrect incidents" and continues, making it hard to distinguish data issues from code bugs.

Proposed fix

```diff
-        except Exception as e:
-            print(f"Error for incident {number}: {e}")
+        except (KeyError, ValueError) as e:
+            print(f"Error for incident {number}: {e}", flush=True)
             incorrect_incidents += 1
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
- except Exception as e: - print(f"Error for incident {number}: {e}") + except (KeyError, ValueError) as e: + print(f"Error for incident {number}: {e}", flush=True) incorrect_incidents += 1As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
🟡 Minor comments (29)
recipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-94-94 (1)
94-94: ⚠️ Potential issue | 🟡 Minor

Output key `"problem code"` contains a space, inconsistent with all other keys.

All other output keys use underscores (e.g., `u_region2`, `short_description`). This looks like an accidental deviation. If intentional (matching downstream expectations), please confirm; otherwise rename to `"problem_code"`.

```diff
-            "problem code": d1.get("u_problem_code", ""),
+            "problem_code": d1.get("u_problem_code", ""),
```

recipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-60-71 (1)
60-71: ⚠️ Potential issue | 🟡 Minor

Same `.get("number")` and silent-skip concern in file2 loading.

Apply the same fix as suggested for the file1 block above.

Suggested fix

```diff
                 d = json.loads(line)
-                num = d.get("number")
-                if num:
-                    data2[num] = d.get("generation", "")
-            except json.JSONDecodeError as e:
-                print(f"Error decoding JSON in file2 on line {line_num}: '{line}'. Error: {e}")
+                num = d["number"]
+                data2[num] = d.get("generation", "")
+            except (json.JSONDecodeError, KeyError) as e:
+                raise ValueError(f"Error processing file2 on line {line_num}: '{line}'. Error: {e}") from e
```

recipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-46-57 (1)
.get().Lines 53 and 67 use
d.get("number")then silently skip records where the key is missing. If"number"is a required field in these JSONL files, direct access (d["number"]) would fail fast with a clearKeyError, which is preferable. Additionally, theJSONDecodeErrorcatch-and-print on lines 56–57 (and 70–71) silently drops malformed lines, which can mask data corruption.Suggested fix for file1 loading
d = json.loads(line) - num = d.get("number") - if num: - data1[num] = d - except json.JSONDecodeError as e: - print(f"Error decoding JSON in file1 on line {line_num}: '{line}'. Error: {e}") + num = d["number"] + data1[num] = d + except (json.JSONDecodeError, KeyError) as e: + raise ValueError(f"Error processing file1 on line {line_num}: '{line}'. Error: {e}") from eAs per coding guidelines, "Do not use
.get()for accessing dictionary keys if the code expects them to be present" and "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-60-64 (1)
60-64:⚠️ Potential issue | 🟡 MinorMisleading section comment: says
solved_categorybut code usesresolution_method/close_code.Line 60 says "Bucket by solved_category" but line 63 derives the category from
resolution_methodorclose_code, notsolved_category. This will confuse future readers.Proposed fix
-# ---------- Bucket by solved_category ---------- +# ---------- Bucket by resolution_method / close_code ----------recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-32-44 (1)
32-44:⚠️ Potential issue | 🟡 MinorUse direct dictionary access for guaranteed keys; validate
numis not None.Two issues in
enrich_row:
Line 33: If the row contains neither
"incident_identifier"nor"number",numsilently becomesNone, and the lookup silently fails — producing records withNonefields and no indication of a data problem. This should raise an error.Lines 41-43:
infois sourced fromcsv_idx(built on line 28 from the validated columnstime_to_resolve,solved_category,solved_reason), so these keys are guaranteed to exist. Per coding guidelines, use direct dictionary access instead of.get().Proposed fix
def enrich_row(row, lookup): - num = row.get("incident_identifier", row.get("number")) + if "incident_identifier" in row: + num = row["incident_identifier"] + elif "number" in row: + num = row["number"] + else: + raise KeyError(f"Row missing both 'incident_identifier' and 'number': {row}") info = lookup.get(num) if info is None: - # No match: set to None (or choose sensible defaults) row["time_to_resolve"] = None - row["solved_category"] = row.get("solved_category") # preserve if already present - row["solved_reason"] = row.get("solved_reason") + row["solved_category"] = row.get("solved_category") + row["solved_reason"] = row.get("solved_reason") else: - row["time_to_resolve"] = info.get("time_to_resolve") - row["solved_category"] = info.get("solved_category") - row["solved_reason"] = info.get("solved_reason") + row["time_to_resolve"] = info["time_to_resolve"] + row["solved_category"] = info["solved_category"] + row["solved_reason"] = info["solved_reason"] return rowAs per coding guidelines, "Do not use
.get()for accessing dictionary keys if the code expects them to be present; use direct dictionary accessdict[key]instead to allow proper error handling and fail fast with clear errors" and "ensure errors never pass silently unless explicitly silenced"..gitignore-52-56 (1)
52-56: ⚠️ Potential issue | 🟡 Minor

Remove duplicate `site/` entries.

Five identical `site/` ignore patterns were added. Git only needs one entry per pattern, so the duplicates serve no purpose and suggest an accidental copy-paste or merge error.

🧹 Proposed fix to remove duplicates

```diff
 .idea
 site/
-site/
-site/
-site/
-site/
```

recipes/noc-reasoning-agent/configs/config.ini-1-10 (1)
1-10: ⚠️ Potential issue | 🟡 Minor

Config references appear stale relative to the tutorial.

The `[download]` section lists `qwen2.5-14`, `qwen2.5-32`, `gpt-oss-120b`, and `nemotron-49b-1.5`, but the tutorial (noc-reasoning-agent.md) trains and evaluates with Qwen3-32B — which is absent here. Similarly, `original_data_path` points to `data/anonymized-Incidents_Last_6_Months.csv` while the tutorial instructs users to place data as `data/synthetic_incidents.csv`. If this config is meant to be consumed by scripts in this recipe, the mismatches could confuse users or break automation.
37-37:⚠️ Potential issue | 🟡 MinorMalformed sentence in
Create_Ticketdescription.
"Logs and routes tickets to departments (e.g., Field Ops) via the ticketing system. It (e.g., Jira/ServiceNow)."— the second sentence is a fragment with a dangling "It" and double space. This is part of the prompt the model will see.docs/tutorials/posts/noc-reasoning-agent.md-1-4 (1)
1-4: ⚠️ Potential issue | 🟡 Minor

Incorrect or backdated publication date.

`date: 2025-09-05` predates the PR creation (Feb 2026). If this is intentional for ordering purposes, a comment would help; otherwise, update to the actual publication date.
51-51: ⚠️ Potential issue | 🟡 Minor

Typo in tool name: `Check_Performancel()` → `Check_Performance()`.
docs/tutorials/posts/noc-reasoning-agent.md-352-365 (1)
352-365:⚠️ Potential issue | 🟡 MinorMissing expected evaluation results for tested models.
The tutorial says "You should see a meaningful improvement" but doesn't provide concrete accuracy numbers or expected ranges. Including representative results (e.g., baseline vs. fine-tuned close-code accuracy on a reference dataset) would make the tutorial more actionable and help users validate their setup. As per coding guidelines, "When adding new benchmarks, add documentation with example commands for how to run evaluation, expected results for tested models, and any dataset-specific details".
recipes/noc-reasoning-agent/prompts/formatting_prompt.yaml-17-17 (1)
17-17:⚠️ Potential issue | 🟡 MinorStray character in tool description.
Line 17 ends with a dangling "I" after the period:
"...cell status). I". This will be seen by the model and may degrade prompt quality.✏️ Proposed fix
- - Purpose: Polls to monitor the system, EMS/NM/Telemetry, and retrieve device health metrics (e.g., RU radiation, cell status). I + - Purpose: Polls to monitor the system, EMS/NM/Telemetry, and retrieve device health metrics (e.g., RU radiation, cell status).docs/tutorials/posts/noc-reasoning-agent.md-229-244 (1)
229-244: ⚠️ Potential issue | 🟡 Minor

Typo: "indicence" → "incidence".

`--log_dir=/workspace/prepare-sft-data-indicence` (line 231) and the equivalent at line 296 both contain the same typo.

✏️ Proposed fix

```diff
-    --log_dir=/workspace/prepare-sft-data-indicence \
+    --log_dir=/workspace/prepare-sft-data-incidence \
```
96-118:⚠️ Potential issue | 🟡 MinorIn-place file overwriting risks data loss.
The second, third, and fourth
filter_rows.pycalls use the same path for both--input_csvand--output_csv(data/filtered_file.csv). If the script reads the CSV lazily or fails mid-write, data is lost with no backup. Consider using distinct intermediate filenames or documenting that the script fully reads before writing.recipes/noc-reasoning-agent/scripts/visualization/extract_scores.py-44-53 (1)
44-53:⚠️ Potential issue | 🟡 MinorMissing guard for display columns — will crash with
KeyErrorif columns are absent.The metrics section (lines 20–24) defensively checks which columns exist, but
columns_to_displayis accessed directly without validation. If the JSONL doesn't contain one of["expected_answer", "agent_response", "llm_judge_reason"], this will raise an unguardedKeyError.🛡️ Proposed fix
columns_to_display = ["expected_answer", "agent_response", "llm_judge_reason"] +available_display = [c for c in columns_to_display if c in df.columns] +if not available_display: + print("No display columns found; skipping sample output.") +else: -df_subset = df[columns_to_display].head(10) + df_subset = df[available_display].head(10)recipes/noc-reasoning-agent/scripts/utils/split_incident_data.py-58-62 (1)
58-62:⚠️ Potential issue | 🟡 MinorDebug prints left in production code.
Lines 59 (
print(row["response"])), 60 (# exit()), and 62 (print("in")) are debug artifacts that will produce excessive output and should be removed.recipes/noc-reasoning-agent/prompts/prompt_incident.yaml-48-48 (1)
48-48:⚠️ Potential issue | 🟡 MinorClose code list contains both "Reset RU" and "RU Reset".
These appear to be duplicates with reversed word order. This could confuse the model during inference. Consider keeping only one canonical form.
recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py-144-144 (1)
144-144:⚠️ Potential issue | 🟡 MinorDebug
print(required_incidents)left in.This prints the full list of required incident IDs to stdout, which is noisy and potentially leaks data in production pipelines.
recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py-148-149 (1)
148-149: ⚠️ Potential issue | 🟡 Minor

Misleading success message.

Line 149 prints `"Evaluations saved to evaluations.jsonl"` regardless of the actual output path. Use the actual path:

🐛 Fix

```diff
-print("Evaluations saved to evaluations.jsonl")
+print(f"Evaluations saved to {args.output_file}")
```
96-99:⚠️ Potential issue | 🟡 Minor
--previewargument is parsed but never used.The
previewargument (line 97) is accepted but has no effect. Either implement the preview functionality or remove the argument.As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/filtering/match_keywords.py-254-261 (1)
254-261:⚠️ Potential issue | 🟡 MinorUnused
patternparameter – compiled regex patterns are dead code.
find_keyword_matchesaccepts apatternargument (the compiled regex) but never uses it. The regex patterns compiled on lines 281-282 are passed intofind_keyword_matcheson lines 290 and 296 but silently discarded. The function only performs an exactincheck against thekeywordslist.Either remove the
patternparameter and the regex compilation, or refactor the function to actually use the regex for substring matching (which would change semantics — matching keywords within a field vs. exact match).Option A: Remove unused pattern (keeps current exact-match behavior)
-def find_keyword_matches(row, pattern, keywords): +def find_keyword_matches(row, keywords): """Finds which specific keywords from a list match within a DataFrame row.""" # Synthetic schema uses resolution_method (same meaning as close_code) resolution_method = row.get("resolution_method", row.get("close_code", "")) if resolution_method and str(resolution_method).lower().replace(" ", "") in keywords: return True return FalseAnd update the call sites:
- hw_pattern = re.compile("|".join(re.escape(k) for k in hw_keywords), flags=re.IGNORECASE) - sw_pattern = re.compile("|".join(re.escape(k) for k in sw_keywords), flags=re.IGNORECASE) - ... - hw_matches = find_keyword_matches(row, hw_pattern, hw_keywords) + hw_matches = find_keyword_matches(row, hw_keywords) ... - sw_matches = find_keyword_matches(row, sw_pattern, sw_keywords) + sw_matches = find_keyword_matches(row, sw_keywords)Also applies to: 280-296
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-46-46 (1)
46-46:⚠️ Potential issue | 🟡 MinorMissing
required=Trueon--inputargument.
--inputhas no default and norequired=True, soargs.inputwill beNoneif omitted, causing aTypeErroronPath(args.input)at line 59.- parser.add_argument("--input", help="Path to the JSONL file containing inputs and answers") + parser.add_argument("--input", required=True, help="Path to the JSONL file containing inputs and answers")As per coding guidelines, "the code should fail if a required argument is not specified."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py-126-126 (1)
126-126:⚠️ Potential issue | 🟡 Minor
--limitargument lackstype=int, will fail with a confusing error on non-numeric input.
args.limitis a string by default. Line 141 doesint(args.limit), which raisesValueErrorfor non-numeric input. Declare the type on the argument:- parser.add_argument("--limit", required=False) + parser.add_argument("--limit", type=int, required=False, default=None)recipes/noc-reasoning-agent/configs/noc_reasoning_sft.yaml-58-58 (1)
58-58: ⚠️ Potential issue | 🟡 Minor

Comment contradicts the value: `remove_no_think_tags: false` but comment says "Enabled, as requested".

The value is `false` (disabled), but the comment reads "Enabled, as requested". This is confusing and likely a copy-paste error. If the filter should be enabled, change to `true`; otherwise, fix the comment.

```diff
-  remove_no_think_tags: false  # Enabled, as requested
+  remove_no_think_tags: true  # Enabled, as requested
```
180-188:⚠️ Potential issue | 🟡 MinorInvalid
filter_typesilently does nothing.If the user passes
--filter_type=foo, none of theelifbranches match and the script exits silently with no output. Per coding guidelines, unsupported arguments should cause a failure.Suggested fix
elif args.filter_type == "close_codes": filter_close_codes(args.input_csv, args.output_csv) + else: + parser.error(f"Unknown filter_type: {args.filter_type!r}. Choose from: auto, soft_solve, problem_codes, close_codes")Or better, use
choiceson the argument itself:- parser.add_argument("--filter_type", type=str, default="auto") + parser.add_argument("--filter_type", type=str, default="auto", + choices=["auto", "soft_solve", "problem_codes", "close_codes"])As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/ns_pipelines/prepare_react_agent.py-22-52 (1)
22-52:⚠️ Potential issue | 🟡 Minor
get_toolshas an inconsistent return type: tuple(None, None)vs. dict.On the error path (line 50), the function returns
(None, None)(a 2-tuple), but on the success path (line 52), it returnsmatches(a dict). The caller on line 73 assigns the result to a single variable and compares with== (None, None), which works incidentally but is fragile and confusing.Return
Noneon the empty case and checkif matches is Noneat the call site.Suggested fix
if not matches: - # print("No tools!") - return None, None - # print(matches) + return None return matchesAnd update the caller:
- matches = get_tools(d["initial_background"]) - if matches == (None, None): + matches = get_tools(d["initial_background"]) + if matches is None:recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-130-130 (1)
130-130: ⚠️ Potential issue | 🟡 Minor

Typo in user-facing message: "disgarded" → "discarded".

```diff
-    print(f"{incorrect_incidents} incidents were not parsed correctly and disgarded.")
+    print(f"{incorrect_incidents} incidents were not parsed correctly and discarded.")
```
46-52:⚠️ Potential issue | 🟡 MinorOff-by-one in
end_indexcheck:]-not-found case passes the guard.When
"]"is absent,rfindreturns-1, soend_indexbecomes0after the+ 1. The guardend_index != -1on line 52 evaluates toTrue, allowing a nonsensical slicetext[start_index:0](empty string) that falls through to aJSONDecodeErrorinstead of the intended error message on line 61.Proposed fix
- end_index = text.rfind("]") + 1 + end_index_raw = text.rfind("]") + end_index = end_index_raw + 1 - if start_index != -1 and end_index != -1: + if start_index != -1 and end_index_raw != -1:recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-505-511 (1)
505-511:⚠️ Potential issue | 🟡 MinorMissing validation:
output_dirandoutput_filecan beNonefor their respective modes.When
parse_types == "compile_reasoning",output_diris passed tocompile_reasoningwhich callsos.makedirs(output_dir, ...)— this crashes withTypeErrorifNone. Similarly,output_filecan beNoneforsteps_extraction. Add upfront validation like the existing check forreasoning_jsonl.Proposed fix
def main(jsonl_file, input_file, output_file, parse_types, reasoning_jsonl=None, output_dir=None): if parse_types == "steps_extraction": + if not output_file: + raise ValueError("Please specify an output file by specifying --output") prepare_data_for_reasoning_traces(jsonl_file, input_file, output_file) elif parse_types == "compile_reasoning": if not reasoning_jsonl: raise ValueError("Please specify a reasoning jsonl file by specifying --reasoning_jsonl") + if not output_dir: + raise ValueError("Please specify an output directory by specifying --output_dir") compile_reasoning(jsonl_file, input_file, output_dir, reasoning_jsonl)
```python
skip = True
if skip and os.path.exists(output_path):
    os.remove(output_path)
processed_indices = load_processed_indices(output_path)
```
Hardcoded skip = True deletes the output file on every run, completely breaking resume logic.
Lines 132-134 unconditionally delete the output file before load_processed_indices is called on line 135. This means the resume logic (lines 57-71, 144-148) will never find any previously processed rows — the file is always gone. All the resume infrastructure becomes dead code.
This looks like a debug artifact that was accidentally left in.
```diff
-    skip = True
-    if skip and os.path.exists(output_path):
-        os.remove(output_path)
```
processed_indices = load_processed_indices(output_path)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
    processed_indices = load_processed_indices(output_path)
```
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py` around
lines 132 - 135, The code currently hardcodes skip = True which causes deletion
of output_path every run and breaks resume logic; change this so skip is not
hardcoded—make it a CLI flag, environment variable, or default False—and only
call os.remove(output_path) when skip is explicitly set by the user; ensure the
block around skip, os.path.exists(output_path), and os.remove(output_path) is
gated by the configurable skip variable and leave
load_processed_indices(output_path) unchanged so resume logic
(load_processed_indices / processed_indices) can work when skip is False.
```python
model = AutoModelForCausalLM.from_pretrained(
    args.weights_dir, dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
```
🧩 Analysis chain
🏁 Script executed:
```bash
# Find and read the batch variant file
fd "create_agent_with_tools_batch.py" --type f
```

Repository: NVIDIA-NeMo/Skills
Length of output: 130
🏁 Script executed:
```bash
# Find the non-batch variant for comparison
fd "create_agent_with_tools.py" --type f
```

Repository: NVIDIA-NeMo/Skills
Length of output: 124
🌐 Web query:
AutoModelForCausalLM.from_pretrained torch_dtype parameter transformers library
💡 Result:
AutoModelForCausalLM.from_pretrained(..., torch_dtype=...) (Hugging Face Transformers) controls the PyTorch floating-point dtype the model is instantiated/loaded into, which can significantly reduce memory usage for large LMs.
Common values:
- A specific dtype (e.g. `torch.float16`, `torch.bfloat16`, `torch.float32`) to force loading in that dtype. Example: `AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)`.[1]
- `"auto"` to automatically choose a dtype based on the checkpoint/model metadata (the docs describe it as deriving dtype automatically, commonly from the weights / config where available). Example: `AutoModelForCausalLM.from_pretrained(..., torch_dtype="auto", device_map="auto")`.[2]
Notes:
- This is only available for floating dtypes (per the model docs’ “Model Instantiation dtype” section).[1]
- It's often paired with `device_map="auto"` when loading big models to fit across available devices.[2]
Sources:
[1] Transformers “Models” / “Model Instantiation dtype” documentation (examples using torch_dtype=torch.float16 and "auto", and noting floating dtypes only). (huggingface.co)
[2] Transformers “Loading models” guide showing AutoModelForCausalLM.from_pretrained(..., torch_dtype="auto", device_map="auto"). (huggingface.co)
Citations:
- 1: https://huggingface.co/docs/transformers/v4.15.0/en/main_classes/model?utm_source=openai
- 2: https://huggingface.co/docs/transformers/v4.53.1/models?utm_source=openai
🏁 Script executed:
```bash
cat -n recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py | sed -n '150,165p'
```

Repository: NVIDIA-NeMo/Skills
Length of output: 921
🏁 Script executed:
```bash
cat -n recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py | sed -n '70,80p'
```

Repository: NVIDIA-NeMo/Skills
Length of output: 499
Incorrect parameter name: `dtype=` should be `torch_dtype=`.

`AutoModelForCausalLM.from_pretrained()` uses `torch_dtype` as the parameter name, not `dtype`. Using `dtype` will silently ignore the parameter, and the model will load in the default precision (float32) instead of the intended bfloat16/float16, causing significant memory and performance degradation.

The non-batch variant (`create_agent_with_tools.py` line 74) correctly uses `torch_dtype=torch.float16`.
Fix

```diff
 model = AutoModelForCausalLM.from_pretrained(
-    args.weights_dir, dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
+    args.weights_dir, torch_dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
 )
```

📝 Committable suggestion
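As a quick sanity check after applying the fix, the loaded parameters should report the requested precision. This is an illustrative snippet, assuming `model` is the object loaded above:

```python
# If torch_dtype was honored, this prints e.g. torch.bfloat16 rather than torch.float32.
print(next(model.parameters()).dtype)
```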
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
    model = AutoModelForCausalLM.from_pretrained(
        args.weights_dir, torch_dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
    )
```
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py` around
lines 156 - 157, The call to AutoModelForCausalLM.from_pretrained in
create_agent_with_tools_batch.py is using the wrong parameter name (dtype=) so
the intended precision is ignored; update the call to pass the dtype via
torch_dtype (e.g., torch_dtype=dtype) so the model loads in the specified
bfloat16/float16 precision, ensuring you modify the
AutoModelForCausalLM.from_pretrained(...) invocation accordingly.
```python
# Parse arguments for input JSONL path
parser = argparse.ArgumentParser(description="Evaluation Pipeline for Agent Responses")
parser.add_argument("input_jsonl", help="Path to agent_responses.jsonl containing expected_answer and agent_response")
parser.add_argument("--output_file", help="Path to output")
```
`--output_file` is not required and has no default — the script will crash at line 148.

If `--output_file` is omitted, `args.output_file` is `None`, and `pd.DataFrame.to_json(None, ...)` will raise a `ValueError`. Add `required=True` or a sensible default.
🐛 Proposed fix
```diff
-    parser.add_argument("--output_file", help="Path to output")
+    parser.add_argument("--output_file", required=True, help="Path to output JSONL")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| parser.add_argument("--output_file", help="Path to output") | |
| parser.add_argument("--output_file", required=True, help="Path to output JSONL") |
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py` at
line 12, The parser.add_argument("--output_file", help="Path to output") can
leave args.output_file as None and cause pd.DataFrame.to_json(args.output_file,
...) to raise a ValueError; update the argument declaration
(parser.add_argument) to either set required=True or provide a sensible default
(e.g., default="evaluation_results.json"), and/or add a guard before writing
that validates args.output_file is not None and raises a clear error message;
ensure references to args.output_file and the DataFrame.to_json call are updated
accordingly.
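If required=True feels too strict, a minimal sketch of the default-plus-guard alternative (the derived filename convention here is just an illustration, not the script's current behavior):
```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser(description="Evaluation Pipeline for Agent Responses")
parser.add_argument("input_jsonl", help="Path to agent_responses.jsonl")
parser.add_argument("--output_file", default=None, help="Path to output JSONL")
args = parser.parse_args()

if args.output_file is None:
    # Fall back to <input stem>.eval.jsonl next to the input instead of crashing on None.
    args.output_file = str(Path(args.input_jsonl).with_suffix(".eval.jsonl"))
```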
| m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", str(string), re.DOTALL) | ||
|
|
||
| if not m: | ||
| # Handle case with no arguments, e.g., Check_Time[] | ||
| m_no_args = re.match(r"^\s*([A-Za-z_]\w*)\s*\[\s*\]\s*$", str(string)) | ||
| if m_no_args: | ||
| tool_name = m_no_args.group(1) | ||
| raw_args = "" | ||
| else: | ||
| # If it doesn't match the syntax, return original string or raise error | ||
| # returning string allows the LLM to fail gracefully or retry | ||
| return string | ||
|
|
||
| tool_name, raw_args = m.groups() |
Unreachable branch + latent crash: m.groups() on None if the dead-code path were taken.
The first regex \[(.*)\] with re.DOTALL already matches empty brackets Tool[], making the m_no_args branch (lines 157-160) dead code. More critically, if it were reached (e.g., due to a future regex change), execution falls through to line 166 which calls m.groups() on None — an AttributeError.
Restructure to eliminate the dead path and the latent crash:
Proposed fix
m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", str(string), re.DOTALL)
if not m:
- # Handle case with no arguments, e.g., Check_Time[]
- m_no_args = re.match(r"^\s*([A-Za-z_]\w*)\s*\[\s*\]\s*$", str(string))
- if m_no_args:
- tool_name = m_no_args.group(1)
- raw_args = ""
- else:
- # If it doesn't match the syntax, return original string or raise error
- # returning string allows the LLM to fail gracefully or retry
- return string
-
- tool_name, raw_args = m.groups()
+ return string
+
+ tool_name, raw_args = m.groups()
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py` around
lines 153 - 166, The code currently uses two regex matches (m and m_no_args) and
then unconditionally calls m.groups(), which creates dead code and a latent
crash; instead, after performing the initial match m =
re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", ... , re.DOTALL), check if m is
truthy and if not immediately return the original string (or handle the error),
and only then call m.groups() to assign tool_name and raw_args; remove the
separate m_no_args branch entirely so empty brackets are handled by the first
regex and m.groups() is always called on a valid match.
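To see that the single pattern already handles the empty-argument case the dead branch was written for, a quick demonstration (the argument string is made up; tool calls follow the recipe's Tool[args] syntax):
```python
import re

pattern = re.compile(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", re.DOTALL)

for call in ("Check_Time[]", "Check_Alarm_Status[element_id=NE-42]", "not a tool call"):
    m = pattern.match(call)
    if not m:
        print(f"unparsed, returned as-is: {call!r}")
        continue
    tool_name, raw_args = m.groups()
    print(tool_name, repr(raw_args))
# Check_Time ''
# Check_Alarm_Status 'element_id=NE-42'
# unparsed, returned as-is: 'not a tool call'
```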
| if __name__ == "__main__": | ||
| # === Configuration === | ||
| # csv_file = "data/top_10_categorized_incidents.csv" # Path to your CSV | ||
| csv_file = "data/prompt_incident.yaml" | ||
| column_name = "action_chronicle" # default; overridden after loading data | ||
| model = "openai/gpt-oss-120b" # Tokenizer model | ||
| output_csv = "outputs/work_notes_token_lengths.csv" | ||
| output_plot = "outputs/work_notes_token_hist.png" | ||
|
|
||
| # === Load Tokenizer === | ||
| tokenizer = AutoTokenizer.from_pretrained(model) | ||
| print(f"[info] Tokenizer loaded for model: {model}") | ||
|
|
||
| if csv_file.endswith((".yaml", ".yml")): | ||
| with open(csv_file, "r", encoding="utf-8") as f: | ||
| yaml_data = yaml.safe_load(f) | ||
|
|
||
| # You might want to join everything into one big string, or tokenize each value | ||
| if isinstance(yaml_data, dict): | ||
| prompt_text = yaml.dump(yaml_data) | ||
| else: | ||
| prompt_text = str(yaml_data) | ||
|
|
||
| token_count = len(tokenizer.encode(prompt_text, add_special_tokens=False)) | ||
| print(f"[info] YAML file {csv_file} contains {token_count} tokens.") | ||
| exit(0) | ||
|
|
||
| # === Load CSV === | ||
| df = pd.read_csv(csv_file, encoding="utf-8") | ||
| if column_name not in df.columns: | ||
| raise ValueError(f"Column '{column_name}' not found in CSV. Available columns: {df.columns.tolist()}") | ||
|
|
||
| # === Count Tokens === | ||
| token_lengths = [] | ||
| zero_tokens = 0 | ||
| for idx, text in tqdm(enumerate(df[column_name]), total=len(df), desc="Tokenizing work_notes"): | ||
| if pd.isna(text): # Handle NaN values | ||
| token_lengths.append(0) | ||
| zero_tokens += 1 | ||
| continue | ||
| # Count tokens for this row | ||
| token_count = len(tokenizer.encode(str(text), add_special_tokens=False)) | ||
| token_lengths.append(token_count) | ||
|
|
||
| # Count rows with 0 tokens | ||
| print(f"\n[info] Number of rows with 0 tokens: {zero_tokens}") | ||
|
|
||
| # Add results back to DataFrame | ||
| df["work_notes_token_length"] = token_lengths | ||
|
|
||
| # === Save CSV with token counts === | ||
| df.to_csv(output_csv, index=False) | ||
| print(f"[info] Token lengths saved to {output_csv}") | ||
|
|
||
| # === Summary Statistics === | ||
| print("\n=== Token Length Summary ===") | ||
| print(df["work_notes_token_length"].describe()) | ||
|
|
||
| # === Plot Histogram === | ||
| plt.figure(figsize=(10, 6)) | ||
| plt.hist(df["work_notes_token_length"], bins=50) | ||
| plt.title(f"Distribution of Token Lengths for {column_name}") | ||
| plt.xlabel("Token Length") | ||
| plt.ylabel("Frequency") | ||
| plt.tight_layout() | ||
| plt.savefig(output_plot, dpi=150) | ||
| plt.close() | ||
| print(f"[info] Histogram saved to {output_plot}") | ||
| exit() | ||
| main() |
main() is unreachable — debug code left in __main__ block.
exit() on line 275 terminates the process before main() on line 276 is ever called. The entire __main__ block (lines 207–275) appears to be leftover development code with hardcoded paths (data/prompt_incident.yaml, openai/gpt-oss-120b) and duplicated logic. It also bypasses the defensive AutoTokenizer import guard — line 217 will crash with TypeError if transformers is not installed.
Replace the block with a clean entry point:
🐛 Proposed fix
if __name__ == "__main__":
- # === Configuration ===
- # csv_file = "data/top_10_categorized_incidents.csv" # Path to your CSV
- csv_file = "data/prompt_incident.yaml"
- column_name = "action_chronicle" # default; overridden after loading data
- model = "openai/gpt-oss-120b" # Tokenizer model
- output_csv = "outputs/work_notes_token_lengths.csv"
- output_plot = "outputs/work_notes_token_hist.png"
-
- # === Load Tokenizer ===
- tokenizer = AutoTokenizer.from_pretrained(model)
- print(f"[info] Tokenizer loaded for model: {model}")
-
- if csv_file.endswith((".yaml", ".yml")):
- with open(csv_file, "r", encoding="utf-8") as f:
- yaml_data = yaml.safe_load(f)
-
- # You might want to join everything into one big string, or tokenize each value
- if isinstance(yaml_data, dict):
- prompt_text = yaml.dump(yaml_data)
- else:
- prompt_text = str(yaml_data)
-
- token_count = len(tokenizer.encode(prompt_text, add_special_tokens=False))
- print(f"[info] YAML file {csv_file} contains {token_count} tokens.")
- exit(0)
-
- # === Load CSV ===
- df = pd.read_csv(csv_file, encoding="utf-8")
- if column_name not in df.columns:
- raise ValueError(f"Column '{column_name}' not found in CSV. Available columns: {df.columns.tolist()}")
-
- # === Count Tokens ===
- token_lengths = []
- zero_tokens = 0
- for idx, text in tqdm(enumerate(df[column_name]), total=len(df), desc="Tokenizing work_notes"):
- if pd.isna(text): # Handle NaN values
- token_lengths.append(0)
- zero_tokens += 1
- continue
- # Count tokens for this row
- token_count = len(tokenizer.encode(str(text), add_special_tokens=False))
- token_lengths.append(token_count)
-
- # Count rows with 0 tokens
- print(f"\n[info] Number of rows with 0 tokens: {zero_tokens}")
-
- # Add results back to DataFrame
- df["work_notes_token_length"] = token_lengths
-
- # === Save CSV with token counts ===
- df.to_csv(output_csv, index=False)
- print(f"[info] Token lengths saved to {output_csv}")
-
- # === Summary Statistics ===
- print("\n=== Token Length Summary ===")
- print(df["work_notes_token_length"].describe())
-
- # === Plot Histogram ===
- plt.figure(figsize=(10, 6))
- plt.hist(df["work_notes_token_length"], bins=50)
- plt.title(f"Distribution of Token Lengths for {column_name}")
- plt.xlabel("Token Length")
- plt.ylabel("Frequency")
- plt.tight_layout()
- plt.savefig(output_plot, dpi=150)
- plt.close()
- print(f"[info] Histogram saved to {output_plot}")
- exit()
- main()
+ main()🧰 Tools
🪛 Ruff (0.15.0)
[warning] 237-237: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 242-242: Loop control variable idx not used within loop body
Rename unused idx to _idx
(B007)
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/utils/token_usage.py` around lines 207 -
276, The __main__ block contains leftover debug logic that hardcodes
csv_file/model, calls AutoTokenizer directly (unsafe if transformers missing),
and calls exit() before main(), making main() unreachable; remove or replace the
entire if __name__ == "__main__": block so it simply parses CLI/default args and
invokes main(), ensure any tokenizer creation uses the existing guarded
import/creation logic (e.g., reuse the module's AutoTokenizer-safe factory or
wrap AutoTokenizer.from_pretrained in the same try/except used elsewhere),
eliminate hardcoded paths and stray exit() calls, and wire csv_file,
column_name, model, output_csv, output_plot into main() parameters or argparse
so the script runs via main() correctly.
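A sketch of what the cleaned-up entry point could look like, assuming main() is refactored to accept the parsed arguments (flag names and defaults are illustrative):
```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Count tokens per row of a CSV column")
    parser.add_argument("--csv_file", required=True, help="Input CSV (or YAML prompt file)")
    parser.add_argument("--column_name", default="action_chronicle")
    parser.add_argument("--model", default="openai/gpt-oss-120b", help="Tokenizer model id")
    parser.add_argument("--output_csv", default="outputs/work_notes_token_lengths.csv")
    parser.add_argument("--output_plot", default="outputs/work_notes_token_hist.png")
    return parser

if __name__ == "__main__":
    main(build_parser().parse_args())  # assumes main() takes the parsed args
```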
| def parse_final_reasoning(generation_text): | ||
| """ | ||
| Parses the 'generation' field to extract only the final | ||
| Thought, Action, and Observation steps. | ||
| """ | ||
| # 1. Isolate the final, clean reasoning trace | ||
| # This regex looks for the block starting with "Question:" and ending with "Finish[...]" | ||
| trace_block_match = re.search( | ||
| r"Question:.*?(Finish\[.*?\])", | ||
| generation_text, | ||
| re.DOTALL, # Allows '.' to match newlines | ||
| ) | ||
|
|
||
| # Find the last occurrence of "Finish" | ||
| last_index = generation_text.rfind("Finish") | ||
|
|
||
| if last_index != -1: | ||
| finish_text = generation_text[last_index:].strip() | ||
| print(finish_text) | ||
| else: | ||
| print("No 'Finish' found!") | ||
|
|
||
| return finish_text | ||
|
|
||
| if not trace_block_match: | ||
| return "Final reasoning trace not found." | ||
|
|
||
| reasoning_trace = trace_block_match.group(0) | ||
|
|
||
| # 2. Extract each individual Thought, Action, and Observation | ||
| # This regex finds all lines starting with the keywords | ||
| step_pattern = re.compile( | ||
| r"^(Thought|Action|Observation)\s+\d+:\s*(.*)$", | ||
| re.MULTILINE, # Allows '^' to match the start of each line | ||
| ) | ||
|
|
||
| steps = step_pattern.findall(reasoning_trace) | ||
|
|
||
| # 3. Format the results for clarity | ||
| parsed_steps = [] | ||
| for step in steps: | ||
| kind = step[0] # "Thought", "Action", or "Observation" | ||
| content = step[1] # The text of the step | ||
| parsed_steps.append(f"**{kind}:** {content}") | ||
|
|
||
| return "\n".join(parsed_steps) |
Dead code: lines 50–71 are unreachable after return on line 48.
parse_final_reasoning always returns at line 48 (or crashes if last_index == -1 due to the same unbound-variable pattern). Everything from line 50 onward is dead code. Additionally, lines 44 and 46 contain debug print statements that should be removed.
Either remove the dead code or restructure the function so the intended logic is reachable.
🤖 Prompt for AI Agents
In
`@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`
around lines 26 - 71, The function parse_final_reasoning contains an early
return and debug prints that make the intended parsing logic unreachable; remove
the premature return and the print statements and restructure the function to
(1) first find the final "Finish" block using trace_block_match or last_index as
a fallback, (2) set finish_text deterministically (return a clear error string
if neither is found), (3) run the step_pattern regex against that finish_text to
extract Thought/Action/Observation entries, format them into parsed_steps, and
finally return the joined string; refer to parse_final_reasoning,
trace_block_match, last_index, finish_text, step_pattern, and parsed_steps when
making these changes.
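Putting the intended flow together, a sketch of the restructured function (same regexes as the original, with the early return and debug prints dropped):
```python
import re

def parse_final_reasoning(generation_text: str) -> str:
    """Extract only the final Thought/Action/Observation steps from a generation."""
    trace_block_match = re.search(
        r"Question:.*?(Finish\[.*?\])", generation_text, re.DOTALL
    )
    if not trace_block_match:
        return "Final reasoning trace not found."
    reasoning_trace = trace_block_match.group(0)
    step_pattern = re.compile(r"^(Thought|Action|Observation)\s+\d+:\s*(.*)$", re.MULTILINE)
    return "\n".join(
        f"**{kind}:** {content}" for kind, content in step_pattern.findall(reasoning_trace)
    )
```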
| def find_finish_action(generation_text): | ||
| # Regex to capture everything inside Finish[ ... ] | ||
| # match = re.search(r"Finish\[(.*?)\]", generation_text, re.DOTALL) | ||
|
|
||
| # finish_action = match.group(1).strip() | ||
| # return finish_action | ||
| last_index = generation_text.rfind("Finish") | ||
|
|
||
| if last_index != -1: | ||
| finish_text = generation_text[last_index:].strip() | ||
| # print(finish_text) | ||
| else: | ||
| print("No 'Finish' found!") | ||
|
|
||
| return finish_text |
UnboundLocalError when "Finish" is not found.
When generation_text doesn't contain "Finish", finish_text is never assigned, but line 88 returns it unconditionally. This will crash at runtime.
🐛 Proposed fix
def find_finish_action(generation_text):
- # Regex to capture everything inside Finish[ ... ]
- # match = re.search(r"Finish\[(.*?)\]", generation_text, re.DOTALL)
-
- # finish_action = match.group(1).strip()
- # return finish_action
last_index = generation_text.rfind("Finish")
-
if last_index != -1:
- finish_text = generation_text[last_index:].strip()
- # print(finish_text)
+ return generation_text[last_index:].strip()
else:
- print("No 'Finish' found!")
-
- return finish_text
+ return ""🤖 Prompt for AI Agents
In
`@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`
around lines 74 - 88, The function find_finish_action assigns finish_text only
when last_index != -1 but always returns finish_text, causing UnboundLocalError
when "Finish" isn't found; update find_finish_action to handle the not-found
case by returning a safe default (e.g., None or empty string) or raising a clear
exception, and ensure finish_text is defined in both branches (use last_index,
generation_text, and finish_text to locate/construct the return value or the
fallback).
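With the guard in place, both branches return a string; a quick check of the fixed helper:
```python
def find_finish_action(generation_text: str) -> str:
    last_index = generation_text.rfind("Finish")
    if last_index != -1:
        return generation_text[last_index:].strip()
    return ""

print(find_finish_action("Thought 3: resolved\nFinish[Resolved]"))  # Finish[Resolved]
print(find_finish_action("no terminal marker here"))                # prints empty string
```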
gwarmstrong
left a comment
Thanks for putting this together! A handful of things to fix to match our existing tutorial conventions.
| .codex | ||
|
|
||
| .idea | ||
| site/ |
Looks like site/ got added 5 times here — just needs the one entry.
| --- | ||
| date: 2025-09-05 | ||
| readtime: 30 | ||
| --- |
Our other tutorials all hide the table of contents. This is a pretty long tutorial, so it may make sense to keep it anyway. However, double-check that all of your heading levels are correct. For example, "Phase 1:" and "Parse and Format Steps" are currently the same heading level, which reads as disjoint, since "Phase 1" and "Phase 2" are more than likely meant to sit at the same level as each other, one level above their sub-steps.
If you decide to remove the table of contents altogether, you can add:
hide:
  - toc
|
||
| ```shell | ||
| pip install git+https://github.com/NVIDIA-NeMo/Skills.git | ||
| ns setup |
Nit: our convention is ```bash for shell commands (not ```shell). Applies to all ~16 code blocks in the tutorial.
| ``` | ||
|
|
||
| When prompted during `ns setup`, define a working folder as `/workspace` (e.g. mount `/your/project/dir:/workspace`). This folder will be used in all subsequent commands. For more details, see the [Nemo-Skills configs](https://nvidia-nemo.github.io/Skills/basics/cluster-configs/) documentation. | ||
|
|
Links to our own docs should use relative paths so they work with mkdocs serve and don't break if the site URL changes. Applies in a few places:
- [Nemo-Skills configs](../../basics/cluster-configs.md)
- [generation](../../pipelines/generation.md)
- [Nemo-Skills training](../../pipelines/training.md)
- [start-server pipeline](../../pipelines/start-server.md)
- [Python API](../../pipelines/index.md#python-interface)
| # Teaching a Model to Reason Over Telecom Network Incidents | ||
|
|
||
| This tutorial walks you through a complete pipeline for fine-tuning a reasoning model that can autonomously diagnose and resolve telecom network incidents. Using [Nemo-Skills](https://nvidia-nemo.github.io/Skills/) together with a [NoC Reasoning Agent](https://github.com/aiden200/NoC_Reasoning_Agent), we will take [Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) and teach it to perform step-by-step root-cause analysis with tool-calling — the same workflow a human NOC (Network Operations Center) engineer follows today. | ||
|
|
External links (HuggingFace, GitHub repos, etc.) should open in a new tab. Just append {target="_blank"} — e.g.:
[Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B){target="_blank"}
Applies to all external links: Qwen3-32B, gpt-oss-120b, NeMo-RL, NoC Reasoning Agent repo, etc.
| --log_dir=/workspace/prepare-sft-data-indicence \ | ||
| --expname=prep-sft-data-inci \ | ||
| --run_after=solution-generation \ | ||
| --cluster=local \ |
The ns run_cmd blocks for preparing SFT data and test data both use --run_after=solution-generation, but there's no prior command with --expname=solution-generation. The ns generate commands use --expname=gpt-oss-sdg-with-python.
|
|
||
| ```shell | ||
| ns run_cmd \ | ||
| --log_dir=/workspace/prepare-sft-data-indicence \ |
Small typo — "indicence", I assume it should be "incidence" or "incidents". Shows up in both the SFT data and test data --log_dir values (lines 231, 296).
| @@ -0,0 +1,214 @@ | |||
| # Byte-compiled / optimized / DLL files | |||
This is 214 lines and looks like the full GitHub Python .gitignore template — includes entries for Django, Flask, Scrapy, SageMath, Celery, Marimo, etc. The root repo already has a comprehensive .gitignore.
Was this intentionally included for the recipe, or did it get auto-generated? I would think this could get entirely removed.
|
|
||
| ```shell | ||
| ns run_cmd --expname=prepare-noc --log_dir=/workspace/prepare-noc --cluster=local \ | ||
| 'cd /workspace && \ |
The setup downloads 16 files via wget and copies them into /workspace. But since these are all read-only (scripts, prompts, configs), do they actually need to be copied? The recipe files already live in the repo under recipes/noc-reasoning-agent/, so the tutorial commands could just reference them directly from there.
If the user has nemo-skills cloned, these files are already at /nemo_run/code/recipes/noc-reasoning-agent/ inside the container (see code packaging docs). So the ns generate commands could point to e.g. ++prompt_config=/nemo_run/code/recipes/noc-reasoning-agent/prompts/formatting_prompt.yaml without any setup step at all.
For the case where someone wants to customize a prompt, the prompt format docs already explain how ++prompt_config= works with custom YAML files — maybe just link to that instead of having the user copy everything upfront.
This would eliminate the entire ns run_cmd setup block and make the tutorial much easier to follow.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 286-365: Update the "Evaluation" section to include concrete
expected metrics and any dataset-specific notes: add a short table or bullet
stating observed baseline vs fine-tuned close-code accuracy (e.g., "baseline X%
vs fine-tuned Y% on outputs/final_agent_input.jsonl"), mention which evaluation
script is used (problem_code_evaluation.py) and which output files correspond to
each run (outputs/agent_responses.jsonl and
outputs/baseline_agent_responses.jsonl), and note any special preparation or
non-standard inference flags required (e.g., tokenizer, filters.trim_solutions,
or custom weights_dir) so readers can reproduce and validate the reported
numbers.
| ## Evaluation | ||
|
|
||
| To evaluate the fine-tuned model, we use a ReAct (Reasoning + Acting) agent that calls NOC tools at each step, then compare its incident resolution accuracy against the baseline model. | ||
|
|
||
| ### Prepare Test Data | ||
|
|
||
| Prepare the test set in the same format as training: | ||
|
|
||
| ```shell | ||
| ns run_cmd \ | ||
| --log_dir=/workspace/prepare-test-data-indicence \ | ||
| --expname=prep-test-data-inci \ | ||
| --run_after=solution-generation \ | ||
| --cluster=local \ | ||
| 'python -m nemo_skills.training.prepare_data \ | ||
| --config-path /workspace/data \ | ||
| --config-name noc_reasoning_sft \ | ||
| input_files=/workspace/outputs/testing_data_split.jsonl \ | ||
| output_path=/workspace/outputs/sft-test-incidence.jsonl \ | ||
| prompt_config=/workspace/data/prompts/prompt_incident.yaml \ | ||
| tokenizer=Qwen/Qwen3-32B \ | ||
| filters.remove_contaminated=false \ | ||
| add_unlabeled=true \ | ||
| filters.trim_solutions=false' | ||
| ``` | ||
|
|
||
| ### Build Agent Input | ||
|
|
||
| Create the ReAct agent input file containing incident prompts with tool response data: | ||
|
|
||
| ```shell | ||
| python src/ns_pipelines/prepare_react_agent.py \ | ||
| outputs/testing_data_split.jsonl \ | ||
| outputs/sft-test-incidence.jsonl \ | ||
| --output outputs/final_agent_input.jsonl \ | ||
| --prompt_config data/prompts/prompt_incident.yaml | ||
| ``` | ||
|
|
||
| ### Install Agent Dependencies | ||
|
|
||
| Install the additional libraries needed for the ReAct agent: | ||
|
|
||
| ```shell | ||
| pip install --upgrade langgraph langchain langchain-huggingface transformers torch accelerate pandas | ||
| ``` | ||
|
|
||
| ### Run the Fine-Tuned Agent | ||
|
|
||
| ```shell | ||
| PYTHONPATH=$PWD python src/create_agent_with_tools_batch.py \ | ||
| --input outputs/final_agent_input.jsonl \ | ||
| --output outputs/agent_responses.jsonl \ | ||
| --weights_dir training/qwen3-32b-improved-hf | ||
| ``` | ||
|
|
||
| ### Run the Baseline Agent | ||
|
|
||
| For comparison, run the same evaluation using the original (non-fine-tuned) model: | ||
|
|
||
| ```shell | ||
| PYTHONPATH=$PWD python src/create_agent_with_tools_batch.py \ | ||
| --input outputs/final_agent_input.jsonl \ | ||
| --output outputs/baseline_agent_responses.jsonl \ | ||
| --weights_dir Qwen/Qwen3-32B | ||
| ``` | ||
|
|
||
| ### Compare Results | ||
|
|
||
| Evaluate both models by computing close-code accuracy (how often the model selects the correct resolution method): | ||
|
|
||
| ```shell | ||
| # Fine-tuned model | ||
| python src/evaluation/problem_code_evaluation.py outputs/agent_responses.jsonl | ||
|
|
||
| # Baseline model | ||
| python src/evaluation/problem_code_evaluation.py outputs/baseline_agent_responses.jsonl | ||
| ``` | ||
|
|
||
| The evaluation script matches the model's predicted close code against the expected answer using synonym-aware matching (e.g. "Resolved" and "Issue Corrected" are both recognized). You should see a meaningful improvement in the fine-tuned model's accuracy compared to the baseline. | ||
|
|
Add concrete expected results for the evaluation section.
The guideline requires expected results for tested models. Please include target/observed metrics (e.g., baseline close‑code accuracy vs. fine‑tuned accuracy on the provided dataset) so readers can validate runs. As per coding guidelines, “When adding new benchmarks, add documentation with example commands for how to run evaluation, expected results for tested models, and any dataset-specific details like special preparation arguments or non-standard inference arguments.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 286 - 365, Update
the "Evaluation" section to include concrete expected metrics and any
dataset-specific notes: add a short table or bullet stating observed baseline vs
fine-tuned close-code accuracy (e.g., "baseline X% vs fine-tuned Y% on
outputs/final_agent_input.jsonl"), mention which evaluation script is used
(problem_code_evaluation.py) and which output files correspond to each run
(outputs/agent_responses.jsonl and outputs/baseline_agent_responses.jsonl), and
note any special preparation or non-standard inference flags required (e.g.,
tokenizer, filters.trim_solutions, or custom weights_dir) so readers can
reproduce and validate the reported numbers.
…ripts Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
91a3fe6 to d11100c (Compare)
Actionable comments posted: 5
🧹 Nitpick comments (4)
recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py (4)
82-82: --limit should use type=int in argparse.
--limit is declared without a type, so it arrives as a string. Line 150 manually casts with int(args.limit), which will raise an unhelpful ValueError on non-numeric input. Using type=int in argparse gives the user a proper error message at parse time.
Proposed fix
- parser.add_argument("--limit", required=False, help="Max number of rows to export") + parser.add_argument("--limit", required=False, type=int, help="Max number of rows to export")Then simplify the usage:
- if args.limit is not None and int(args.limit) > 0:
-     df = df.head(int(args.limit))
+ if args.limit is not None and args.limit > 0:
+     df = df.head(args.limit)
Also applies to: 149-151
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` at line 82, Change the argparse declaration for --limit to accept integers by adding type=int (and optionally default=None) so argparse validates numeric input; then remove the manual cast int(args.limit) and use args.limit directly where used (e.g., in the code that slices or limits rows), updating references in this module (create_input_jsonl_from_incidents.py) to use the parsed integer value instead of re-casting a string.
112-118: Broad except Exception on datetime parsing masks real errors.
While datetime parsing on messy data can legitimately fail, catching all Exception types is too broad (e.g., it would hide MemoryError or KeyboardInterrupt). Consider catching (ValueError, TypeError) specifically, and logging the exception type. Ruff also flags this (BLE001).
Proposed fix
Exceptiontypes is too broad (e.g., it would hideMemoryErrororKeyboardInterrupt). Consider catching(ValueError, TypeError)specifically, and logging the exception type. Ruff also flags this (BLE001).Proposed fix
- except Exception as e: - print(f"Failed to parse datetime columns {start_col}, {end_col}: {e}") + except (ValueError, TypeError, pd.errors.ParserError) as e: + print(f"WARNING: Failed to parse datetime columns {start_col}, {end_col}: {e}", file=sys.stderr)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` around lines 112 - 118, The try/except around the datetime conversion is too broad and should not catch all Exception types; update the block that converts df[start_col] and df[end_col] and computes df["time_to_resolve"] (in create_input_jsonl_from_incidents.py) to catch only ValueError and TypeError, and ensure the logged message includes the exception type and message (instead of a bare except Exception). Keep the fallback that sets df["time_to_resolve"] = 0 on those parse errors and preserve the existing log text/context while adding the exception type information.
69-75: Missing type hints on function parameters.Per coding guidelines, use type hints for simple types. Both
dfandnumhave clear types.Proposed fix
-def extract_examples_by_problem_code(df, num): +def extract_examples_by_problem_code(df: pd.DataFrame, num: int) -> pd.DataFrame:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` around lines 69 - 75, The function extract_examples_by_problem_code should declare types: change its signature to accept df: pandas.DataFrame (pd.DataFrame) and num: int and return pd.DataFrame; update the signature of extract_examples_by_problem_code to include these type hints and add the return annotation -> pd.DataFrame so linters and type checkers know the expected types (ensure pd is imported from pandas where the file already uses it).
9-14:try/except ImportErrorfor sibling module resolution is fragile.This pattern silently falls back to
src.utils.*when a relative import fails, coupling the script to a specific directory layout. Consider documenting the expectedPYTHONPATHsetup or using a single import style.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py` around lines 9 - 14, The try/except ImportError fall-back between relative and absolute imports is fragile; replace the dual import block with a single, explicit import style (e.g., use absolute imports or a consistent relative import) for get_reasoning_process_for_fault_category, INCIDENT_ID_COLUMN, and REQUIRED_COLUMNS so the module resolution is deterministic, and add a short comment documenting the expected PYTHONPATH/packaging if you choose absolute imports or update to a proper package-relative import if this file is intended to be part of the package.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 96-119: The tutorial currently runs filter_rows.py with the same
path for --input_csv and --output_csv (data/filtered_file.csv) for filter_type
values soft_solve and problem_codes, which risks corrupting the input if the
script fails mid-write; update the example commands to use distinct intermediate
output filenames (e.g., data/filtered_soft_solve.csv,
data/filtered_problem_codes.csv) or explicitly document the risk and recommend
using temporary files and atomic moves, referencing the filter_rows.py
invocation and the --input_csv/--output_csv parameters so readers can spot and
change the commands.
- Around line 233-234: Fix the typo in the log directory flags used in the
preparation commands: replace occurrences of
"--log_dir=/workspace/prepare-sft-data-indicence" and
"--log_dir=/workspace/prepare-test-data-indicence" with
"--log_dir=/workspace/prepare-sft-data-incidence" and
"--log_dir=/workspace/prepare-test-data-incidence" respectively so both commands
use "incidence" instead of "indicence".
- Around line 232-246: The flag --run_after=solution-generation references a
non-existent experiment and should be removed or replaced with a valid
experiment name; update the ns run_cmd invocation that contains
--run_after=solution-generation to either delete the --run_after argument or set
it to an existing experiment such as --run_after=gpt-oss-sdg-with-python or
--run_after=prepare-noc so the pipeline will correctly sequence (look for the ns
run_cmd block and the --run_after flag in the noc-reasoning-agent tutorial).
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Around line 59-66: The broad try/except in coerce_to_str is unnecessary and
hides real errors; remove the try/except and directly return str(value) after
the pd.isna check so any genuine exceptions (e.g., a broken __str__) will
propagate; update the function coerce_to_str to simply check pd.isna(value) then
return str(value) without catching Exception.
- Around line 142-147: Remove the stray debug print that dumps
required_incidents to stdout: in the block that reads args.required_incidents
(the code that opens args.required_incidents, populates required_incidents, and
then filters df by id_col), delete the print(required_incidents) call (or
replace it with a debug-level logger call if you need it for troubleshooting) so
large lists are not printed to stdout before filtering.
---
Duplicate comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 354-366: Update the "Compare Results" section to include concrete
expected accuracy numbers and dataset details: mention approximate baseline vs
fine-tuned close-code accuracy (e.g., "Baseline ~X% vs Fine-tuned ~Y% on the
sample dataset"), note any synonym matching behavior of
src/evaluation/problem_code_evaluation.py, and provide any special dataset prep
or inference args needed to reproduce the numbers; reference the example
commands that run src/evaluation/problem_code_evaluation.py on
outputs/agent_responses.jsonl and outputs/baseline_agent_responses.jsonl so
readers can validate their runs against the stated expected metrics.
---
Nitpick comments:
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Line 82: Change the argparse declaration for --limit to accept integers by
adding type=int (and optionally default=None) so argparse validates numeric
input; then remove the manual cast int(args.limit) and use args.limit directly
where used (e.g., in the code that slices or limits rows), updating references
in this module (create_input_jsonl_from_incidents.py) to use the parsed integer
value instead of re-casting a string.
- Around line 112-118: The try/except around the datetime conversion is too
broad and should not catch all Exception types; update the block that converts
df[start_col] and df[end_col] and computes df["time_to_resolve"] (in
create_input_jsonl_from_incidents.py) to catch only ValueError and TypeError,
and ensure the logged message includes the exception type and message (instead
of a bare except Exception). Keep the fallback that sets df["time_to_resolve"] =
0 on those parse errors and preserve the existing log text/context while adding
the exception type information.
- Around line 69-75: The function extract_examples_by_problem_code should
declare types: change its signature to accept df: pandas.DataFrame
(pd.DataFrame) and num: int and return pd.DataFrame; update the signature of
extract_examples_by_problem_code to include these type hints and add the return
annotation -> pd.DataFrame so linters and type checkers know the expected types
(ensure pd is imported from pandas where the file already uses it).
- Around line 9-14: The try/except ImportError fall-back between relative and
absolute imports is fragile; replace the dual import block with a single,
explicit import style (e.g., use absolute imports or a consistent relative
import) for get_reasoning_process_for_fault_category, INCIDENT_ID_COLUMN, and
REQUIRED_COLUMNS so the module resolution is deterministic, and add a short
comment documenting the expected PYTHONPATH/packaging if you choose absolute
imports or update to a proper package-relative import if this file is intended
to be part of the package.
| ```shell | ||
| # Remove auto-recovered incidents | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/categorized_incidents.csv \ | ||
| --output_csv data/filtered_file.csv \ | ||
| --filter_type auto | ||
|
|
||
| # Keep only remotely-solvable incidents | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/filtered_file.csv \ | ||
| --output_csv data/filtered_file.csv \ | ||
| --filter_type soft_solve | ||
|
|
||
| # Keep top 16 fault categories | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/filtered_file.csv \ | ||
| --output_csv data/filtered_file.csv \ | ||
| --filter_type problem_codes | ||
|
|
||
| # Keep top 10 resolution methods | ||
| python src/filtering/filter_rows.py \ | ||
| --input_csv data/filtered_file.csv \ | ||
| --output_csv data/finalized_dataset.csv \ | ||
| --filter_type close_codes |
In-place CSV overwrite may lose data on failure.
Lines 104–107 and 110–113 use the same file (data/filtered_file.csv) as both --input_csv and --output_csv. If the script fails mid-write, the input data is lost. Consider using distinct intermediate filenames or documenting this risk.
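If keeping a single working filename is preferred, a temp-file-plus-atomic-rename sketch avoids the mid-write corruption (the helper name is made up for illustration):
```python
import os
import tempfile
import pandas as pd

def write_csv_atomically(df: pd.DataFrame, output_csv: str) -> None:
    """Write to a temp file in the target directory, then atomically swap it in."""
    directory = os.path.dirname(output_csv) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as f:
            df.to_csv(f, index=False)
        os.replace(tmp_path, output_csv)  # atomic, so input_csv == output_csv is safe
    except BaseException:
        os.remove(tmp_path)
        raise
```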
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 96 - 119, The
tutorial currently runs filter_rows.py with the same path for --input_csv and
--output_csv (data/filtered_file.csv) for filter_type values soft_solve and
problem_codes, which risks corrupting the input if the script fails mid-write;
update the example commands to use distinct intermediate output filenames (e.g.,
data/filtered_soft_solve.csv, data/filtered_problem_codes.csv) or explicitly
document the risk and recommend using temporary files and atomic moves,
referencing the filter_rows.py invocation and the --input_csv/--output_csv
parameters so readers can spot and change the commands.
| ns run_cmd \ | ||
| --log_dir=/workspace/prepare-sft-data-indicence \ | ||
| --expname=prep-sft-data-inci \ | ||
| --run_after=solution-generation \ | ||
| --cluster=local \ | ||
| 'python -m nemo_skills.training.prepare_data \ | ||
| --config-path /workspace/data \ | ||
| --config-name noc_reasoning_sft \ | ||
| input_files=/workspace/outputs/training_data_split.jsonl \ | ||
| output_path=/workspace/outputs/sft-data-incidence.jsonl \ | ||
| prompt_config=/workspace/data/prompts/prompt_incident.yaml \ | ||
| tokenizer=Qwen/Qwen3-32B \ | ||
| filters.remove_contaminated=false \ | ||
| add_unlabeled=true \ | ||
| filters.trim_solutions=false' |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's search for any references to "solution-generation"
echo "=== Searching for 'solution-generation' references ==="
rg "solution-generation" --type=md --type=yaml --type=py --type=txt -nRepository: NVIDIA-NeMo/Skills
Length of output: 774
🏁 Script executed:
#!/bin/bash
# Let's examine the file to see the context around the specified lines
echo "=== Content around lines 232-246 (SFT data prep) ==="
sed -n '225,250p' docs/tutorials/posts/noc-reasoning-agent.md
echo ""
echo "=== Content around lines 296-311 (test data prep) ==="
sed -n '290,315p' docs/tutorials/posts/noc-reasoning-agent.md
Repository: NVIDIA-NeMo/Skills
Length of output: 2205
🏁 Script executed:
#!/bin/bash
# Let's search for all --run_after references and experiment definitions
echo "=== All --run_after references in the file ==="
rg "run_after" docs/tutorials/posts/noc-reasoning-agent.md -n
echo ""
echo "=== All --expname references to understand defined experiments ==="
rg "expname" docs/tutorials/posts/noc-reasoning-agent.md -n -A 1 -B 1Repository: NVIDIA-NeMo/Skills
Length of output: 1003
Remove or update --run_after=solution-generation to reference a valid experiment.
The --run_after=solution-generation flag at lines 235 and 300 references an experiment that is never defined in this tutorial. The experiments actually defined are: prepare-noc, gpt-oss-sdg-with-python, prep-sft-data-inci, and training. Remove this flag or update it to reference a valid prior step (e.g., --run_after=gpt-oss-sdg-with-python or --run_after=prepare-noc), otherwise the workflow will fail or behave unexpectedly when users follow the steps sequentially.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 232 - 246, The flag
--run_after=solution-generation references a non-existent experiment and should
be removed or replaced with a valid experiment name; update the ns run_cmd
invocation that contains --run_after=solution-generation to either delete the
--run_after argument or set it to an existing experiment such as
--run_after=gpt-oss-sdg-with-python or --run_after=prepare-noc so the pipeline
will correctly sequence (look for the ns run_cmd block and the --run_after flag
in the noc-reasoning-agent tutorial).
| --log_dir=/workspace/prepare-sft-data-indicence \ | ||
| --expname=prep-sft-data-inci \ |
Typo: "indicence" → "incidence" in log directory paths.
Both the training and test data preparation commands use indicence instead of incidence:
- Line 233:
--log_dir=/workspace/prepare-sft-data-indicence - Line 298:
--log_dir=/workspace/prepare-test-data-indicence
While these are just log directory names, users following the tutorial will create oddly named directories.
Proposed fix
- --log_dir=/workspace/prepare-sft-data-indicence \
+ --log_dir=/workspace/prepare-sft-data-incidence \
- --log_dir=/workspace/prepare-test-data-indicence \
+ --log_dir=/workspace/prepare-test-data-incidence \
Also applies to: 298-299
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 233 - 234, Fix the
typo in the log directory flags used in the preparation commands: replace
occurrences of "--log_dir=/workspace/prepare-sft-data-indicence" and
"--log_dir=/workspace/prepare-test-data-indicence" with
"--log_dir=/workspace/prepare-sft-data-incidence" and
"--log_dir=/workspace/prepare-test-data-incidence" respectively so both commands
use "incidence" instead of "indicence".
| def coerce_to_str(value) -> str: | ||
| if pd.isna(value): | ||
| return "" | ||
| # Ensure any non-string is converted to string safely | ||
| try: | ||
| return str(value) | ||
| except Exception: | ||
| return "" |
🛠️ Refactor suggestion | 🟠 Major
Broad except Exception on str() is unnecessary.
str() will not raise for any standard Python type. This silently swallows real bugs (e.g., a custom __str__ that raises). Per the coding guideline: "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving." Ruff also flags this (BLE001).
Proposed fix
def coerce_to_str(value) -> str:
if pd.isna(value):
return ""
- # Ensure any non-string is converted to string safely
- try:
- return str(value)
- except Exception:
- return ""
+ return str(value)
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 65-65: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`
around lines 59 - 66, The broad try/except in coerce_to_str is unnecessary and
hides real errors; remove the try/except and directly return str(value) after
the pd.isna check so any genuine exceptions (e.g., a broken __str__) will
propagate; update the function coerce_to_str to simply check pd.isna(value) then
return str(value) without catching Exception.
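A quick check of the simplified helper on the cases that matter:
```python
import pandas as pd

def coerce_to_str(value) -> str:
    if pd.isna(value):
        return ""
    return str(value)

print(repr(coerce_to_str(float("nan"))))  # ''
print(repr(coerce_to_str(None)))          # ''
print(repr(coerce_to_str(3.5)))           # '3.5'
```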
| if args.required_incidents: | ||
| with open(args.required_incidents, "r") as f: | ||
| required_incidents = f.read().splitlines() | ||
| print(required_incidents) | ||
| id_col = INCIDENT_ID_COLUMN if INCIDENT_ID_COLUMN in df.columns else "number" | ||
| df = df[df[id_col].astype(str).isin(required_incidents)] |
Debug print(required_incidents) left in — will dump a potentially large list to stdout.
Line 145 prints the entire contents of the required incidents file to stdout. This looks like a leftover debug statement and will pollute output for large files.
Proposed fix
if args.required_incidents:
with open(args.required_incidents, "r") as f:
required_incidents = f.read().splitlines()
- print(required_incidents)
id_col = INCIDENT_ID_COLUMN if INCIDENT_ID_COLUMN in df.columns else "number"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`
around lines 142 - 147, Remove the stray debug print that dumps
required_incidents to stdout: in the block that reads args.required_incidents
(the code that opens args.required_incidents, populates required_incidents, and
then filters df by id_col), delete the print(required_incidents) call (or
replace it with a debug-level logger call if you need it for troubleshooting) so
large lists are not printed to stdout before filtering.
- Move __init__.py creation and PYTHONPATH export into a dedicated "Configure the environment" subsection in the tutorial - Remove per-command PYTHONPATH=$PWD prefixes (now set once at top) - Add all 40 synthetic fault_category values to ALLOWED_PROBLEM_CODES in filter_rows.py and create_input_jsonl_from_incidents.py - Add matching reasoning process mappings in reasoning_processes.py Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Python 3.3+ namespace packages make __init__.py unnecessary. The try/except fallback in create_input_jsonl handles relative imports gracefully. Only PYTHONPATH is needed for all script imports. Signed-off-by: Amparo Canaveras <acanaveras@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Actionable comments posted: 5
🧹 Nitpick comments (1)
recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py (1)
6-78: ALLOWED_PROBLEM_CODES is duplicated across files — extract to a shared module to eliminate drift risk.
The exact combined list of legacy and synthetic problem codes is maintained independently in both filter_rows.py and create_input_jsonl_from_incidents.py (lines 17–89). The latter file explicitly notes "Keep in sync with filter_rows.py" (line 16), indicating manual synchronization is expected. Define once in a shared constants module (e.g., recipes/noc-reasoning-agent/scripts/constants.py) and import from both locations to eliminate the drift risk.
🤖 Prompt for AI Agents
filter_rows.pyandcreate_input_jsonl_from_incidents.py(lines 17–89). The latter file explicitly notes "Keep in sync with filter_rows.py" (line 16), indicating manual synchronization is expected. Define once in a shared constants module (e.g.,recipes/noc-reasoning-agent/scripts/constants.py) and import from both locations to eliminate the drift risk.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py` around lines 6 - 78, The combined allowed-problem-codes list is duplicated (ALLOWED_PROBLEM_CODES, ALLOWED_PROBLEM_CODES_LEGACY, ALLOWED_PROBLEM_CODES_SYNTHETIC); extract these constants into a single shared module (e.g., a new constants.py) and export ALLOWED_PROBLEM_CODES_LEGACY, ALLOWED_PROBLEM_CODES_SYNTHETIC and ALLOWED_PROBLEM_CODES from there, then replace the local lists in both filter_rows.py and create_input_jsonl_from_incidents.py with imports from the new module so both files reference the single source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py`:
- Around line 106-111: Replace the broad except Exception handlers in the
CSV-reading blocks with except UnicodeDecodeError to only fallback to a
different encoding on decoding errors; update the try/except in filter_auto (and
the three other filter functions referenced) so they attempt
pd.read_csv(input_csv, encoding="latin1") and on UnicodeDecodeError retry
pd.read_csv(input_csv), letting FileNotFoundError/PermissionError and other
unexpected exceptions propagate. Ensure you modify the except clause in the
functions named filter_auto and the three other filter_… functions shown in the
diff.
- Around line 193-202: The current dispatch on args.filter_type silently ignores
unsupported values; add an else branch after the existing if/elif chain that
raises a ValueError (or argparse.ArgumentTypeError) with a clear message
including the invalid args.filter_type and the allowed options ("auto",
"soft_solve", "problem_codes", "close_codes"); update the code around the
filter_auto, filter_soft_solve, filter_problem_codes, and filter_close_codes
dispatch so unsupported values immediately raise this error instead of exiting
silently.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Around line 123-125: The conditional currently uses truthiness for
args.examples_by_problem_code which treats 0 as falsy and skips extraction;
update the check in the block that calls extract_examples_by_problem_code so it
tests explicitly for None (e.g., if args.examples_by_problem_code is not None)
to allow zero to be a valid value, then call
extract_examples_by_problem_code(df, args.examples_by_problem_code) and print
the extracted count into df as before.
In `@recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py`:
- Around line 200-251: Replace permissive .get() lookups with direct key access
for entries that must exist: change uses of
PROBLEM_CODE_REASONING_PROCESS.get("...","") inside DEFAULT_REASONING_PROCESS
and the SYNTHETIC_FAULT_CATEGORY_REASONING mapping to
PROBLEM_CODE_REASONING_PROCESS["..."] so typos raise KeyError and fail fast;
update every mapping entry (e.g., "power_ac_failure_recovery",
"ran_cell_site_down_recovery", "compute_vm_failure_recovery",
"transport_routing_flap_resolution", "signaling_routing_failure_recovery", etc.)
and DEFAULT_REASONING_PROCESS to use bracket indexing against
PROBLEM_CODE_REASONING_PROCESS.
- Around line 165-197: Several legacy SOP entries are empty (e.g., "VM is in not
ready state", "Link Down", "N2 Link Down" and others) causing
SYNTHETIC_FAULT_CATEGORY_REASONING to produce blank
problem_code_reasoning_process; update the mapping logic so that any synthetic
category that currently maps to one of those empty keys falls back to
DEFAULT_REASONING_PROCESS (the "Service-off" SOP) instead of the empty string,
and/or populate those specific keys with the DEFAULT_REASONING_PROCESS content;
modify references in SYNTHETIC_FAULT_CATEGORY_REASONING and ensure lookups for
"VM is in not ready state", "Link Down", "N2 Link Down" (and the other empty
keys listed) return DEFAULT_REASONING_PROCESS when the entry is empty or
missing.
---
Duplicate comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 114-131: The docs show running src/filtering/filter_rows.py with
--input_csv and --output_csv pointing to the same path (data/filtered_file.csv),
which risks corrupting the input on write failures; update the tutorial commands
to use distinct intermediate output filenames (e.g.,
data/filtered_file_step1.csv, data/filtered_file_step2.csv) or explicitly
document the risk and recommend using a temp file and atomic rename, referencing
the script name filter_rows.py and the flags --input_csv and --output_csv so
readers know where to change the invocation.
- Around line 243-257: Fix the typo in the ns run_cmd --log_dir value (change
prepare-sft-data-indicence to prepare-sft-data-incidence) and update the
--run_after argument in the ns run_cmd invocation (remove the nonexistent
solution-generation reference or replace it with a valid prior step such as
gpt-oss-sdg-with-python); apply the same two fixes to the duplicate test-data
preparation ns run_cmd block that repeats the same --log_dir and --run_after
mistakes.
- Around line 365-378: Add concrete expected results and any dataset-specific
eval notes to the "Compare Results" section: after the two example commands that
call src/evaluation/problem_code_evaluation.py on outputs/agent_responses.jsonl
and outputs/baseline_agent_responses.jsonl, include approximate accuracy numbers
(e.g., "baseline: ~X% close-code accuracy, fine-tuned: ~Y%") and any special
evaluation flags or dataset prep used when generating those outputs (e.g.,
tokenization, synonym mappings, non-standard inference args). Reference the
script name problem_code_evaluation.py and the output files
outputs/agent_responses.jsonl and outputs/baseline_agent_responses.jsonl so
readers can validate their runs against the documented expectations.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`:
- Around line 16-78: ALLOWED_PROBLEM_CODES is duplicated here and in
filter_rows.py; extract that list into a single shared constant (e.g., export
ALLOWED_PROBLEM_CODES from a new module like shared/problem_codes or
utils/problem_codes) and update both create_input_jsonl_from_incidents.py and
filter_rows.py to import the constant instead of defining it locally; ensure the
new module exports the same name (ALLOWED_PROBLEM_CODES) and update any
references to the variable to use the imported symbol to keep behavior
unchanged.
- Around line 81-88: The coerce_to_str function currently wraps str(value) in a
try/except which silently swallows errors; remove the unnecessary try/except in
coerce_to_str so the function simply returns "" when pd.isna(value) is True and
otherwise returns str(value), allowing any unexpected exceptions from str() to
propagate; update the function body for coerce_to_str accordingly and keep the
function signature unchanged.
- Around line 164-169: Remove the leftover debug print by deleting the
print(required_incidents) call in the args.required_incidents handling block;
locate the block that reads required_incidents (variable required_incidents) and
filters df using id_col (uses INCIDENT_ID_COLUMN or "number") and simply keep
the file read + df = df[df[id_col].astype(str).isin(required_incidents)] logic
without printing the list to stdout.
---
Nitpick comments:
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py`:
- Around line 6-78: The combined allowed-problem-codes list is duplicated
(ALLOWED_PROBLEM_CODES, ALLOWED_PROBLEM_CODES_LEGACY,
ALLOWED_PROBLEM_CODES_SYNTHETIC); extract these constants into a single shared
module (e.g., a new constants.py) and export ALLOWED_PROBLEM_CODES_LEGACY,
ALLOWED_PROBLEM_CODES_SYNTHETIC and ALLOWED_PROBLEM_CODES from there, then
replace the local lists in both filter_rows.py and
create_input_jsonl_from_incidents.py with imports from the new module so both
files reference the single source of truth.
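A minimal sketch of the shared module from the suggestion above (file name from that suggestion; only a couple of codes shown, the rest would follow the same pattern):
```python
# recipes/noc-reasoning-agent/scripts/constants.py (proposed)
ALLOWED_PROBLEM_CODES_LEGACY = [
    "Site Not Scrolling",
    "Sleepy Cell",
    # ... remaining legacy fault categories
]
ALLOWED_PROBLEM_CODES_SYNTHETIC = [
    "power_ac_failure_recovery",
    "ran_cell_site_down_recovery",
    # ... remaining synthetic fault categories
]
ALLOWED_PROBLEM_CODES = ALLOWED_PROBLEM_CODES_LEGACY + ALLOWED_PROBLEM_CODES_SYNTHETIC
```
Both scripts would then just do from constants import ALLOWED_PROBLEM_CODES (the exact import path depends on how PYTHONPATH is configured in the tutorial).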
| def filter_auto(input_csv, output_csv): | ||
| """Filter incident data and save results to a new CSV.""" | ||
| try: | ||
| df = pd.read_csv(input_csv, encoding="latin1") | ||
| except Exception: | ||
| df = pd.read_csv(input_csv) |
🛠️ Refactor suggestion | 🟠 Major
Catch UnicodeDecodeError instead of bare Exception.
All four filter functions use except Exception when only an encoding mismatch is expected. This masks real errors like FileNotFoundError or PermissionError, violating the guideline to let unexpected exceptions fail with clear errors. The same pattern repeats at lines 131-134, 145-148, and 161-164.
Proposed fix (apply to all four functions)
- try:
- df = pd.read_csv(input_csv, encoding="latin1")
- except Exception:
- df = pd.read_csv(input_csv)
+ try:
+ df = pd.read_csv(input_csv, encoding="latin1")
+ except UnicodeDecodeError:
+ df = pd.read_csv(input_csv)
As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 110-110: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py` around lines
106 - 111, Replace the broad except Exception handlers in the CSV-reading blocks
with except UnicodeDecodeError to only fallback to a different encoding on
decoding errors; update the try/except in filter_auto (and the three other
filter functions referenced) so they attempt pd.read_csv(input_csv,
encoding="latin1") and on UnicodeDecodeError retry pd.read_csv(input_csv),
letting FileNotFoundError/PermissionError and other unexpected exceptions
propagate. Ensure you modify the except clause in the functions named
filter_auto and the three other filter_… functions shown in the diff.
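If the fallback is needed in all four filter functions, a small helper keeps the exception handling in one place. A minimal sketch (the helper name read_csv_with_fallback is illustrative, not existing code in the script):

```python
import pandas as pd


def read_csv_with_fallback(path, primary_encoding="latin1"):
    """Try the primary encoding; retry with pandas' default only on decode errors.

    FileNotFoundError, PermissionError, and parser errors still propagate,
    so unexpected failures surface instead of being silently retried.
    """
    try:
        return pd.read_csv(path, encoding=primary_encoding)
    except UnicodeDecodeError:
        return pd.read_csv(path)
```

Each filter function would then reduce its read to df = read_csv_with_fallback(input_csv).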
| # Run the filtering process | ||
| if args.filter_type == "auto": | ||
| filter_auto(args.input_csv, args.output_csv) | ||
| elif args.filter_type == "soft_solve": | ||
| filter_soft_solve(args.input_csv, args.output_csv) | ||
| elif args.filter_type == "problem_codes": | ||
| filter_problem_codes(args.input_csv, args.output_csv) | ||
| elif args.filter_type == "close_codes": | ||
| filter_close_codes(args.input_csv, args.output_csv) | ||
Invalid --filter_type values are silently ignored.
If a user passes an unsupported filter_type, the script exits successfully having done nothing. Add an else branch that raises an error.
Proposed fix
elif args.filter_type == "close_codes":
filter_close_codes(args.input_csv, args.output_csv)
+ else:
+ parser.error(f"Unknown filter_type: {args.filter_type!r}. Choose from: auto, soft_solve, problem_codes, close_codes")As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py` around lines
193 - 202, The current dispatch on args.filter_type silently ignores unsupported
values; add an else branch after the existing if/elif chain that raises a
ValueError (or argparse.ArgumentTypeError) with a clear message including the
invalid args.filter_type and the allowed options ("auto", "soft_solve",
"problem_codes", "close_codes"); update the code around the filter_auto,
filter_soft_solve, filter_problem_codes, and filter_close_codes dispatch so
unsupported values immediately raise this error instead of exiting silently.
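Alternatively, declaring the allowed values on the argument itself rejects bad input at parse time, before the dispatch runs at all. A sketch, assuming --filter_type is currently defined with a plain add_argument call:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--filter_type",
    required=True,
    choices=["auto", "soft_solve", "problem_codes", "close_codes"],
    help="Which filter to apply.",
)
# argparse now exits with status 2 and a clear message on any other value,
# e.g.: error: argument --filter_type: invalid choice: 'typo'
args = parser.parse_args()
```

With choices in place the else branch becomes unreachable, but keeping it costs nothing and guards against the two lists drifting apart.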
| if args.examples_by_problem_code: | ||
| df = extract_examples_by_problem_code(df, args.examples_by_problem_code) | ||
| print(f"Extracted {len(df)} examples from the dataset.") |
The check if args.examples_by_problem_code: evaluates to false when the value is 0.
If a user passes --examples_by_problem_code 0, the truthiness check skips extraction silently instead of producing an empty dataset. Use is not None to distinguish "not provided" from "zero."
Proposed fix
- if args.examples_by_problem_code:
+ if args.examples_by_problem_code is not None:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py`
around lines 123 - 125, The conditional currently uses truthiness for
args.examples_by_problem_code which treats 0 as falsy and skips extraction;
update the check in the block that calls extract_examples_by_problem_code so it
tests explicitly for None (e.g., if args.examples_by_problem_code is not None)
to allow zero to be a valid value, then call
extract_examples_by_problem_code(df, args.examples_by_problem_code), assign the
result back into df, and print the extracted count as before.
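The difference only shows up for valid-but-falsy values; a small sketch demonstrating the two checks (the argument name mirrors the script, the rest is illustrative):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--examples_by_problem_code", type=int, default=None)
args = parser.parse_args(["--examples_by_problem_code", "0"])

if args.examples_by_problem_code:
    print("truthiness check fires")  # skipped: 0 is falsy
if args.examples_by_problem_code is not None:
    print("is-not-None check fires")  # runs: 0 was explicitly provided
```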
| "Site Not Scrolling": """ | ||
|
|
||
| """, | ||
| "Sleepy Cell": """ | ||
|
|
||
| """, | ||
| "VM is in not ready state": """ | ||
|
|
||
| """, | ||
| "Prach 0": """ | ||
|
|
||
| """, | ||
| "N2 Link Down": """ | ||
|
|
||
| """, | ||
| "ueconmgr pod restarted": """ | ||
|
|
||
| """, | ||
| "CSR Not Reachable": """ | ||
|
|
||
| """, | ||
| "Circuit Down": """ | ||
|
|
||
| """, | ||
| "Link Down": """ | ||
|
|
||
| """, | ||
| "GPS Sync": """ | ||
|
|
||
| """, | ||
| "MTA Alert": """ | ||
|
|
||
| """, |
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure and locate the relevant dictionaries
cd recipes/noc-reasoning-agent/scripts/utils/ 2>/dev/null || cd .
find . -name "reasoning_processes.py" -type f
Repository: NVIDIA-NeMo/Skills
Length of output: 86
🏁 Script executed:
# Read the file to find PROBLEM_CODE_REASONING_PROCESS and SYNTHETIC_FAULT_CATEGORY_REASONING
cat -n recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py | head -250
Repository: NVIDIA-NeMo/Skills
Length of output: 21411
🏁 Script executed:
# Get total line count to understand file size
wc -l recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py
Repository: NVIDIA-NeMo/Skills
Length of output: 130
🏁 Script executed:
# Continue reading the file to see remaining SYNTHETIC_FAULT_CATEGORY_REASONING mappings
tail -n +250 recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py | head -30
Repository: NVIDIA-NeMo/Skills
Length of output: 823
Populate empty legacy SOPs or map synthetic categories to a non-empty fallback.
Lines 165–197 contain 11 empty reasoning processes (only whitespace): "Site Not Scrolling", "Sleepy Cell", "VM is in not ready state", "Prach 0", "N2 Link Down", "ueconmgr pod restarted", "CSR Not Reachable", "Circuit Down", "Link Down", "GPS Sync", and "MTA Alert".
SYNTHETIC_FAULT_CATEGORY_REASONING actively maps synthetic categories to these empty entries:
- 8 compute categories → "VM is in not ready state" (empty)
- 4 transport categories → "Link Down" (empty)
- 4 signaling categories → "N2 Link Down" (empty)
This produces blank problem_code_reasoning_process values in generated training data, degrading quality. Use DEFAULT_REASONING_PROCESS (which maps to the populated "Service-off" SOP) as a fallback for these mappings.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py` around
lines 165 - 197, Several legacy SOP entries are empty (e.g., "VM is in not ready
state", "Link Down", "N2 Link Down" and others) causing
SYNTHETIC_FAULT_CATEGORY_REASONING to produce blank
problem_code_reasoning_process; update the mapping logic so that any synthetic
category that currently maps to one of those empty keys falls back to
DEFAULT_REASONING_PROCESS (the "Service-off" SOP) instead of the empty string,
and/or populate those specific keys with the DEFAULT_REASONING_PROCESS content;
modify references in SYNTHETIC_FAULT_CATEGORY_REASONING and ensure lookups for
"VM is in not ready state", "Link Down", "N2 Link Down" (and the other empty
keys listed) return DEFAULT_REASONING_PROCESS when the entry is empty or
missing.
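One way to apply the fallback without editing every mapping entry is a lookup helper that substitutes the default for missing or whitespace-only SOPs. A minimal sketch, assuming the module keeps its current names (the helper itself is hypothetical):

```python
def resolve_reasoning_process(fault_category):
    """Return a non-empty reasoning process for a synthetic fault category.

    Falls back to DEFAULT_REASONING_PROCESS (the Service-off SOP) when the
    category is unmapped or its mapped SOP is empty/whitespace-only. Using
    .get() is appropriate here because unknown categories are expected input.
    """
    sop = SYNTHETIC_FAULT_CATEGORY_REASONING.get(fault_category, "")
    return sop if sop.strip() else DEFAULT_REASONING_PROCESS
```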
| # Default reasoning process when fault_category (synthetic workflow ID) is not in PROBLEM_CODE_REASONING_PROCESS. | ||
| DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS.get("Service-off", "") | ||
|
|
||
| # Synthetic data uses fault_category = workflow IDs (e.g. power_ac_failure_recovery). Map them to a reasoning process. | ||
| SYNTHETIC_FAULT_CATEGORY_REASONING = { | ||
| # Power / Environment → mapped to Service-off SOP (power-related outage workflow) | ||
| "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_battery_temperature_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "env_cabinet_intrusion_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| # RAN → mapped to closest legacy SOP | ||
| "ran_software_upgrade_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""), | ||
| "ran_cell_site_down_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Node Down", ""), | ||
| "ran_interference_mitigation": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_speed_complaint_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_voice_quality_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_sector_outage_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Disabled Cells", ""), | ||
| "ran_prb_availability_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_cell_overshooting_correction": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_rru_communication_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Offline / Unreachable", ""), | ||
| "ran_dropped_calls_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_parameter_correction": PROBLEM_CODE_REASONING_PROCESS.get("Disabled Cells", ""), | ||
| "ran_antenna_tilt_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_vswr_alarm_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_handover_failure_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_backhaul_degradation_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Offline / Unreachable", ""), | ||
| "ran_cell_congestion_management": PROBLEM_CODE_REASONING_PROCESS.get("Degraded Prach", ""), | ||
| "ran_device_issue_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Offline / Unreachable", ""), | ||
| # Compute → mapped to VM/container legacy SOPs | ||
| "compute_vm_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_container_crash_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_orchestrator_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_image_pull_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_k8s_node_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_storage_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_cnf_pod_recovery": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| "compute_resource_exhaustion_resolution": PROBLEM_CODE_REASONING_PROCESS.get("VM is in not ready state", ""), | ||
| # Transport → mapped to Link Down / Circuit Down SOPs | ||
| "transport_routing_flap_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| "transport_microwave_degradation_response": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| "transport_interface_errors_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| "transport_packet_loss_resolution": PROBLEM_CODE_REASONING_PROCESS.get("Link Down", ""), | ||
| # Signaling → mapped to N2 Link Down SOP | ||
| "signaling_routing_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), | ||
| "signaling_delay_resolution": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), | ||
| "signaling_s1_n2_recovery": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), | ||
| "signaling_sip_registration_recovery": PROBLEM_CODE_REASONING_PROCESS.get("N2 Link Down", ""), |
🛠️ Refactor suggestion | 🟠 Major
Use direct dictionary access instead of .get() for keys that must exist.
Every key referenced in SYNTHETIC_FAULT_CATEGORY_REASONING (and line 201) is defined in PROBLEM_CODE_REASONING_PROCESS above. Using .get(key, "") silently returns an empty string if a key is mistyped, hiding bugs. Use PROBLEM_CODE_REASONING_PROCESS[key] to fail fast on typos.
Proposed fix (showing first few lines; apply to all)
-DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS.get("Service-off", "")
+DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS["Service-off"]
SYNTHETIC_FAULT_CATEGORY_REASONING = {
- "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
- "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
+ "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+ "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
# ... apply to all remaining entries
As per coding guidelines, "Do not use .get() for accessing dictionary keys if the code expects them to be present; use direct dictionary access dict[key] instead to allow proper error handling and fail fast with clear errors."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py` around
lines 200 - 251, Replace permissive .get() lookups with direct key access for
entries that must exist: change uses of
PROBLEM_CODE_REASONING_PROCESS.get("...","") inside DEFAULT_REASONING_PROCESS
and the SYNTHETIC_FAULT_CATEGORY_REASONING mapping to
PROBLEM_CODE_REASONING_PROCESS["..."] so typos raise KeyError and fail fast;
update every mapping entry (e.g., "power_ac_failure_recovery",
"ran_cell_site_down_recovery", "compute_vm_failure_recovery",
"transport_routing_flap_resolution", "signaling_routing_failure_recovery", etc.)
and DEFAULT_REASONING_PROCESS to use bracket indexing against
PROBLEM_CODE_REASONING_PROCESS.
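Complementing the bracket-indexing fix, a one-time guard after the mapping definitions would also catch the empty SOPs flagged in the previous comment. A sketch of such a check (a suggestion, not existing code):

```python
# Fail fast at import time: every synthetic category must resolve to a
# non-empty reasoning process. Direct indexing above already guarantees
# the legacy keys exist; this catches whitespace-only SOP bodies.
_empty = sorted(
    key for key, sop in SYNTHETIC_FAULT_CATEGORY_REASONING.items() if not sop.strip()
)
if _empty:
    raise ValueError(f"Synthetic categories mapped to empty SOPs: {_empty}")
```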
Actionable comments posted: 2
🧹 Nitpick comments (1)
docs/tutorials/posts/noc-reasoning-agent.md (1)
2-2: Stale date in front matter — should reflect 2026.
The post date 2025-09-05 predates the PR creation date (2026-02-13) by ~5 months. This will affect any ordering or display logic on the docs site.
✏️ Proposed fix
-date: 2025-09-05
+date: 2026-02-13
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/posts/noc-reasoning-agent.md` at line 2, Update the YAML front-matter `date:` field (currently "2025-09-05") in the document's top matter to a 2026 date consistent with the PR (e.g., "2026-02-13") so the post ordering/display logic reflects the correct publication time.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 344-348: The --weights_dir argument passed to the
create_agent_with_tools_batch.py invocation uses a relative path which will fail
if the current working directory isn't the project workspace; update the CLI in
noc-reasoning-agent.md so the --weights_dir points to the absolute model
location used when saving (or construct it from the workspace variable), e.g.
replace the relative "training/qwen3-32b-improved-hf" value with the absolute
path that was used when saving the model or a resolved ${WORKSPACE}/training/...
equivalent so create_agent_with_tools_batch.py can find the weights regardless
of caller CWD.
- Around line 154-167: The two run commands reuse the same --expname (currently
"--expname=gpt-oss-sdg-with-python"), which will cause experiment-tracking
collisions between Phase 1 and Phase 2; update each command's --expname so they
are unique (e.g., "--expname=gpt-oss-sdg-with-python-phase1" and
"--expname=gpt-oss-sdg-with-python-phase2") wherever the flag appears in the
tutorial commands to ensure separate job names, log dirs, and resume behavior.
---
Duplicate comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Around line 240-241: Fix the typo "indicence" to "incidence" in the command
flag string shown in the diff; update the value passed to --log_dir (and any
matching occurrence, e.g., the --expname value "prep-sft-data-inci" if intended)
so the directory/experiment name uses "incidence" instead of "indicence" to
ensure consistent naming.
- Around line 239-253: The --run_after flag in the ns run_cmd invocation
references a non-existent experiment name "solution-generation"; update the
--run_after value (or remove the flag) to match an actual experiment ID defined
in your workflow/namespace so the job dependency is valid—locate the ns run_cmd
call in the docs (the block using ns run_cmd with --expname=prep-sft-data-inci)
and replace "solution-generation" with the correct experiment name or omit
--run_after if no dependency is needed.
- Around line 304-319: Fix the two issues in the ns run_cmd invocation: change
the --run_after value from "solution-generation" to the correct job name
"solution_generation" (underscore) so the pipeline will trigger at the right
step, and correct the typo in --log_dir and --expname where "indicence" appears
by renaming it to "incidence" (e.g.,
--log_dir=/workspace/prepare-test-data-incidence and
--expname=prep-test-data-inci -> ensure any matching identifiers use
"incidence"). Use the flags --run_after and --log_dir in the provided command to
locate and update the values.
- Around line 295-374: The Evaluation section is missing expected results and
dataset/benchmark details; update docs/tutorials/posts/noc-reasoning-agent.md
(under the "Evaluation" or after "Compare Results") to include: a short
"Expected results" subsection with example accuracy numbers for the fine-tuned
model (training/qwen3-32b-improved-hf) and baseline (Qwen/Qwen3-32B), exact
commands already shown (e.g. create_agent_with_tools_batch.py and
problem_code_evaluation.py) plus the metric names reported (close-code accuracy)
and example output lines, and a "Dataset notes" subsection describing any
special preparation flags used (prepare_data flags like
filters.remove_contaminated, add_unlabeled, filters.trim_solutions) and
non-standard inference args so readers can reproduce the benchmark.
- Around line 103-126: The commands in the tutorial overwrite the same CSV file
in-place which risks data loss; update the examples that call
src/filtering/filter_rows.py to avoid using the same path for --input_csv and
--output_csv (use distinct intermediate filenames like data/filtered_step1.csv,
data/filtered_step2.csv, etc.), or modify the script (functionality in
src/filtering/filter_rows.py that parses --input_csv and --output_csv) to
support a safe --inplace flag that writes to a temp file and atomically replaces
the original; ensure all example invocations reference unique output filenames
or document the new --inplace behavior.
---
Nitpick comments:
In `@docs/tutorials/posts/noc-reasoning-agent.md`:
- Line 2: Update the YAML front-matter `date:` field (currently "2025-09-05") in
the document's top matter to a 2026 date consistent with the PR (e.g.,
"2026-02-13") so the post ordering/display logic reflects the correct
publication time.
| --expname=gpt-oss-sdg-with-python \ | ||
| --model=openai/gpt-oss-120b \ | ||
| --server_gpus=8 \ | ||
| --output_dir=/workspace/outputs/sdg/ \ | ||
| --input_file=/workspace/outputs/input_incident.jsonl \ | ||
| ++prompt_config=/workspace/data/prompts/formatting_prompt.yaml \ | ||
| ++inference.tokens_to_generate=8192 \ | ||
| ++inference.temperature=0.6 \ | ||
| ++chat_template_kwargs.reasoning_effort=medium \ | ||
| ++inference.endpoint_type=text \ | ||
| ++code_execution=false \ | ||
| ++server.enable_soft_fail=True \ | ||
| ++skip_filled=False --rerun_done | ||
| ``` |
Both generation phases share the same --expname, which can cause experiment-tracking conflicts.
Phase 1 (Line 154) and Phase 2 (Line 191) both use --expname=gpt-oss-sdg-with-python. Nemo-Skills uses the experiment name for job naming, log directories, and resume logic. Reusing the same name for a separate phase risks collisions in the scheduler, ambiguous log attribution, or the second run being silently treated as a resume of the first. Use distinct names for each phase.
✏️ Proposed fix
- --expname=gpt-oss-sdg-with-python \ # Phase 1 (line 154)
+ --expname=gpt-oss-sdg-phase1 \
- --expname=gpt-oss-sdg-with-python \  # Phase 2 (line 191)
+ --expname=gpt-oss-sdg-phase2 \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 154 - 167, The two
run commands reuse the same --expname (currently
"--expname=gpt-oss-sdg-with-python"), which will cause experiment-tracking
collisions between Phase 1 and Phase 2; update each command's --expname so they
are unique (e.g., "--expname=gpt-oss-sdg-with-python-phase1" and
"--expname=gpt-oss-sdg-with-python-phase2") wherever the flag appears in the
tutorial commands to ensure separate job names, log dirs, and resume behavior.
| python src/create_agent_with_tools_batch.py \ | ||
| --input outputs/final_agent_input.jsonl \ | ||
| --output outputs/agent_responses.jsonl \ | ||
| --weights_dir training/qwen3-32b-improved-hf | ||
| ``` |
--weights_dir uses a relative path — will break if CWD ≠ /workspace.
Line 347 passes training/qwen3-32b-improved-hf as a relative path, but the model is saved to the absolute path /workspace/training/qwen3-32b-improved-hf (line 272). The create_agent_with_tools_batch.py script is invoked as a plain python command (not via ns run_cmd), so the effective CWD is the caller's shell, not /workspace. If the user is not cd'd into /workspace, the path will not resolve.
✏️ Proposed fix
python src/create_agent_with_tools_batch.py \
--input outputs/final_agent_input.jsonl \
--output outputs/agent_responses.jsonl \
- --weights_dir training/qwen3-32b-improved-hf
+ --weights_dir /workspace/training/qwen3-32b-improved-hf
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| python src/create_agent_with_tools_batch.py \ | |
| --input outputs/final_agent_input.jsonl \ | |
| --output outputs/agent_responses.jsonl \ | |
| --weights_dir training/qwen3-32b-improved-hf | |
| ``` | |
| python src/create_agent_with_tools_batch.py \ | |
| --input outputs/final_agent_input.jsonl \ | |
| --output outputs/agent_responses.jsonl \ | |
| --weights_dir /workspace/training/qwen3-32b-improved-hf |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/tutorials/posts/noc-reasoning-agent.md` around lines 344 - 348, The
--weights_dir argument passed to the create_agent_with_tools_batch.py invocation
uses a relative path which will fail if the current working directory isn't the
project workspace; update the CLI in noc-reasoning-agent.md so the --weights_dir
points to the absolute model location used when saving (or construct it from the
workspace variable), e.g. replace the relative "training/qwen3-32b-improved-hf"
value with the absolute path that was used when saving the model or a resolved
${WORKSPACE}/training/... equivalent so create_agent_with_tools_batch.py can
find the weights regardless of caller CWD.
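If hard-coding the absolute path in the tutorial is undesirable, the script could instead resolve a relative --weights_dir against a workspace root. A sketch of that resolution logic (the WORKSPACE environment variable and its /workspace default are assumptions, not existing behavior):

```python
import os
from pathlib import Path


def resolve_weights_dir(weights_dir):
    """Resolve a possibly relative --weights_dir against the workspace root."""
    path = Path(weights_dir)
    if not path.is_absolute():
        # Assumed convention: fall back to /workspace when WORKSPACE is unset.
        path = Path(os.environ.get("WORKSPACE", "/workspace")) / path
    if not path.is_dir():
        raise FileNotFoundError(f"weights_dir not found: {path}")
    return path
```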
@gwarmstrong This is the Telco Team. Per our conversation two weeks ago, we would like to add this tutorial to create a reasoning model for Telco NOC incidents. Thanks