Add a skill for evolving skills or adding new skills based on interaction#952

Open
rgsl888prabhu wants to merge 11 commits intoNVIDIA:mainfrom
rgsl888prabhu:add_skill_evolution

@rgsl888prabhu
Collaborator

Description

The new skill proposes changes to existing skills, or entirely new skills, so that generic patterns learned during an interaction are captured to help the next developer.

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@copy-pr-bot

copy-pr-bot bot commented Mar 11, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rgsl888prabhu rgsl888prabhu self-assigned this Mar 11, 2026
@rgsl888prabhu rgsl888prabhu added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels Mar 11, 2026
@anandhkb anandhkb added this to the 26.04 milestone Mar 12, 2026
@rgsl888prabhu rgsl888prabhu marked this pull request as ready for review March 12, 2026 21:23
@rgsl888prabhu rgsl888prabhu requested review from a team as code owners March 12, 2026 21:23
@rgsl888prabhu rgsl888prabhu requested a review from tmckayus March 12, 2026 21:23
@coderabbitai

coderabbitai bot commented Mar 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 54d0eb1f-029d-417a-a51d-0fe5f00b0000

📥 Commits

Reviewing files that changed from the base of the PR and between e53e5c4 and 6c19aeb.

📒 Files selected for processing (3)
  • skills/cuopt-user-rules/SKILL.md
  • skills/lp-milp-formulation/SKILL.md
  • skills/skill-evolution/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • skills/skill-evolution/SKILL.md
  • skills/cuopt-user-rules/SKILL.md
  • skills/lp-milp-formulation/SKILL.md

📝 Walkthrough

Adds a Skill Evolution framework: registers a new plugin and rule, adds comprehensive skill-evolution documentation and workflow, updates agent docs and validation docs, and inserts evolution hooks into existing skills and LP/MILP guidance.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Plugin & marketplace: .claude-plugin/marketplace.json | Adds new skill-evolution plugin entry pointing to ./skills/skill-evolution and enabled as always-active. Review plugin metadata and path. |
| Cursor rule: .cursor/rules/skill-evolution.mdc | Adds a new rule that triggers skill evolution after non-trivial problem solves; marked alwaysApply. Check rule metadata and invocation conditions. |
| Agent docs / redirect: .claude/CLAUDE.md, agents/AGENTS.md | Updates referenced agents path in .claude/CLAUDE.md and adds agents/AGENTS.md as a redirect to the top-level AGENTS.md. Verify relative links resolve. |
| Top-level agent docs: AGENTS.md | Expands agent documentation with skill evolution workflows, post-correction hook, and new rules entry. Review procedural guidance and mandatory hook wording. |
| CI docs: ci/README.md | Adds "Skill validation" documentation describing three existing skill tests and invocation instructions. Validate references to test scripts and locations. |
| Skill assets, new framework: skills/skill-evolution/SKILL.md | Adds full Skill Evolution framework: triggers, three-phase process (learning/inference/offline reflection), placement rules, proposal formats, provenance tagging, security rules, distillation checklist, and CI validation steps. Review governance, tagging markers, and CI requirements. |
| Skill assets, updates / hooks: skills/cuopt-user-rules/SKILL.md, skills/lp-milp-formulation/SKILL.md | Injects a mandatory post-correction check into cuOPT user rules and adds a Goal programming (preemptive/lexicographic) section to LP/MILP formulation skill. Check consistency with existing skill conventions and integrality guidance. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a new skill for skill evolution to propose changes and new skills based on interactions. |
| Description check | ✅ Passed | The description relates to the changeset by explaining the purpose of the new skill to propose changes and capture generic patterns, though it is brief and lacks technical detail. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/skill-evolution/SKILL.md`:
- Line 84: Add language identifiers to the two fenced code blocks in SKILL.md
that currently lack them: locate the fence under the "Skill update proposal:"
block and the fence under the "Skill insight (unscored):" block and change the
opening triple backticks from ``` to ```text so both code blocks are labeled
(for example, use the "text" tag) to satisfy markdownlint MD040.
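
The fix described above can also be applied mechanically. The following is a minimal sketch, not part of the PR: the helper name `label_bare_fences` is hypothetical, and it assumes fences are exactly three backticks (no tilde fences or longer backtick runs). It labels only opening fences that have no info string and leaves closing fences and already-labeled blocks untouched.

```python
FENCE = "`" * 3  # built programmatically so this example nests safely in Markdown


def label_bare_fences(markdown: str, default_lang: str = "text") -> str:
    """Add default_lang to every opening fence that has no language identifier.

    Assumption: fences are exactly three backticks on their own line.
    """
    out = []
    in_block = False
    for line in markdown.splitlines(keepends=True):
        stripped = line.strip()
        if in_block:
            # Inside a block, only a bare fence closes it.
            if stripped == FENCE:
                in_block = False
        elif stripped.startswith(FENCE):
            # Opening fence: label it only when nothing follows the backticks.
            if stripped == FENCE:
                line = line.replace(FENCE, FENCE + default_lang, 1)
            in_block = True
        out.append(line)
    return "".join(out)
```

Running this over the file and then re-running markdownlint would confirm whether MD040 is satisfied; the sketch itself does not invoke the linter.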

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ddc77017-3ae3-4df1-827c-9bbec706998f

📥 Commits

Reviewing files that changed from the base of the PR and between d531ad1 and e53e5c4.

📒 Files selected for processing (10)
  • .claude-plugin/marketplace.json
  • .claude/CLAUDE.md
  • .cursor/rules/skill-evolution.mdc
  • AGENTS.md
  • agents/AGENTS.md
  • agents/AGENTS.md
  • ci/README.md
  • skills/cuopt-user-rules/SKILL.md
  • skills/lp-milp-formulation/SKILL.md
  • skills/skill-evolution/SKILL.md


The user may approve, decline, or defer for offline reflection.

## Phase 3: Offline reflection
Contributor

@hlinsen hlinsen Mar 12, 2026

A few questions:

  1. For training, the agent should start each new prompt with a clean context.
  2. The agent has seen the ground truths and would jump to inference without clearing its context. I think we need to state this explicitly and check that it works; otherwise it may not be possible to do both in the same active session.
  3. I think we need an evaluate.py that sets up the training pipeline. I don't think the agent can go over a dataset and work out what is train and what is test with the skill alone. You can take a look at: https://github.com/karpathy/autoresearch/blob/master/prepare.py
  4. Should we provide a parser.py? The data for outbound skill refinement is in either JSON or CSV: https://github.com/NVIDIA/cuopt-examples/tree/main/cuopt-agent/cuopt_agent/data/max_supply_what_ifs/eval
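
To make points 3 and 4 concrete, here is a minimal sketch of what such helpers might look like. It is not code from this PR or from the linked repositories: the function names `parse_records` and `split_train_test`, the record key used for splitting, and the 20% test fraction are all assumptions, and the actual schema of the eval data is not described in this thread. The idea is that the split is decided by a script, outside the agent's context, and is stable across runs because it depends only on a hash of each record's key.

```python
import csv
import hashlib
import io
import json


def parse_records(text: str, fmt: str) -> list:
    """Parse JSON (an array of objects) or CSV (with a header row) into dicts."""
    if fmt == "json":
        data = json.loads(text)
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of records")
        return data
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {fmt}")


def split_train_test(records, key: str, test_fraction: float = 0.2):
    """Assign each record to train or test from a hash of record[key].

    The assignment is deterministic, so re-running the pipeline never
    moves an example from the test set into the training set.
    """
    train, test = [], []
    for record in records:
        digest = hashlib.sha256(str(record[key]).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
        (test if bucket < test_fraction else train).append(record)
    return train, test
```

A driver script could then feed only the training records to the skill-evolution session and hold the test records back for a separate, fresh-context evaluation run.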

Collaborator Author

I am currently working on the eval part, where I will try to evolve the skill based on an industry or dataset and then run again with the updated skills. I will also run this against the optmath dataset and other datasets to see if it improves performance.

Collaborator Author

@hlinsen I agree, the evaluation and evolution scripts need to be separate. There is an ongoing discussion on how to tackle this; maybe we can work on it as a follow-up.
