@lbliii lbliii commented Jan 27, 2026

No description provided.

@lbliii lbliii self-assigned this Jan 27, 2026
@lbliii lbliii changed the title from "Llane/integrations content" to "ecosystem pg verbiage update" Jan 27, 2026
Signed-off-by: Lawrence Lane <[email protected]>
Signed-off-by: Lawrence Lane <[email protected]>
@lbliii lbliii force-pushed the llane/integrations-content branch from 6fa4689 to 71f0756 Compare January 30, 2026 21:30
Signed-off-by: Lawrence Lane <[email protected]>
@lbliii lbliii requested review from bxyu-nvidia, cwing-nvidia and heatherlxd and removed request for bxyu-nvidia February 2, 2026 19:17
@lbliii lbliii requested a review from cmunley1 February 6, 2026 20:15
|-----------|--------|-------------|
| [NeMo RL](https://github.com/NVIDIA-NeMo/RL) | ✅ Supported | NVIDIA's scalable post-training library with GRPO, DPO, SFT |
| [Unsloth](https://github.com/unslothai/unsloth) | ✅ Supported | Fast fine-tuning framework with memory optimization |
| [veRL](https://github.com/volcengine/verl) | 🔜 In Progress | Volcano Engine's scalable RL framework |

I don't think veRL is in progress, but maybe someone is working on it?

I also think we can change TRL to say supported now. We are just fixing a minor last-minute change and working on additional docs, e.g. sample reward/step or a potential blog post.

It's been in progress for a while but is paused. We need to resume this effort and support it in the next release.

| Library | Status | Description |
|---------|--------|-------------|
| [reasoning-gym](https://github.com/open-thought/reasoning-gym) | ✅ Supported | Procedurally generated reasoning tasks (see `reasoning_gym` resource server) |
| [Aviary](https://github.com/Future-House/aviary) | ✅ Supported | Multi-environment framework for tool-using agents (see `aviary` resource server) |

It may be worth saying it's OpenAI Gymnasium compatible (but we should double-check that statement).

Prime Intellect: the library is named verifiers, or Environments Hub, not Prime Intellect itself, IMO.

BrowserGym: not sure if anyone is working on this? @cwing-nvidia?

I renamed it to verifiers in the latest commits. The BrowserGym integration is being worked on by Marc Cuevas.

| Name | Demonstrates | Config | README |
| ------------------ | ------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- |
| Multi Step | Multi-step tool calling | <a href='resources_servers/example_multi_step/configs/example_multi_step.yaml'>example_multi_step.yaml</a> | <a href='resources_servers/example_multi_step/README.md'>README</a> |
| Reasoning Gym | External environment library integration | <a href='resources_servers/reasoning_gym/configs/reasoning_gym.yaml'>reasoning_gym.yaml</a> | <a href='resources_servers/reasoning_gym/README.md'>README</a> |

I thought these don't go in the README because they don't have an HF dataset link. I thought this README table was built automatically based on that somehow.

We want all environments to be discoverable from the README.
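A minimal sketch of how a README table could be generated so that every environment is listed, whether or not it has a Hugging Face dataset link. The `ServerEntry` record and the `render_readme_table` function are hypothetical names for illustration and are not taken from the NeMo Gym repository; the actual table-generation script may work differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerEntry:
    """Hypothetical record for one resource server (not the repo's real schema)."""
    name: str
    demonstrates: str
    config_path: str
    readme_path: str
    hf_dataset_url: Optional[str] = None  # may be absent for example servers


def render_readme_table(entries: list[ServerEntry]) -> str:
    """Render every entry as a Markdown row; a missing dataset link becomes '-'."""
    header = "| Name | Demonstrates | Dataset | Config | README |"
    divider = "| --- | --- | --- | --- | --- |"
    rows = []
    for e in sorted(entries, key=lambda e: e.name.lower()):
        # Entries are never filtered out; only the Dataset cell changes.
        dataset = f"[dataset]({e.hf_dataset_url})" if e.hf_dataset_url else "-"
        config_name = e.config_path.rsplit("/", 1)[-1]
        rows.append(
            f"| {e.name} | {e.demonstrates} | {dataset} "
            f"| [{config_name}]({e.config_path}) | [README]({e.readme_path}) |"
        )
    return "\n".join([header, divider, *rows])


entries = [
    ServerEntry(
        "Reasoning Gym",
        "External environment library integration",
        "resources_servers/reasoning_gym/configs/reasoning_gym.yaml",
        "resources_servers/reasoning_gym/README.md",
    ),
    ServerEntry(
        "Multi Step",
        "Multi-step tool calling",
        "resources_servers/example_multi_step/configs/example_multi_step.yaml",
        "resources_servers/example_multi_step/README.md",
    ),
]
table = render_readme_table(entries)
```

The design choice here is that a missing dataset link changes one cell instead of dropping the row, which keeps servers without a dataset discoverable.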

| Resource Server | Domain | Dataset | Description | Value | Config | Train | Validation | License |
| -------------------------- | --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- | ----- | ---------- | --------------------------------------------------------- |
| Aviary (GSM8K) | agent | <a href='https://arxiv.org/abs/2110.14168'>GSM8K</a> | Grade school math with calculator tool via Aviary integration | Improve math reasoning with tool use | <a href='resources_servers/aviary/configs/gsm8k_aviary.yaml'>config</a> | ✓ | - | MIT |
| Aviary (HotPotQA) | agent | <a href='https://aclanthology.org/D18-1259/'>HotPotQA</a> | Multi-hop question answering via Aviary integration | Improve multi-hop reasoning capabilities | <a href='resources_servers/aviary/configs/hotpotqa_aviary.yaml'>config</a> | ✓ | - | Apache 2.0 |

Are we starting to enumerate multiple datasets / env implementations in the README now too? Then we should do the same for math, for example? @bxyu-nvidia

We should be consistent

```
}
```

Any framework that can read this format can use NeMo Gym rollouts—no native integration required. The following frameworks have documented patterns.

I think it's more complex than this. Don't we already have a training framework integration guide with various requirements, e.g. async OpenAI-compatible, retokenization correction, etc.?

Agree. I removed this in later commits and linked to the guide.

Simplify intro to emphasize goal of supporting broad set of RL training
frameworks and environment libraries. Add contribution invite with link
to issue template. Remove unnecessary tip box for new users.

Signed-off-by: Chris Wing <[email protected]>
- Move model recipes (Nemotron Nano, Super) to new docs/model-recipes/ section
- Simplify training framework integrations list in ecosystem page
- Rename "Unsloth Training" to "Unsloth" for consistency
- Update toctree to add Model Recipes section after Training Tutorials

Signed-off-by: Chris Wing <[email protected]>
- Remove verl.md and nemo-customizer.md pages (not ready yet)
- Reorganize training-tutorials/index.md with cleaner card layout
- Add OpenRLHF card linking to external integration
- Mark VeRL and NeMo Customizer as "Coming soon" with in-progress badges
- Remove card descriptions for consistency, add SFT & DPO section
- Reorder cards to match ecosystem page

Signed-off-by: Chris Wing <[email protected]>
- Rename page to "Agentic RL Ecosystem"
- Simplify training framework list with specific tutorial descriptions
- Condense environment library integrations with README link
- Reframe NeMo libraries section as "related tools for your workflow"
- Remove redundant sections (community, building custom environments)
- Update RL framework integration guide to link to training tutorials index

Signed-off-by: Chris Wing <[email protected]>
- Move nemo-rl-grpo/, unsloth-training.md, offline-training-w-rollouts.md
  from tutorials/ to training-tutorials/
- Update all cross-references across docs

Signed-off-by: Chris Wing <[email protected]>
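The cross-reference update described in the commit above could be done with a small script along these lines. This is a sketch, not the tooling actually used in this PR: the `rewrite_links` and `rewrite_tree` helpers and the regular expression are assumptions, and only the three moved targets named in the commit message are rewritten.

```python
import re
from pathlib import Path

# Targets moved from tutorials/ to training-tutorials/ per the commit message.
MOVED = ("nemo-rl-grpo", "unsloth-training", "offline-training-w-rollouts")

# Match "tutorials/<moved target>" only when "tutorials" is a whole path
# segment, so an already-updated "training-tutorials/..." link is left alone.
_PATTERN = re.compile(
    r"(?<![\w-])tutorials/(" + "|".join(map(re.escape, MOVED)) + r")(?![\w-])"
)


def rewrite_links(text: str) -> str:
    """Point links to the moved pages at training-tutorials/ (hypothetical helper)."""
    return _PATTERN.sub(r"training-tutorials/\1", text)


def rewrite_tree(docs_root: Path) -> int:
    """Rewrite every Markdown file under docs_root; return how many files changed."""
    changed = 0
    for path in docs_root.rglob("*.md"):
        original = path.read_text(encoding="utf-8")
        updated = rewrite_links(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

For example, `rewrite_links("../tutorials/nemo-rl-grpo/setup")` yields `../training-tutorials/nemo-rl-grpo/setup`, and running it a second time changes nothing. The sketch does not shorten links inside pages that themselves moved into `training-tutorials/`, where a sibling-relative path would also resolve.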
- **{doc}`NeMo RL <../training-tutorials/nemo-rl-grpo/index>`** - GRPO training to improve multi-step tool calling on the Workplace Assistant environment
- **[OpenRLHF](https://github.com/OpenRLHF/OpenRLHF/blob/main/examples/python/agent_func_nemogym_executor.py)** - example agent executor for RL training
- **{doc}`TRL <../training-tutorials/trl>`** - GRPO training on Workplace Assistant and Reasoning Gym environments
- **{doc}`Unsloth <../training-tutorials/unsloth-training>`** - GRPO training on Sudoku environment

There is also currently a multi-environment notebook that does instruction following and Reasoning Gym: https://github.com/unslothai/notebooks/blob/main/nb/NeMo-Gym-Multi-Environment.ipynb if you want to mention it. Fine with me either way.


Depending on your workflow, you may also find these libraries useful:

| Library | Purpose |

@cmunley1 cmunley1 Feb 11, 2026

I would add NeMo Skills, as there is some work ongoing between the two.

**Common Requirements**:

- NeMo RL v0.4.0+ installed ([setup instructions](../tutorials/nemo-rl-grpo/setup))
- NeMo RL v0.4.0+ installed ([setup instructions](../training-tutorials/nemo-rl-grpo/setup))

v5 is released, right? Should we update to that?

| DPO | ✅ Stable | 🔜 Planned |
| ORPO | ✅ Stable | 🔜 Planned |
| GRPO | ❌ Not in TRL | ✅ Use {doc}`NeMo RL <../tutorials/nemo-rl-grpo/index>` |
| GRPO | ❌ Not in TRL | ✅ Use {doc}`NeMo RL <nemo-rl-grpo/index>` |

Actually, the only algorithm that will be supported in TRL / NeMo Gym is GRPO. I don't think PPO, DPO, etc. will work as of now.

Hmm, I guess I see that this is the old docs stub and the other TRL docs PR will overwrite this, never mind.

3 participants