30 changes: 29 additions & 1 deletion README.md
@@ -2,7 +2,7 @@

NeMo Gym is a library for building reinforcement learning (RL) training environments for large language models (LLMs). It provides infrastructure to develop environments, scale rollout collection, and integrate seamlessly with your preferred training framework.

NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/), NVIDIA’s GPU-accelerated platform for building and training generative AI models.
NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/). For details on how NeMo Gym fits within the NeMo ecosystem and integrates with other RL frameworks, see the [Ecosystem](https://docs.nvidia.com/nemo/gym/latest/about/ecosystem.html) documentation.


## 🏆 Why NeMo Gym?
@@ -16,6 +16,34 @@ NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/n
> [!IMPORTANT]
> NeMo Gym is currently in early development. You should expect evolving APIs, incomplete documentation, and occasional bugs. We welcome contributions and feedback - for any changes, please open an issue first to kick off discussion!

## 🔗 Ecosystem Integrations

NeMo Gym is designed to integrate seamlessly with the broader RL ecosystem. For detailed documentation, see the [Ecosystem](https://docs.nvidia.com/nemo/gym/latest/about/ecosystem.html) page.

### Training Frameworks

NeMo Gym provides rollout collection infrastructure that integrates with various RL training frameworks:

| Framework | Status | Description |
|-----------|--------|-------------|
| [NeMo RL](https://github.com/NVIDIA-NeMo/RL) | ✅ Supported | NVIDIA's scalable post-training library with GRPO, DPO, SFT |
| [Unsloth](https://github.com/unslothai/unsloth) | ✅ Supported | Fast fine-tuning framework with memory optimization |
| [TRL](https://github.com/huggingface/trl) | ✅ Supported | Hugging Face Transformer Reinforcement Learning |
| [veRL](https://github.com/volcengine/verl) | 🔜 In Progress | Volcano Engine's scalable RL framework |

### Environment Libraries

NeMo Gym integrates with environment libraries for diverse training scenarios. All integrations are compatible with the Gymnasium API (the successor to OpenAI Gym).

| Library | Status | Description |
|---------|--------|-------------|
| [reasoning-gym](https://github.com/open-thought/reasoning-gym) | ✅ Supported | Procedurally generated reasoning tasks (see `reasoning_gym` resource server) |
| [Aviary](https://github.com/Future-House/aviary) | ✅ Supported | Multi-environment framework for tool-using agents (see `aviary` resource server) |

Contributor comment:

It may be worth saying it's OpenAI Gymnasium compatible (but we should double-check that statement).

Prime Intellect: the library is named verifiers, or Environments Hub, not Prime Intellect itself, imo.

BrowserGym: not sure if anyone is working on this? @cwing-nvidia?

Contributor reply:

I renamed it to Verifiers in the latest commits. The BrowserGym integration is being worked on by Marc Cuevas.
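
One way to double-check the Gymnasium-compatibility statement discussed in this thread is a small shape check on `reset()` and `step()`. The sketch below is only an illustration: `check_gymnasium_style`, `ToyEnv`, and `LegacyEnv` are hypothetical names, not part of NeMo Gym or Gymnasium, and the check covers only the return shapes that the Gymnasium `Env` interface documents (`reset()` returning `(observation, info)` and `step()` returning a 5-tuple).

```python
from typing import Any


def check_gymnasium_style(env: Any, action: Any) -> list[str]:
    """Return a list of problems; an empty list means the basic shapes match."""
    problems: list[str] = []

    # Gymnasium's reset() returns (observation, info), where info is a dict.
    reset_result = env.reset(seed=0)
    if not (isinstance(reset_result, tuple) and len(reset_result) == 2):
        problems.append("reset() should return (observation, info)")
    elif not isinstance(reset_result[1], dict):
        problems.append("reset() info should be a dict")

    # Gymnasium's step() returns
    # (observation, reward, terminated, truncated, info).
    step_result = env.step(action)
    if not (isinstance(step_result, tuple) and len(step_result) == 5):
        problems.append("step() should return a 5-tuple")
    else:
        _, reward, terminated, truncated, info = step_result
        if not isinstance(reward, (int, float)):
            problems.append("reward should be a number")
        if not isinstance(terminated, bool) or not isinstance(truncated, bool):
            problems.append("terminated and truncated should be bools")
        if not isinstance(info, dict):
            problems.append("step() info should be a dict")
    return problems


class ToyEnv:
    """Hypothetical stand-in environment (not a NeMo Gym or Gymnasium class)."""

    def reset(self, seed=None, options=None):
        return "start", {}

    def step(self, action):
        return "next", 1.0, True, False, {}


class LegacyEnv:
    """Hypothetical environment using the older 4-tuple step() convention."""

    def reset(self, seed=None, options=None):
        return "start"

    def step(self, action):
        return "next", 1.0, True, {}


print(check_gymnasium_style(ToyEnv(), action=0))
print(check_gymnasium_style(LegacyEnv(), action=0))
```

A check like this only covers return shapes. Gymnasium also ships its own, more thorough environment checker, which would likely be the better tool if the integrations expose real `gymnasium.Env` objects.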

| [Verifiers](https://github.com/PrimeIntellect-ai/verifiers) | 🔜 In Progress | Environment hub for coding, data & ML, science & reasoning, tool use and more |
| [BrowserGym](https://github.com/ServiceNow/BrowserGym) | 🔜 In Progress | Web browsing and automation environments |

> 💡 **Want to add an integration?** We welcome contributions! See our [Contributing Guide](https://docs.nvidia.com/nemo/gym/latest/contribute/index.html) or [open an issue](https://github.com/NVIDIA-NeMo/Gym/issues) to discuss.

## 📋 Requirements

### Hardware Requirements
60 changes: 42 additions & 18 deletions docs/about/ecosystem.md
@@ -1,27 +1,51 @@
(about-ecosystem)=
# NeMo Gym in the NVIDIA Ecosystem
# Agentic RL Ecosystem

NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/), NVIDIA's GPU-accelerated platform for building and training generative AI models.
We're building NeMo Gym to integrate with a broad set of RL training frameworks and environment libraries.

:::{tip}
For details on NeMo Gym capabilities, refer to the
{ref}`Overview <about-overview>`.
:::
We would love your contribution! Open a PR to add an integration, or [file an issue](https://github.com/NVIDIA-NeMo/Gym/issues/new/choose) to share what would be valuable for you.

---

## NeMo Gym Within the NeMo Framework
## Training Framework Integrations

NeMo Framework includes modular libraries for end-to-end model training:
- **{doc}`NeMo RL <../tutorials/nemo-rl-grpo/index>`** - GRPO training to improve multi-step tool calling on the Workplace Assistant environment
- **[OpenRLHF](https://github.com/OpenRLHF/OpenRLHF/blob/main/examples/python/agent_func_nemogym_executor.py)** - example agent executor for RL training
- **{doc}`TRL <../training-tutorials/trl>`** - GRPO training on Workplace Assistant and Reasoning Gym environments
- **{doc}`Unsloth <../tutorials/unsloth-training>`** - GRPO training on the Sudoku environment, with a [multi-environment notebook](https://github.com/unslothai/notebooks/blob/main/nb/NeMo-Gym-Multi-Environment.ipynb) covering instruction following and Reasoning Gym
- **NeMo Customizer** - *(In progress)*
- **VeRL** - *(In progress)*

* **[NeMo Megatron-Bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge)**: Pretraining and fine-tuning with Megatron-Core
* **[NeMo AutoModel](https://github.com/NVIDIA-NeMo/Automodel)**: PyTorch native training for Hugging Face models
* **[NeMo RL](https://github.com/NVIDIA-NeMo/RL)**: Scalable and efficient post-training
* **[NeMo Gym](https://github.com/NVIDIA-NeMo/Gym)**: RL environment infrastructure and rollout collection (this project)
* **[NeMo Curator](https://github.com/NVIDIA-NeMo/Curator)**: Data preprocessing and curation
* **[NeMo Data Designer](https://github.com/NVIDIA-NeMo/DataDesigner)**: Synthetic data generation from scratch or seed datasets
* **[NeMo Evaluator](https://github.com/NVIDIA-NeMo/Evaluator)**: Model evaluation and benchmarking
* **[NeMo Guardrails](https://github.com/NVIDIA-NeMo/Guardrails)**: Programmable safety guardrails
* And more...
To integrate another training framework, see the {doc}`Training Framework Integration Guide <../contribute/rl-framework-integration/index>`.

**NeMo Gym's Role**: Within this ecosystem, Gym focuses on standardizing scalable rollout collection for RL training. It provides unified interfaces to heterogeneous RL environments and curated resource servers with verification logic. This makes it practical to generate large-scale, high-quality training data for NeMo RL and other training frameworks.
---

## Environment Library Integrations

NeMo Gym integrates with external environment libraries and benchmarks. See the [README](https://github.com/NVIDIA-NeMo/Gym?tab=readme-ov-file#table-2-resource-servers-for-training) for the full list—here are a few examples:

- **[Reasoning Gym](https://github.com/NVIDIA-NeMo/Gym/tree/main/resources_servers/reasoning_gym)** - reasoning environments spanning computation, cognition, logic and more
- **[Aviary](https://github.com/NVIDIA-NeMo/Gym/tree/main/resources_servers/aviary)** - environments spanning math, knowledge, biological sequences, scientific literature search, and protein stability
- **[Verifiers](https://github.com/PrimeIntellect-ai/verifiers)** - *(In progress)* - environment hub for coding, data & ML, science & reasoning, tool use and more
- **[BrowserGym](https://github.com/ServiceNow/BrowserGym)** - *(In progress)* - environments for web task automation


---

## Related NeMo Libraries

NeMo Gym is a component of NVIDIA NeMo, a GPU-accelerated platform for building and training generative AI models.

Depending on your workflow, you may also find these libraries useful:

| Library | Purpose |
|---------|---------|
| [NeMo Megatron-Bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge) | Pretraining and fine-tuning with Megatron-Core |
| [NeMo AutoModel](https://github.com/NVIDIA-NeMo/Automodel) | PyTorch native training for Hugging Face models |
| [NeMo RL](https://github.com/NVIDIA-NeMo/RL) | Scalable post-training with GRPO, DPO, and SFT |
| **[NeMo Gym](https://github.com/NVIDIA-NeMo/Gym)** | RL environment infrastructure and rollout collection *(this project)* |
| [NeMo Curator](https://github.com/NVIDIA-NeMo/Curator) | Data preprocessing and curation |
| [NeMo Data Designer](https://github.com/NVIDIA-NeMo/DataDesigner) | Synthetic data generation |
| [NeMo Evaluator](https://github.com/NVIDIA-NeMo/Evaluator) | Model evaluation and benchmarking |
| [NeMo Guardrails](https://github.com/NVIDIA-NeMo/Guardrails) | Programmable safety guardrails |
| [NeMo Skills](https://github.com/NVIDIA-NeMo/NeMo-Skills) | Skills framework for code generation and reasoning |
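
For anyone reviewing this diff, the per-file change counts can be cross-checked against the hunk headers: in a unified diff, the new hunk length minus the old hunk length, summed over a file's hunks, should equal additions minus deletions. The sketch below is a minimal illustration of that arithmetic using the headers and counts shown in this PR. The helper names `net_line_change` and `is_consistent` are hypothetical, and the check assumes the hunks listed are the only ones for each file.

```python
import re

# Matches unified-diff hunk headers such as "@@ -16,6 +16,34 @@".
HUNK_RE = re.compile(r"@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def net_line_change(hunk_header: str) -> int:
    """Return new_length - old_length for one unified-diff hunk header."""
    match = HUNK_RE.search(hunk_header)
    if match is None:
        raise ValueError(f"not a hunk header: {hunk_header!r}")
    # An omitted length means 1 in the unified diff format.
    old_len = int(match.group(2) or 1)
    new_len = int(match.group(4) or 1)
    return new_len - old_len


def is_consistent(hunk_headers: list[str], additions: int, deletions: int) -> bool:
    """Check that the hunks' net growth equals additions minus deletions."""
    net = sum(net_line_change(header) for header in hunk_headers)
    return net == additions - deletions


# README.md: 29 additions & 1 deletion across two hunks.
readme_ok = is_consistent(
    ["@@ -2,7 +2,7 @@", "@@ -16,6 +16,34 @@"], additions=29, deletions=1
)
# docs/about/ecosystem.md: 42 additions & 18 deletions in one hunk.
docs_ok = is_consistent(["@@ -1,27 +1,51 @@"], additions=42, deletions=18)

print(readme_ok, docs_ok)  # True True
```

This is a necessary condition only: matching totals do not prove that every added or removed line was attributed correctly.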