ecosystem pg verbiage update #612
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,7 +2,7 @@ | |
|
|
||
| NeMo Gym is a library for building reinforcement learning (RL) training environments for large language models (LLMs). It provides infrastructure to develop environments, scale rollout collection, and integrate seamlessly with your preferred training framework. | ||
|
|
||
| NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/), NVIDIA’s GPU-accelerated platform for building and training generative AI models. | ||
| NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/). For details on how NeMo Gym fits within the NeMo ecosystem and integrates with other RL frameworks, see the [Ecosystem](https://docs.nvidia.com/nemo/gym/latest/about/ecosystem.html) documentation. | ||
|
|
||
|
|
||
| ## 🏆 Why NeMo Gym? | ||
|
|
@@ -16,6 +16,34 @@ NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/n | |
| > [!IMPORTANT] | ||
| > NeMo Gym is currently in early development. You should expect evolving APIs, incomplete documentation, and occasional bugs. We welcome contributions and feedback - for any changes, please open an issue first to kick off discussion! | ||
|
|
||
| ## 🔗 Ecosystem Integrations | ||
|
|
||
| NeMo Gym is designed to integrate seamlessly with the broader RL ecosystem. For detailed documentation, see the [Ecosystem](https://docs.nvidia.com/nemo/gym/latest/about/ecosystem.html) page. | ||
|
|
||
| ### Training Frameworks | ||
|
|
||
| NeMo Gym provides rollout collection infrastructure that integrates with various RL training frameworks: | ||
|
|
||
| | Framework | Status | Description | | ||
| |-----------|--------|-------------| | ||
| | [NeMo RL](https://github.com/NVIDIA-NeMo/RL) | ✅ Supported | NVIDIA's scalable post-training library with GRPO, DPO, SFT | | ||
| | [Unsloth](https://github.com/unslothai/unsloth) | ✅ Supported | Fast fine-tuning framework with memory optimization | | ||
| | [TRL](https://github.com/huggingface/trl) | ✅ Supported | Hugging Face Transformer Reinforcement Learning | | ||
| | [veRL](https://github.com/volcengine/verl) | 🔜 In Progress | Volcano Engine's scalable RL framework | | ||
|
|
||
| ### Environment Libraries | ||
|
|
||
| NeMo Gym integrates with environment libraries for diverse training scenarios. All integrations are compatible with OpenAI Gymnasium standards. | ||
|
|
||
| | Library | Status | Description | | ||
| |---------|--------|-------------| | ||
| | [reasoning-gym](https://github.com/open-thought/reasoning-gym) | ✅ Supported | Procedurally generated reasoning tasks (see `reasoning_gym` resource server) | | ||
| | [Aviary](https://github.com/Future-House/aviary) | ✅ Supported | Multi-environment framework for tool-using agents (see `aviary` resource server) | | ||
|
Contributor
It may be worth saying it's OpenAI Gymnasium compatible (but we should double-check that statement). Prime Intellect: the library is named verifiers, or Environments Hub, not Prime Intellect itself, imo. BrowserGym: not sure if anyone is working on this? @cwing-nvidia?
Contributor
I renamed it to Verifiers in the latest commits. The BrowserGym integration is being worked on by Marc Cuevas. |
||
| | [Verifiers](https://github.com/PrimeIntellect-ai/verifiers) | 🔜 In Progress | Environment hub for coding, data & ML, science & reasoning, tool use and more | | ||
| | [BrowserGym](https://github.com/ServiceNow/BrowserGym) | 🔜 In Progress | Web browsing and automation environments | | ||
|
|
||
| > 💡 **Want to add an integration?** We welcome contributions! See our [Contributing Guide](https://docs.nvidia.com/nemo/gym/latest/contribute/index.html) or [open an issue](https://github.com/NVIDIA-NeMo/Gym/issues) to discuss. | ||
|
|
||
| ## 📋 Requirements | ||
|
|
||
| ### Hardware Requirements | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,27 +1,51 @@ | ||
| (about-ecosystem)= | ||
| # NeMo Gym in the NVIDIA Ecosystem | ||
| # Agentic RL Ecosystem | ||
|
|
||
| NeMo Gym is a component of the [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/), NVIDIA's GPU-accelerated platform for building and training generative AI models. | ||
| We're building NeMo Gym to integrate with a broad set of RL training frameworks and environment libraries. | ||
|
|
||
| :::{tip} | ||
| For details on NeMo Gym capabilities, refer to the | ||
| {ref}`Overview <about-overview>`. | ||
| ::: | ||
| We would love your contribution! Open a PR to add an integration, or [file an issue](https://github.com/NVIDIA-NeMo/Gym/issues/new/choose) to share what would be valuable for you. | ||
|
|
||
| --- | ||
|
|
||
| ## NeMo Gym Within the NeMo Framework | ||
| ## Training Framework Integrations | ||
|
|
||
| NeMo Framework includes modular libraries for end-to-end model training: | ||
| - **{doc}`NeMo RL <../tutorials/nemo-rl-grpo/index>`** - GRPO training to improve multi-step tool calling on the Workplace Assistant environment | ||
| - **[OpenRLHF](https://github.com/OpenRLHF/OpenRLHF/blob/main/examples/python/agent_func_nemogym_executor.py)** - example agent executor for RL training | ||
| - **{doc}`TRL <../training-tutorials/trl>`** - GRPO training on Workplace Assistant and Reasoning Gym environments | ||
| - **{doc}`Unsloth <../tutorials/unsloth-training>`** - GRPO training on Sudoku environment, with [multi-environment notebook](https://github.com/unslothai/notebooks/blob/main/nb/NeMo-Gym-Multi-Environment.ipynb) for instruction following and reasoning gym | ||
| - **NeMo Customizer** - *(In progress)* | ||
| - **VeRL** - *(In progress)* | ||
|
|
||
| * **[NeMo Megatron-Bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge)**: Pretraining and fine-tuning with Megatron-Core | ||
| * **[NeMo AutoModel](https://github.com/NVIDIA-NeMo/Automodel)**: PyTorch native training for Hugging Face models | ||
| * **[NeMo RL](https://github.com/NVIDIA-NeMo/RL)**: Scalable and efficient post-training | ||
| * **[NeMo Gym](https://github.com/NVIDIA-NeMo/Gym)**: RL environment infrastructure and rollout collection (this project) | ||
| * **[NeMo Curator](https://github.com/NVIDIA-NeMo/Curator)**: Data preprocessing and curation | ||
| * **[NeMo Data Designer](https://github.com/NVIDIA-NeMo/DataDesigner)**: Synthetic data generation from scratch or seed datasets | ||
| * **[NeMo Evaluator](https://github.com/NVIDIA-NeMo/Evaluator)**: Model evaluation and benchmarking | ||
| * **[NeMo Guardrails](https://github.com/NVIDIA-NeMo/Guardrails)**: Programmable safety guardrails | ||
| * And more... | ||
| To integrate another training framework, see the {doc}`Training Framework Integration Guide <../contribute/rl-framework-integration/index>`. | ||
|
|
||
| **NeMo Gym's Role**: Within this ecosystem, Gym focuses on standardizing scalable rollout collection for RL training. It provides unified interfaces to heterogeneous RL environments and curated resource servers with verification logic. This makes it practical to generate large-scale, high-quality training data for NeMo RL and other training frameworks. | ||
| --- | ||
|
|
||
| ## Environment Library Integrations | ||
|
|
||
| NeMo Gym integrates with external environment libraries and benchmarks. See the [README](https://github.com/NVIDIA-NeMo/Gym?tab=readme-ov-file#table-2-resource-servers-for-training) for the full list—here are a few examples: | ||
|
|
||
| - **[Reasoning Gym](https://github.com/NVIDIA-NeMo/Gym/tree/main/resources_servers/reasoning_gym)** - reasoning environments spanning computation, cognition, logic and more | ||
| - **[Aviary](https://github.com/NVIDIA-NeMo/Gym/tree/main/resources_servers/aviary)** - environments spanning math, knowledge, biological sequences, scientific literature search, and protein stability | ||
| - **[Verifiers](https://github.com/PrimeIntellect-ai/verifiers)** - *(In progress)* - environment hub for coding, data & ML, science & reasoning, tool use and more | ||
| - **[BrowserGym](https://github.com/ServiceNow/BrowserGym)** - *(In progress)* - environments for web task automation | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| ## Related NeMo Libraries | ||
|
|
||
| NeMo Gym is a component of NVIDIA NeMo, a GPU-accelerated platform for building and training generative AI models. | ||
|
|
||
| Depending on your workflow, you may also find these libraries useful: | ||
|
|
||
| | Library | Purpose | | ||
lbliii marked this conversation as resolved.
|
||
| |---------|---------| | ||
| | [NeMo Megatron-Bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge) | Pretraining and fine-tuning with Megatron-Core | | ||
| | [NeMo AutoModel](https://github.com/NVIDIA-NeMo/Automodel) | PyTorch native training for Hugging Face models | | ||
| | [NeMo RL](https://github.com/NVIDIA-NeMo/RL) | Scalable post-training with GRPO, DPO, and SFT | | ||
| | **[NeMo Gym](https://github.com/NVIDIA-NeMo/Gym)** | RL environment infrastructure and rollout collection *(this project)* | | ||
| | [NeMo Curator](https://github.com/NVIDIA-NeMo/Curator) | Data preprocessing and curation | | ||
| | [NeMo Data Designer](https://github.com/NVIDIA-NeMo/DataDesigner) | Synthetic data generation | | ||
| | [NeMo Evaluator](https://github.com/NVIDIA-NeMo/Evaluator) | Model evaluation and benchmarking | | ||
| | [NeMo Guardrails](https://github.com/NVIDIA-NeMo/Guardrails) | Programmable safety guardrails | | ||
| | [NeMo Skills](https://github.com/NVIDIA-NeMo/NeMo-Skills) | Skills framework for code generation and reasoning | | ||