21 changes: 9 additions & 12 deletions docs/about/architecture.md
@@ -25,7 +25,7 @@ Model servers expose OpenAI-compatible inference endpoints for chat and response
- `POST /v1/chat/completions`
- `POST /v1/responses`

The base model server class defines these endpoints. Concrete model servers implement them (for example, the OpenAI-backed model server). Agents call these endpoints through the shared server client.
The base model server class defines these endpoints. Concrete model servers implement them (for example, the OpenAI or vLLM model server). Agents call these endpoints through the shared server client.
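A request against these endpoints follows the OpenAI-compatible chat completions schema. A minimal sketch of the payload an agent might send (the helper name and model ID are hypothetical, not part of NeMo Gym's API):

```python
def build_chat_request(model: str, messages: list) -> dict:
    """Build a minimal OpenAI-compatible chat completions payload."""
    return {"model": model, "messages": messages}

# An agent would POST this dict to {model_server}/v1/chat/completions
# through the shared server client.
payload = build_chat_request(
    "my-model",  # hypothetical model ID
    [{"role": "user", "content": "Hello"}],
)
```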

### Resources servers (environment + verification)

@@ -36,14 +36,15 @@ Resources servers expose environment lifecycle endpoints:

Individual resources servers can add domain-specific endpoints for tools or environment steps. For example:

- A resources server can register a catch-all tool route like `POST /{path}` for tool execution.
- Aviary-based resources servers add `POST /step` and `POST /close` for multi-step environments.
- Individual tools such as `POST /get_weather` or `POST /search`.
- A resources server can register a catch-all tool route like `POST /{path}` for dynamic environments.
- Support for `POST /step` and `POST /close` for Gymnasium-style environments.
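The catch-all pattern above can be pictured as a dispatch table keyed by tool name; the tool names and return values here are hypothetical, for illustration only:

```python
# Hypothetical tools a resources server might register.
TOOLS = {
    "get_weather": lambda args: {"temp_c": 21},  # stand-in weather tool
    "search": lambda args: {"results": []},      # stand-in search tool
}

def handle_tool_call(path: str, body: dict) -> dict:
    """Dispatch a catch-all POST /{path} request to the matching tool."""
    if path not in TOOLS:
        return {"error": f"unknown tool: {path}"}
    return TOOLS[path](body)
```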

### Agent servers (rollout orchestration)

Agent servers expose two primary endpoints:

- `POST /v1/responses` for multi-step interaction
- `POST /v1/responses` for individual generations
- `POST /run` for full rollout execution and verification

The base agent server class wires these routes, while each agent implementation defines how to call model and resources servers.
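One way to picture that wiring (class and method names are assumptions for illustration, not NeMo Gym's actual API):

```python
class BaseAgentServer:
    """Sketch of a base class that wires the two agent routes."""

    def __init__(self):
        self.routes = {
            "POST /v1/responses": self.respond,  # single generation
            "POST /run": self.run,               # full rollout + verification
        }

    def respond(self, request: dict) -> dict:
        # Concrete agents decide how to call model and resources servers.
        raise NotImplementedError

    def run(self, request: dict) -> dict:
        raise NotImplementedError
```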
@@ -63,19 +64,15 @@ The shared server client fetches the resolved configuration from the head server

The `SimpleAgent` implementation orchestrates a complete rollout and verification sequence:

1. Call the resources server `POST /seed_session` to initialize session state.
2. Call the agent `POST /v1/responses`. The agent calls the model server `POST /v1/responses` and issues tool calls to the resources server via `POST /{tool_name}`.
3. Call the resources server `POST /verify` and return the verified rollout response.
1. Call the resources server `POST /seed_session` to initialize environment state.
2. Call the agent `POST /v1/responses`. The agent calls the model server `POST /v1/responses` and issues tool calls to the resources server via `POST /{tool_name}` to interact with the environment.
3. Call the resources server `POST /verify` and return the rollout and reward.
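The three steps above can be sketched against a stand-in client. The endpoint paths come from the sequence in the docs; the client class and response shapes are assumptions:

```python
class FakeClient:
    """Stand-in for the shared server client; returns canned responses."""

    def post(self, path, json=None):
        if path == "/seed_session":
            return {"session_id": "abc"}
        if path == "/v1/responses":
            return {"output": "final answer"}
        if path == "/verify":
            return {"reward": 1.0}
        return {}

def run_rollout(resources, agent):
    resources.post("/seed_session", json={})            # 1. init environment state
    response = agent.post("/v1/responses", json={})     # 2. agent loop (model + tools)
    verdict = resources.post("/verify", json=response)  # 3. score the rollout
    return response, verdict["reward"]
```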

The rollout collection flow uses the agent `POST /run` endpoint and writes the returned metrics to JSONL output.
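JSONL output here means one JSON object per rollout per line. A minimal sketch (field names are hypothetical):

```python
import json
import os
import tempfile

def write_rollout_metrics(path, rollouts):
    """Append one JSON object per line (JSONL), one per rollout."""
    with open(path, "a") as f:
        for rollout in rollouts:
            f.write(json.dumps(rollout) + "\n")

# Usage with hypothetical metric fields:
path = os.path.join(tempfile.mkdtemp(), "rollouts.jsonl")
write_rollout_metrics(path, [{"session_id": "abc", "reward": 1.0}])
loaded = [json.loads(line) for line in open(path)]
```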

### Multi-step environments (Aviary example)

Some resources servers model environments with explicit step and close endpoints. Aviary-based resources servers accept `POST /step` for environment transitions and `POST /close` to release an environment instance.

## Session and State

All servers add session handling that assigns a session ID when one is not present. Agents propagate cookies between model and resources servers, which lets resources servers store per-session state. Several resources servers keep in-memory maps keyed by session ID (for example, counters or tool environments) to track environment state across steps.
All servers add session handling that assigns a session ID on initialization. Agents propagate cookies between model and resources servers, which lets resources servers store per-session state. Several resources servers keep in-memory maps keyed by session ID (for example, counters or tool environments) to track environment state across steps.
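The in-memory, session-keyed state described above can be sketched as follows; the names and state fields are illustrative, not the actual implementation:

```python
import uuid

SESSIONS = {}  # per-session state, keyed by session ID

def get_or_create_session(session_id=None):
    """Assign a session ID when one is not present and seed its state."""
    sid = session_id or str(uuid.uuid4())
    SESSIONS.setdefault(sid, {"step_count": 0})
    return sid
```

Agents forwarding the session cookie between requests is what lets the same `SESSIONS` entry be found again on later steps.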

## Configuration and Port Resolution

15 changes: 14 additions & 1 deletion docs/contribute/rl-framework-integration/index.md
@@ -8,9 +8,22 @@ These guides cover how to integrate NeMo Gym into a new RL training framework. U
- Contributing NeMo Gym integration for a training framework that does not have one yet

:::{tip}
Just want to train models? Use {ref}`NeMo RL <training-nemo-rl-grpo-index>` instead.
Just want to train models? See existing integrations:
- {ref}`NeMo RL <training-nemo-rl-grpo-index>` - Multi-step and multi-turn RL training at scale
- {doc}`TRL (Hugging Face) <../training-tutorials/trl>` - GRPO with distributed training support
- {doc}`Unsloth <../training-tutorials/unsloth>` - Fast, memory-efficient training for single-step tasks
:::

## Existing Integrations

NeMo Gym currently integrates with the following RL training frameworks:

**[NeMo RL](https://github.com/NVIDIA-NeMo/RL)**: NVIDIA's RL training framework, purpose-built for large-scale frontier model training. Provides full support for multi-step and multi-turn environments with production-grade distributed training capabilities.

**[TRL](https://github.com/huggingface/trl)**: Hugging Face's transformer reinforcement learning library. Supports GRPO with single- and multi-turn NeMo Gym environments using vLLM generation, multi-environment training, and distributed training via Accelerate and DeepSpeed. See the {doc}`TRL tutorial <../training-tutorials/trl>` for usage examples.

**[Unsloth](https://github.com/unslothai/unsloth)**: Fast, memory-efficient fine-tuning library. Supports optimized GRPO with single-step NeMo Gym environments, including low-precision and parameter-efficient fine-tuning and training in notebook environments. See the {doc}`Unsloth tutorial <../training-tutorials/unsloth>` for getting started.

## Prerequisites

Before integrating Gym into your training framework, ensure you have:
2 changes: 1 addition & 1 deletion docs/index.md
@@ -418,8 +418,8 @@ Rollout Collection <get-started/rollout-collection.md>
🟡 Nemotron Nano <training-tutorials/nemotron-nano>
🟡 Nemotron Super <training-tutorials/nemotron-super>
NeMo RL GRPO <tutorials/nemo-rl-grpo/index.md>
Unsloth Training <tutorials/unsloth-training>
🟡 TRL <training-tutorials/trl>
🟡 Unsloth <training-tutorials/unsloth>
🟡 VERL <training-tutorials/verl>
🟡 NeMo Customizer <training-tutorials/nemo-customizer>
Offline Training <tutorials/offline-training-w-rollouts>