Closed
64 commits
a908d0b
adding secret variable in Lepton
zoeyz101 Nov 12, 2025
7fdd444
Add logs dir to container mount for ray slurm (#287)
hemildesai Jul 10, 2025
3316e76
finetune on dgxcloud with nemo-run and deploy on bedrock example (#286)
zoeyz101 Jul 10, 2025
8fc9c42
Fix skypilot archive mount bug (#288)
ri-roee Jul 10, 2025
4452a79
fix docs tutorial links and add intro to guides/index.md (#285)
hemildesai Jul 15, 2025
7561f5d
docs: Fixing doc build issue (#290)
aschilling-nv Jul 15, 2025
e4f1fb0
Add option to specify --container-env for srun (#293)
hemildesai Jul 17, 2025
0d1bc76
Use thread pool for status, run methods inside experiment + other fix…
hemildesai Jul 18, 2025
d895c1d
Fixes for multi-node execution with torchrun + LocalExecutor (#251)
pramodk Jul 19, 2025
6ded439
Upgrade skypilot to v0.10.0, introduce network_tier (#297)
ri-roee Jul 23, 2025
84e6fbf
ci: Add community-bot (#300)
ko3n1g Jul 23, 2025
39ea292
ci(fix): Use GITHUB_TOKEN for community bot (#302)
ko3n1g Jul 25, 2025
224f43e
Update release.yml (#306)
pablo-garay Jul 25, 2025
58341fc
Remove breaking torchrun config for single-node runs (#292)
ri-roee Jul 25, 2025
f4b7872
changelog workflow (#315)
pablo-garay Aug 6, 2025
9e771ed
Added Pre-Launch Commands Support to LeptonExecutor (#312)
ansjindal Aug 6, 2025
eceb774
Create CHANGELOG.md (#314)
pablo-garay Aug 7, 2025
8129d9c
Correctly append tar files for packaging (#317)
samodi-nv Aug 8, 2025
ae34ded
Add nsys patch in ray sub template (#318)
hemildesai Aug 13, 2025
f26da41
Apply '_enable_goodbye_message' check to both goodbye messages. (#319)
sudostock Aug 16, 2025
5e5e61c
Specify nodes for gpu metrics collection and split data to each rank …
ashbhandare Aug 18, 2025
18f0439
Update package_info.py (#322)
pablo-garay Aug 19, 2025
7e514e8
Add ray head start timeout (#324)
hemildesai Aug 20, 2025
21a0f65
Remove ray deprecated dashboard-grpc-port arg (#325)
chtruong814 Aug 21, 2025
f964b0a
Update community-bot to add issues to shared project (#321)
chtruong814 Aug 21, 2025
55eb97f
add a grace for Jobs that may start in Unknown (#291)
prekshivyas Aug 22, 2025
33aeef3
Add image pull secrets param for lepton (#330)
pablo-garay Aug 26, 2025
52d7ddd
Bump community-bot to 0.54.4 (#332)
chtruong814 Aug 28, 2025
2f8e581
Add broken links check in docs (#333)
chtruong814 Sep 3, 2025
1595517
Add node reservations for LeptonExecutor (#336)
roclark Sep 10, 2025
a1fbd2e
fix nodes -> num_nodes (#338)
romilbhardwaj Sep 12, 2025
76f9a7f
Add retry_until_up (#340)
romilbhardwaj Sep 12, 2025
a0d6bac
Support SkyPilot Storage configurations in `file_mounts` for automati…
andylizf Sep 12, 2025
f824406
Backward compatibility for SkyPilot 0.10.3+ (#339)
romilbhardwaj Sep 19, 2025
6e6ba36
Update cherry-pick workflow to use version 0.63.0 (#344)
pablo-garay Sep 29, 2025
e93053e
Create SkypilotJobsExecutor to allow running managed jobs (#343)
rahimftd Sep 30, 2025
fb9ecdb
Refactor tar packaging logic to work for submodule and extra repo (#347)
titu1994 Oct 3, 2025
5d1a095
Documentation Restructurting (#350)
aschilling-nv Oct 7, 2025
748d7a7
remove custom dir (#351)
ko3n1g Oct 8, 2025
7453de0
Bumping to 0.5.0 (#352)
aschilling-nv Oct 8, 2025
52d3b10
Update release notes header in changelog build (#355)
pablo-garay Oct 9, 2025
cf26040
add changelog-config (#356)
pablo-garay Oct 9, 2025
874eec9
Changelog 0.6.0 (#357)
pablo-garay Oct 9, 2025
f692e2a
spelling (#359)
pablo-garay Oct 9, 2025
51f1173
spelling (#359)
pablo-garay Oct 9, 2025
fb515e4
fix: exit code docker runs (#365)
ko3n1g Oct 15, 2025
400e1d5
new changelog-build (#367)
pablo-garay Oct 17, 2025
6acedb3
Version bump to `0.8.0rc0.dev0` (#368)
github-actions[bot] Oct 21, 2025
d39b377
feat: add copyright check (#369)
pablo-garay Oct 23, 2025
de66b3a
feat: copyright check (#370)
pablo-garay Oct 23, 2025
38feb0d
Add port parameter to SSHTunnel (#372)
Kipok Oct 25, 2025
65fff82
fix host (#373)
wedu-nvidia Oct 28, 2025
179d931
update copyright check version
pablo-garay Nov 5, 2025
69811a7
fix
pablo-garay Nov 5, 2025
2bab921
fix2
pablo-garay Nov 5, 2025
f20ff39
add copyright notice
pablo-garay Nov 5, 2025
42010c5
lintfix
pablo-garay Nov 5, 2025
66b8757
undo
pablo-garay Nov 5, 2025
4931ee4
Update ray template (#375)
hemildesai Nov 6, 2025
d7289b3
fix ray templates by using --exclusive to launch ray nodes (#380)
hemildesai Nov 8, 2025
1538fc6
fix(typo): exit_code prints empty (#379)
agronskiy Nov 10, 2025
9a294bf
fix: limit docker hostname to 32 characters (#378)
hemildesai Nov 10, 2025
6a13a75
added tests for env var
zoeyz101 Nov 12, 2025
9cf1d77
linting
zoeyz101 Nov 12, 2025
24 changes: 24 additions & 0 deletions .github/workflows/build-docs.yml
@@ -0,0 +1,24 @@
# Copyright (c) 2025, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: Build docs

on:
  pull_request:
    types: [opened, synchronize, reopened, labeled, unlabeled]
  workflow_call:

jobs:
  build-docs:
    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_build_docs.yml@v0.57.0
123 changes: 123 additions & 0 deletions .github/workflows/changelog-build.yml
@@ -0,0 +1,123 @@
name: 'Changelog Build (Release)'

on:
  workflow_dispatch:
    inputs:
      last-release-tag:
        description: Last Git tag to start from (exclusive) (e.g. `v2.0.0`)
        type: string
        required: true
      release-branch:
        description: Release branch to build changelog on (e.g. `r2.1.0`)
        type: string
        required: true
      changelog-main-content:
        description: Custom changelog content to include before detailed changelogs
        type: string
        required: false
        default: ''

jobs:
  changelog:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout branch
        uses: actions/checkout@v4
        with:
          ref: main
          fetch-depth: 0

      - name: Build Changelog
        id: github_tag
        uses: mikepenz/release-changelog-builder-action@v3.3.1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          # Configuration file is setup with filters for domains
          # owner:repo must point to current repo
          # fromTag: Auto resolved from historical tag order (previous tag compared to current tag)
          # toTag: Current tag reference
          configuration: ".github/workflows/config/changelog-config.json"
          owner: ${{ github.repository_owner }}
          repo: ${{ github.event.repository.name }}
          ignorePreReleases: "false"
          failOnError: "false"
          fromTag: ${{ inputs.last-release-tag }}
          toTag: ${{ inputs.release-branch }}

      - name: Update changelog file
        env:
          RELEASE_BRANCH: ${{ inputs.release-branch }}
          CHANGELOG: ${{ steps.github_tag.outputs.changelog }}
          MAIN_CONTENT: ${{ inputs.changelog-main-content }}
        shell: bash -x -e -u -o pipefail {0}
        run: |
          RELEASE_VERSION=${RELEASE_BRANCH#r}
          CHANGELOG=$(echo "$CHANGELOG" | sed '/^[[:blank:]]*#/s/#/###/')

          # Build release notes starting with version header
          RELEASE_NOTES="## NVIDIA Nemo Run $RELEASE_VERSION"

          # Add custom content if provided
          if [ -n "$MAIN_CONTENT" ]; then
            RELEASE_NOTES="$RELEASE_NOTES

          $MAIN_CONTENT"
          fi

          # Add detailed changelogs section
          RELEASE_NOTES="$RELEASE_NOTES

          ### Detailed Changelogs:

          $CHANGELOG"

          printf "%s\n" "$RELEASE_NOTES" | sed '/<!-- Next changelog -->/r /dev/stdin' CHANGELOG.md > CHANGELOG.tmp.md

          mv CHANGELOG.tmp.md CHANGELOG.md

      - name: Inspect new changelog file
        run: cat CHANGELOG.md

      - name: Create or update label
        uses: actions/github-script@v6
        with:
          script: |
            const labelName = '${{ inputs.release-branch }}';
            const labelColor = '0366d6'; // Blue color
            const labelDescription = `Release ${labelName}`;

            try {
              // Try to get the label
              await github.rest.issues.getLabel({
                owner: context.repo.owner,
                repo: context.repo.repo,
                name: labelName
              });
              console.log(`Label '${labelName}' already exists`);
            } catch (error) {
              if (error.status === 404) {
                // Label doesn't exist, create it
                await github.rest.issues.createLabel({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  name: labelName,
                  color: labelColor,
                  description: labelDescription
                });
                console.log(`Created label '${labelName}'`);
              } else {
                throw error;
              }
            }

      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v7
        with:
          commit-message: "beep boop: Update changelog"
          title: "Update changelog for `${{ inputs.release-branch }}`"
          signoff: true
          sign-commits: true
          base: main
          branch: bot/chore/update-changelog-into-${{ inputs.release-branch }}
          labels: ${{ inputs.release-branch }}
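Note on the `Update changelog file` step: the text handling in that step can be hard to follow in shell form, so here is a rough Python sketch of what it appears to do. It is an illustration only, not part of the PR. The function names `build_release_notes` and `insert_after_marker` are hypothetical, and the sketch assumes the notes are inserted after the first marker line only.

```python
import re

# Marker line that the workflow's sed command looks for in CHANGELOG.md.
MARKER = "<!-- Next changelog -->"


def build_release_notes(release_branch, changelog, main_content=""):
    """Hypothetical helper mirroring the shell logic of the workflow step."""
    # ${RELEASE_BRANCH#r}: drop a single leading "r" if there is one.
    version = release_branch[1:] if release_branch.startswith("r") else release_branch

    # sed '/^[[:blank:]]*#/s/#/###/': on lines that start with optional
    # blanks followed by "#", replace the first "#" with "###".
    demoted = "\n".join(
        line.replace("#", "###", 1) if re.match(r"[ \t]*#", line) else line
        for line in changelog.split("\n")
    )

    notes = f"## NVIDIA Nemo Run {version}"
    if main_content:
        notes += f"\n\n{main_content}"
    notes += f"\n\n### Detailed Changelogs:\n\n{demoted}"
    return notes


def insert_after_marker(changelog_md, notes):
    """Insert the notes after the first line containing MARKER (an assumption)."""
    out = []
    inserted = False
    for line in changelog_md.split("\n"):
        out.append(line)
        if not inserted and MARKER in line:
            out.append(notes)
            inserted = True
    return "\n".join(out)


if __name__ == "__main__":
    notes = build_release_notes("r0.6.0", "## Executors\n\n- Example entry")
    print(insert_after_marker("# Changelog\n\n" + MARKER + "\n\n## Older release", notes))
```

The heading demotion means a category title such as `## Executors` from the changelog builder ends up as a fourth-level heading, nested under the `### Detailed Changelogs:` line.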
2 changes: 1 addition & 1 deletion .github/workflows/cherry-pick-release-commit.yml
@@ -7,7 +7,7 @@ on:

 jobs:
   cherry-pick:
-    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_cherry_pick.yml@v0.22.7
+    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_cherry_pick.yml@v0.63.0
     secrets:
       PAT: ${{ secrets.PAT }}
       SLACK_WEBHOOK_ADMIN: ${{ secrets.SLACK_WEBHOOK_ADMIN }}
8 changes: 8 additions & 0 deletions .github/workflows/close-inactive-issue-pr.yml
@@ -0,0 +1,8 @@
name: Stale-Close-Inactive-Issues-PRs
on:
  schedule:
    - cron: "30 1 * * *"

jobs:
  close-issues:
    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_close_inactive_issue_pr.yml@v0.44.0
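Note on the schedule: GitHub Actions evaluates `schedule` cron expressions in UTC, so `30 1 * * *` should fire once a day at 01:30 UTC. The small Python sketch below only labels the five fields of the expression to make that reading explicit. `describe_cron` is a hypothetical helper for illustration, not something the workflow uses.

```python
# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expr):
    """Split a five-field cron expression into named fields (illustrative only)."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


schedule = describe_cron("30 1 * * *")
for name, value in schedule.items():
    print(f"{name}: {value}")
```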
15 changes: 15 additions & 0 deletions .github/workflows/community-bot.yml
@@ -0,0 +1,15 @@
name: Community Bot

on:
  issues:
    types: [opened, edited, reopened, closed, deleted]
  issue_comment:
    types: [created, edited, deleted]

jobs:
  community-bot:
    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_community_bot.yml@v0.54.4
    with:
      community_project_id: ${{ vars.COMMUNITY_PROJECT_ID }}
    secrets:
      GH_TOKEN: ${{ secrets.PAT }}
118 changes: 118 additions & 0 deletions .github/workflows/config/changelog-config.json
@@ -0,0 +1,118 @@
{
"categories": [
{
"title": "## Executors\n\n",
"labels": ["executor", "local", "slurm", "dgxcloud", "lepton", "skypilot", "docker"],
"exclude_labels": ["ignore"]
},
{
"title": "\n## Ray Integration\n\n",
"labels": ["ray", "kuberay", "ray-slurm"],
"exclude_labels": ["ignore"]
},
{
"title": "\n## CLI & Configuration\n\n",
"labels": ["cli", "config", "parsing"],
"exclude_labels": ["ignore"]
},
{
"title": "\n## Experiment & Job Management\n\n",
"labels": ["experiment", "job", "task"],
"exclude_labels": ["ignore"]
},
{
"title": "\n## Packaging & Deployment\n\n",
"labels": ["packaging", "deployment"],
"exclude_labels": ["ignore"]
},
{
"title": "\n## Documentation\n\n",
"labels": ["docs", "documentation"],
"exclude_labels": ["ignore"]
},
{
"title": "\n## CI/CD\n\n",
"labels": ["ci", "github-actions", "workflow"],
"exclude_labels": ["ignore"]
},
{
"title": "\n## Bug Fixes\n\n",
"labels": ["bug", "bugfix", "fix"],
"exclude_labels": ["ignore"]
}
],
"ignore_labels": [
"ignore",
"skip-changelog"
],
"sort": "ASC",
"template": "\n${{CHANGELOG}}\n## Others\n\n${{UNCATEGORIZED}}\n",
"pr_template": "- ${{TITLE}} [#${{NUMBER}}](${{URL}})",
"empty_template": "- No changes in this release",
"label_extractor": [
{
"pattern": "(.*executor.*)|(.*local.*)|(.*slurm.*)|(.*dgxcloud.*)|(.*lepton.*)|(.*skypilot.*)|(.*docker.*)",
"target": "executor",
"flags": "gimu",
"on_property": ["title", "body"]
},
{
"pattern": "(.*ray.*)|(.*kuberay.*)",
"target": "ray",
"flags": "gimu",
"on_property": ["title", "body"]
},
{
"pattern": "(.*cli.*)|(.*command.*)|(.*parse.*)|(.*argument.*)",
"target": "cli",
"flags": "gimu",
"on_property": ["title", "body"]
},
{
"pattern": "(.*experiment.*)|(.*job.*)|(.*task.*)",
"target": "experiment",
"flags": "gimu",
"on_property": ["title", "body"]
},
{
"pattern": "(.*packaging.*)|(.*package.*)|(.*deploy.*)|(.*archive.*)|(.*mount.*)",
"target": "packaging",
"flags": "gimu",
"on_property": ["title", "body"]
},
{
"pattern": "(.*doc.*)|(.*readme.*)|(.*guide.*)|(.*tutorial.*)",
"target": "docs",
"flags": "gimu",
"on_property": ["title", "body"]
},
{
"pattern": "(.*\\bci\\b.*)|(.*github.*)|(.*workflow.*)|(.*action.*)",
"target": "ci",
"flags": "gimu",
"on_property": ["title", "body"]
},
{
"pattern": "(.*\\[bug.*)|(.*\\bfix\\b.*)|(.*bugfix.*)|(.*patch.*)",
"target": "bug",
"flags": "gimu",
"on_property": ["title", "body"]
}
],
"duplicate_filter": {
"pattern": ".+",
"on_property": "title",
"method": "match"
},
"transformers": [
],
"max_tags_to_fetch": 100,
"max_pull_requests": 500,
"max_back_track_time_days": 365,
"exclude_merge_branches": [
],
"tag_resolver": {
"method": "semver"
}
}
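Note on the `label_extractor` patterns: several of them are broad substring matches, so a single PR title can hit more than one pattern. The Python sketch below checks which patterns match a given title. It is an approximation under stated assumptions: Python's `re` stands in for the JavaScript regex engine, the `gimu` flags are mapped to `re.IGNORECASE | re.MULTILINE`, and `matching_targets` is a hypothetical helper. It shows only whether a pattern matches, not how the changelog action turns matches into labels.

```python
import re

# Targets and patterns copied from the label_extractor entries in
# changelog-config.json (JSON "\\b" becomes "\b" in a Python raw string).
LABEL_EXTRACTORS = [
    ("executor", r"(.*executor.*)|(.*local.*)|(.*slurm.*)|(.*dgxcloud.*)|(.*lepton.*)|(.*skypilot.*)|(.*docker.*)"),
    ("ray", r"(.*ray.*)|(.*kuberay.*)"),
    ("cli", r"(.*cli.*)|(.*command.*)|(.*parse.*)|(.*argument.*)"),
    ("experiment", r"(.*experiment.*)|(.*job.*)|(.*task.*)"),
    ("packaging", r"(.*packaging.*)|(.*package.*)|(.*deploy.*)|(.*archive.*)|(.*mount.*)"),
    ("docs", r"(.*doc.*)|(.*readme.*)|(.*guide.*)|(.*tutorial.*)"),
    ("ci", r"(.*\bci\b.*)|(.*github.*)|(.*workflow.*)|(.*action.*)"),
    ("bug", r"(.*\[bug.*)|(.*\bfix\b.*)|(.*bugfix.*)|(.*patch.*)"),
]

# Assumed Python equivalent of the "i" and "m" flags; "g" and "u" have no
# direct counterpart that matters for a simple match test.
FLAGS = re.IGNORECASE | re.MULTILINE


def matching_targets(text):
    """Return the targets whose pattern matches the text, in config order."""
    return [target for target, pattern in LABEL_EXTRACTORS
            if re.search(pattern, text, FLAGS)]


if __name__ == "__main__":
    # Two titles from the commit list, plus one made-up title.
    for title in (
        "Add ray head start timeout (#324)",
        "fix: exit code docker runs (#365)",
        "Fix array indexing",  # hypothetical title, not from this PR
    ):
        print(title, "->", matching_targets(title))
```

In this sketch the title `fix: exit code docker runs (#365)` matches the `docs` pattern because `docker` contains `doc`, and the made-up title matches `ray` because `array` contains `ray`. If that kind of overlap is unwanted, word boundaries like the ones already used for `ci` and `fix` may be worth considering for the other patterns.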

23 changes: 23 additions & 0 deletions .github/workflows/copyright-check.yml
@@ -0,0 +1,23 @@
# Copyright (c) 2025, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: Copyright check

on:
  pull_request:
  workflow_dispatch:

jobs:
  copyright-check:
    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_copyright_check.yml@v0.54.4
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -31,7 +31,7 @@ on:
         description: Branch to target for version bump
 jobs:
   release:
-    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_release_library.yml@v0.22.6
+    uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_release_library.yml@v0.40.0
     with:
       release-ref: ${{ inputs.release-ref }}
       python-package: nemo_run