docs: Parse Finetuning Tutorial by aasthajh · Pull Request #1471 · NVIDIA-NeMo/Automodel

aasthajh · 2026-03-06T06:54:22Z

What does this PR do?

Adds a Nemotron Parse fine-tuning tutorial.

Changelog

Add specific line by line info of high level changes in this PR.

Before your PR is "Ready for review"

Pre checks:

[x] Make sure you read and followed Contributor guidelines
[x] Did you add or update any necessary documentation?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Signed-off-by: root <root@dgx-003.localdomain>

copy-pr-bot · 2026-03-06T06:54:26Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

HuiyingLi · 2026-03-06T17:54:13Z

/ok to test 09284e4

akoumpa · 2026-03-06T18:14:41Z

Hi @chenopis, can you help us with a review? Thank you.

chenopis

Documentation Review: Fine-Tuning Nemotron Parse v1.1 Tutorial

Reviewed for technical accuracy (cross-referenced against examples/vlm_finetune/nemotron/nemotron_parse_v1_1.yaml and CONTRIBUTING.md), code structure, and writing style.

Findings

| Severity | Count | Summary |
|---|---|---|
| High | 3 | Missing license header, wrong `distributed` config key, code duplication |
| Medium | 4 | Epoch math discrepancy, hardcoded GPU count, hyphenation, sentence fragment |

Clickable “Apply suggestion” blocks are provided where the fix is unambiguous (4 of 7 comments).

Review generated with AI assistance.
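The hardcoded GPU count finding is usually resolved by deriving the process count at launch time instead of fixing it in the notebook. The following is a minimal sketch, not the tutorial's actual cell: `detect_num_gpus` and `torchrun_command` are hypothetical helper names, and `finetune.py` with its `--config` flag is a placeholder for whatever entry point the tutorial uses. Only `torchrun --nproc_per_node`, `CUDA_VISIBLE_DEVICES`, `NUM_GPUS`, and the YAML path appear in this PR.

```python
import os


def detect_num_gpus(env=None):
    """Return the number of GPUs this process is allowed to use."""
    env = os.environ if env is None else env
    visible = env.get("CUDA_VISIBLE_DEVICES")
    if visible is not None:
        # Respect an explicit restriction; an empty string means "no GPUs".
        return len([d for d in visible.split(",") if d.strip()])
    try:
        import torch  # optional dependency for this sketch
    except ImportError:
        return 0
    return torch.cuda.device_count()


def torchrun_command(num_gpus, script, config):
    """Build a torchrun argument list with one worker per detected GPU."""
    nproc = max(num_gpus, 1)  # fall back to a single process on CPU-only hosts
    return ["torchrun", f"--nproc_per_node={nproc}", script, "--config", config]


NUM_GPUS = detect_num_gpus()
cmd = torchrun_command(
    NUM_GPUS,
    "finetune.py",  # placeholder script name, not taken from the PR
    "examples/vlm_finetune/nemotron/nemotron_parse_v1_1.yaml",
)
print(" ".join(cmd))
```

Honoring an existing `CUDA_VISIBLE_DEVICES` value first keeps the sketch usable on shared machines where a scheduler has already restricted the visible devices.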

examples/vlm_finetune/nemotron/parse-ft-tutorial/invoice_dataset.py

examples/vlm_finetune/nemotron/parse-ft-tutorial/parse_finetune_tutorial.ipynb
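The two files above live in the same tutorial directory, and a later commit in this PR has the notebook reuse `invoice_dataset.py` (via `shutil.copy` and an import of `json2token`) rather than embedding a second copy of its source. A minimal sketch of that single-source-of-truth pattern follows. The helper name `stage_and_import` and the `workdir` path are hypothetical, and the notebook's real cell may differ.

```python
import importlib.util
import shutil
from pathlib import Path


def stage_and_import(src: Path, work_dir: Path):
    """Copy a co-located module into work_dir and import it from there."""
    work_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the newly created file.
    dst = Path(shutil.copy(src, work_dir))
    spec = importlib.util.spec_from_file_location(dst.stem, dst)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Hypothetical usage inside the notebook:
# dataset_module = stage_and_import(Path("invoice_dataset.py"), Path("workdir"))
# json2token = dataset_module.json2token  # reuse instead of redefining
```

Because the training job and the evaluation helpers both read the same file, a change to the tokenization logic cannot drift between them.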


…U config

- Import json2token from invoice_dataset.py instead of redefining it in the eval helpers cell (single source of truth)
- Replace embedded DATASET_PY_CONTENT with shutil.copy of co-located file
- Fix max_steps from 2000 to 530 to match actual 10-epoch training
- Replace hardcoded CUDA_VISIBLE_DEVICES with dynamic GPU detection
- Use NUM_GPUS variable in torchrun --nproc_per_node

Signed-off-by: aasthajh <aasthaj@nvidia.com>
Made-with: Cursor
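The max_steps change in this commit follows from simple arithmetic: 530 steps over 10 epochs is 53 optimizer steps per epoch. The sketch below shows the calculation under stated assumptions. The PR does not give the dataset size or batch size, so the 424 samples and global batch size of 8 are hypothetical values chosen only because they yield 53 steps per epoch. `max_steps_for` is likewise an illustrative name, and the global batch size is assumed to already include any gradient accumulation.

```python
import math


def max_steps_for(num_samples, global_batch_size, num_epochs, drop_last=False):
    """Optimizer steps needed to cover num_epochs passes over the data."""
    if drop_last:
        # Incomplete final batches are discarded.
        steps_per_epoch = num_samples // global_batch_size
    else:
        # The final partial batch still counts as one step.
        steps_per_epoch = math.ceil(num_samples / global_batch_size)
    return steps_per_epoch * num_epochs


# Hypothetical sizes: 424 samples / batch of 8 = 53 steps per epoch.
print(max_steps_for(num_samples=424, global_batch_size=8, num_epochs=10))  # 530
```

At 53 steps per epoch, the earlier max_steps value of 2000 would have run for close to 38 epochs rather than the intended 10.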

HuiyingLi · 2026-03-07T04:57:25Z

/ok to test 2a4574a

chenopis

LGTM

HuiyingLi · 2026-03-07T05:27:29Z

/ok to test e96fbdd

akoumpa · 2026-03-07T18:45:46Z

/ok to test 95a2d43

akoumpa · 2026-03-07T18:47:24Z

Hi @NVIDIA-NeMo/automation, the secrets detector reports false positives; please help us FM this. Thank you.

Adding parse FT tutorial

09284e4

Signed-off-by: root <root@dgx-003.localdomain>

aasthajh requested review from HuiyingLi, ZhiyuLi-Nvidia, adil-a, akoumpa and hemildesai as code owners March 6, 2026 06:54

HuiyingLi changed the title ~~Parse Finetuning Tutorial~~ docs: Parse Finetuning Tutorial Mar 6, 2026

copy-pr-bot bot temporarily deployed to nemo-ci March 6, 2026 17:54 Inactive

copy-pr-bot bot temporarily deployed to test March 6, 2026 17:54 Inactive

HuiyingLi added the docs-only label Mar 6, 2026

copy-pr-bot bot temporarily deployed to nemo-ci March 6, 2026 18:11 Inactive

This comment was marked as outdated.


chenopis suggested changes Mar 6, 2026


copy-pr-bot bot temporarily deployed to nemo-ci March 6, 2026 18:45 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci March 6, 2026 19:35 Inactive

aasthajh and others added 5 commits March 6, 2026 16:52

Update examples/vlm_finetune/nemotron/parse-ft-tutorial/parse_finetun…

7ad2d97

…e_tutorial.ipynb Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>

Update examples/vlm_finetune/nemotron/parse-ft-tutorial/parse_finetun…

f6cfb44

…e_tutorial.ipynb Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>

Update examples/vlm_finetune/nemotron/parse-ft-tutorial/invoice_datas…

58e7120

…et.py Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>

Update examples/vlm_finetune/nemotron/parse-ft-tutorial/parse_finetun…

636cb3e

…e_tutorial.ipynb Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>

aasthajh requested a review from chenopis March 7, 2026 02:01

copy-pr-bot bot temporarily deployed to nemo-ci March 7, 2026 04:57 Inactive

chenopis approved these changes Mar 7, 2026


Merge branch 'main' into feature/parse-fine-tuning-playbook

e96fbdd

copy-pr-bot bot temporarily deployed to nemo-ci March 7, 2026 05:27 Inactive

Merge branch 'main' into feature/parse-fine-tuning-playbook

95a2d43

copy-pr-bot bot temporarily deployed to nemo-ci March 7, 2026 18:46 Inactive

akoumpa enabled auto-merge (squash) March 7, 2026 18:46


4 participants
