docs: Parse Finetuning Tutorial #1471

Open
aasthajh wants to merge 8 commits into NVIDIA-NeMo:main from aasthajh:feature/parse-fine-tuning-playbook

Conversation

@aasthajh aasthajh commented Mar 6, 2026

What does this PR do?

Adds the Nemotron Parse fine-tuning tutorial.

Changelog

  • Add specific line by line info of high level changes in this PR.

Before your PR is "Ready for review"

Pre checks:

  • [x] Make sure you read and followed Contributor guidelines
  • [x] Did you add or update any necessary documentation?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Signed-off-by: root <root@dgx-003.localdomain>

copy-pr-bot bot commented Mar 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@HuiyingLi HuiyingLi changed the title Parse Finetuning Tutorial docs: Parse Finetuning Tutorial Mar 6, 2026
@HuiyingLi
Contributor

/ok to test 09284e4

@akoumpa
Contributor

akoumpa commented Mar 6, 2026

Hi @chenopis, can you help us with a review? Thank you.

chenopis

This comment was marked as outdated.

Contributor

@chenopis chenopis left a comment

Documentation Review: Fine-Tuning Nemotron Parse v1.1 Tutorial

Reviewed for technical accuracy (cross-referenced against examples/vlm_finetune/nemotron/nemotron_parse_v1_1.yaml and CONTRIBUTING.md), code structure, and writing style.

Findings

| Severity | Count | Summary |
| --- | --- | --- |
| High | 3 | Missing license header, wrong distributed config key, code duplication |
| Medium | 4 | Epoch math discrepancy, hardcoded GPU count, hyphenation, sentence fragment |

Clickable “Apply suggestion” blocks are provided where the fix is unambiguous (4 of 7 comments).

Review generated with AI assistance.

aasthajh and others added 5 commits March 6, 2026 16:52
…e_tutorial.ipynb

Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>
…e_tutorial.ipynb

Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>
…et.py

Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>
…e_tutorial.ipynb

Co-authored-by: Andrew Chen <chenopis@users.noreply.github.com>
…U config

- Import json2token from invoice_dataset.py instead of redefining it
  in the eval helpers cell (single source of truth)
- Replace embedded DATASET_PY_CONTENT with shutil.copy of co-located file
- Fix max_steps from 2000 to 530 to match actual 10-epoch training
- Replace hardcoded CUDA_VISIBLE_DEVICES with dynamic GPU detection
- Use NUM_GPUS variable in torchrun --nproc_per_node

Signed-off-by: aasthajh <aasthaj@nvidia.com>
Made-with: Cursor
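
The GPU and step-count changes listed in this commit can be sketched roughly as follows. This is a minimal illustration, not the tutorial's actual code: the helper names (`detect_num_gpus`, `build_torchrun_cmd`, `max_steps_for`), the script name `finetune.py`, the `--config` flag, and the sample count and batch size are all assumptions made for the example. Only `torchrun --nproc_per_node`, the `NUM_GPUS` variable, the config path, and the figure of 530 steps over 10 epochs come from the conversation above.

```python
import math
import os
from typing import List, Mapping, Optional


def detect_num_gpus(env: Optional[Mapping[str, str]] = None) -> int:
    """Count usable GPUs instead of hardcoding a device list (hypothetical helper)."""
    env = os.environ if env is None else env
    visible = env.get("CUDA_VISIBLE_DEVICES")
    if visible is not None:
        # An explicit device list wins; an empty string means "no GPUs visible".
        return len([d for d in visible.split(",") if d.strip()])
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return 0
    return torch.cuda.device_count()


def build_torchrun_cmd(num_gpus: int, script: str, config: str) -> List[str]:
    """Build a torchrun command whose process count follows the detected GPUs.

    The script path and the --config flag are assumptions for illustration.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    return ["torchrun", f"--nproc_per_node={num_gpus}", script, "--config", config]


def max_steps_for(num_samples: int, global_batch_size: int, epochs: int) -> int:
    """Optimizer steps needed to cover `epochs` full passes over the data."""
    steps_per_epoch = math.ceil(num_samples / global_batch_size)
    return epochs * steps_per_epoch


if __name__ == "__main__":
    NUM_GPUS = detect_num_gpus()
    if NUM_GPUS > 0:
        cmd = build_torchrun_cmd(
            NUM_GPUS,
            "finetune.py",  # hypothetical entry point
            "examples/vlm_finetune/nemotron/nemotron_parse_v1_1.yaml",
        )
        print(" ".join(cmd))
    # Hypothetical dataset: 424 samples at global batch size 8 gives 53 steps per epoch.
    print(max_steps_for(num_samples=424, global_batch_size=8, epochs=10))
```

Deriving `max_steps` from the dataset size and batch size, rather than typing in a constant, keeps the configured value in step with the intended number of epochs, which is the kind of discrepancy the review flagged.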
@aasthajh aasthajh requested a review from chenopis March 7, 2026 02:01
@HuiyingLi
Contributor

/ok to test 2a4574a

Contributor

@chenopis chenopis left a comment

LGTM

@HuiyingLi
Contributor

/ok to test e96fbdd

@akoumpa
Contributor

akoumpa commented Mar 7, 2026

/ok to test 95a2d43

@akoumpa akoumpa enabled auto-merge (squash) March 7, 2026 18:46
@akoumpa
Contributor

akoumpa commented Mar 7, 2026

Hi @NVIDIA-NeMo/automation, the secrets detector is reporting false positives; please help us FM this. Thank you.

Labels

docs-only With great power comes great responsibility.

4 participants