[QEff. Finetuning]: Enhance test cases to match intermediate step level loss/metrics #531

quic-akuruvil · 2025-08-05T09:49:31Z

Enable test cases that match intermediate step-level loss and metrics in single-device and DDP setups.

A nested dictionary maps the reference losses for the different test scenarios. The test scenarios and their reference values are listed in a separate reference file.
At present, the test scenarios cover single-device testing for the following models:
Llama and Bert, on the Alpaca and GSM8k datasets.

Reference data is based on SDK 1.21.0.23.
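
As a rough illustration of the nested-dictionary idea described above, the sketch below keys reference values first by scenario and then by metric. The scenario key and `description` field mirror the diff excerpt from `tests/finetune/reference_data.py`; the `train_step_losses` field, its values, and the `get_reference` helper are hypothetical placeholders, not the contents of the actual reference file.

```python
# Hypothetical layout: scenario key -> metric name -> per-step reference values.
# The numbers are illustrative placeholders, NOT real reference data.
REFERENCE_DATA_SKETCH = {
    "llama_config_alpaca_single_device": {
        "description": "Baseline for Llama on Alpaca single-device",
        "train_step_losses": [1.5, 1.4, 1.3],  # placeholder values
    },
}


def get_reference(scenario_key, metric, data=REFERENCE_DATA_SKETCH):
    """Look up the reference list for one metric of one scenario (assumed helper)."""
    try:
        return data[scenario_key][metric]
    except KeyError as exc:
        # Re-raise with a message that names both levels of the lookup.
        raise KeyError(
            f"No reference '{metric}' for scenario '{scenario_key}'"
        ) from exc
```

Keeping one entry per scenario means a new model or dataset combination can be added without touching the test logic, assuming the test looks values up by scenario key.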

tests/finetune/reference_data.py

tests/finetune/test_finetune.py

quic-meetkuma

Overall looks good, just address minor comments and we are good to merge.

QEfficient/utils/constants.py

tests/finetune/test_finetune.py

quic-meetkuma

LGTM, thanks Ann for making this change.

quic-swatia · 2025-08-14T10:05:34Z

tests/finetune/test_finetune.py

+        f"{name} length mismatch for scenario '{scenario_key}' (WS: {current_world_size}, Rank: {current_rank}). "
+        f"Expected {len(ref_list)} elements, but got {len(actual_list)}."
+    )
+    max_diff = np.max(np.abs(np.array(ref_list) - np.array(actual_list)))


On a mismatch, this reports only the max diff. It should instead report: 1) the step numbers at which the deviation occurs, and 2) the difference in value at each of those steps. np.isclose() will help find the deviating indices. Before that, np.allclose() can be used to check whether the assertion passes.

Added step-wise details for the deviation.
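
The suggestion in this thread could be sketched roughly as follows. This is not the code merged in the PR: the helper name `assert_list_close`, its signature, and the default tolerances are assumptions made for illustration. Only `np.allclose()` and `np.isclose()` come from the review comment.

```python
import numpy as np


def assert_list_close(ref_list, actual_list, name, rtol=1e-5, atol=1e-5):
    """Hypothetical helper: fail with per-step details when values deviate."""
    ref = np.asarray(ref_list, dtype=float)
    actual = np.asarray(actual_list, dtype=float)
    assert ref.shape == actual.shape, (
        f"{name} length mismatch: expected {ref.size} elements, but got {actual.size}."
    )
    # Fast path: every step is within tolerance of the reference.
    if np.allclose(actual, ref, rtol=rtol, atol=atol):
        return
    # Slow path: locate the (0-based) step indices that are out of tolerance.
    close = np.isclose(actual, ref, rtol=rtol, atol=atol)
    bad_steps = np.flatnonzero(~close)
    details = "\n".join(
        f"  step {i}: ref={ref[i]:.6f}, actual={actual[i]:.6f}, "
        f"diff={abs(ref[i] - actual[i]):.6f}"
        for i in bad_steps
    )
    raise AssertionError(f"{name} deviates at {bad_steps.size} step(s):\n{details}")
```

The reference array is passed as the second argument to `np.isclose()` because NumPy scales the relative tolerance by that argument, so the tolerance is taken relative to the reference values rather than the measured ones.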

quic-swatia · 2025-08-14T10:16:04Z

tests/finetune/reference_data.py

+REFERENCE_DATA = {
+    # Scenario 1: Single-device llama 3.2-1B training on Alpaca dataset.
+    "llama_config_alpaca_single_device": {
+        "description": "Baseline for Llama on Alpaca single-device",


Please add the complete model ID here and in other configs as well.

Signed-off-by: Ann Kuruvilla <[email protected]>

quic-akuruvil self-assigned this Aug 5, 2025

quic-akuruvil requested review from quic-meetkuma and quic-swatia August 5, 2025 10:21

quic-akuruvil added the fine-tuning label Aug 5, 2025

quic-akuruvil force-pushed the test_cases branch from c9ada4a to 084bb38 Compare August 5, 2025 13:27

quic-swatia reviewed Aug 5, 2025

tests/finetune/reference_data.py (resolved)

tests/finetune/reference_data.py (outdated, resolved)

tests/finetune/test_finetune.py (outdated, resolved)

quic-akuruvil marked this pull request as ready for review August 5, 2025 13:54

quic-akuruvil requested review from quic-rishinr, ochougul, quic-hemagnih and quic-amitraj as code owners August 5, 2025 13:54

quic-meetkuma reviewed Aug 7, 2025

quic-akuruvil force-pushed the test_cases branch 2 times, most recently from d76ef7f to c31b88a Compare August 13, 2025 05:55

quic-meetkuma requested changes Aug 14, 2025

quic-meetkuma approved these changes Aug 14, 2025

quic-swatia reviewed Aug 14, 2025

quic-akuruvil added 12 commits August 18, 2025 05:54

Added step level metrics match

eb46f88

Signed-off-by: Ann Kuruvilla <[email protected]>

Updates in ref values

5b92e81

Signed-off-by: Ann Kuruvilla <[email protected]>

Format

c8eb5a4

Signed-off-by: Ann Kuruvilla <[email protected]>

Rebase & Format

c483f97

Signed-off-by: Ann Kuruvilla <[email protected]>

Added license

069c947

Signed-off-by: Ann Kuruvilla <[email protected]>

Review comments addressed

1154922

Signed-off-by: Ann Kuruvilla <[email protected]>

Format

7eeaa20

Signed-off-by: Ann Kuruvilla <[email protected]>

Rebase

ac6f467

Signed-off-by: Ann Kuruvilla <[email protected]>

Rebase and format

6b4cf29

Signed-off-by: Ann Kuruvilla <[email protected]>

Format

de6cd90

Signed-off-by: Ann Kuruvilla <[email protected]>

Format

70d99f4

Signed-off-by: Ann Kuruvilla <[email protected]>

Review addressed

77106f6

Signed-off-by: Ann Kuruvilla <[email protected]>

quic-akuruvil force-pushed the test_cases branch from 4b6e21d to 77106f6 Compare August 18, 2025 07:02

quic-swatia approved these changes Aug 18, 2025

quic-akuruvil merged commit d37233e into quic:main Aug 18, 2025
4 checks passed
