
Conversation

@wangye805 (Collaborator) commented Nov 19, 2025

Description

Targeted NV upstream commit: ca7407e (2025/07/18), based on our ROCm dev commit 6bbd03c.
Fixes https://github.com/ROCm/frameworks-internal/issues/13729

Type of change

  • Documentation change (change only to the documentation, either a fix or new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

See the NV upstream release doc for upstream changes.
Our IFU conflict resolutions are listed in the following commits:
1. common: 4d3ca4d
2. jax extension: 5ce0afd
3. pytorch extension: c9c9126
4. build/installation: 9730903
5. cpp gtests: 51bdbb8
6. pytorch pytests: ba59f81
7. jax pytests: 5842c24

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

ptrendx and others added 30 commits May 16, 2025 17:17
Signed-off-by: Przemek Tredak <[email protected]>
* tests drop

Signed-off-by: Pawel Gadzinski <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move dir

Signed-off-by: Pawel Gadzinski <[email protected]>

* tests fox

Signed-off-by: Pawel Gadzinski <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Pawel Gadzinski <[email protected]>
Signed-off-by: Przemek Tredak <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Przemek Tredak <[email protected]>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
* Fix README render on PyPI

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* Update README.rst

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* Use anonymous hyperlink for duplicate. Fix indent.

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

---------

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* Check tensor-recipe compatibility

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Tensor class in recipe, checking for *Base

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Extend recipe __repr__ with recipe_type

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Warn about recipe change

Signed-off-by: Evgeny Tsykunov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Enable dynamic recipe change: clear fp8 workspace

Signed-off-by: Evgeny Tsykunov <[email protected]>

* TE 1.x checkpoint compatibility

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Disable warning for recipe wrappers

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Test recipe change

Signed-off-by: Evgeny Tsykunov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Use QuantizedTensorBase

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Fix circular import

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Revert previous circular import fix

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Fix pytorch imports in common

Signed-off-by: Evgeny Tsykunov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Let quantizer know about the recipe

Signed-off-by: Evgeny Tsykunov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix imports

Signed-off-by: Evgeny Tsykunov <[email protected]>

---------

Signed-off-by: Evgeny Tsykunov <[email protected]>
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Co-authored-by: Przemyslaw Tredak <[email protected]>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
* Fix split_overlap_rs aggregate=True chunk offset calculation

Signed-off-by: Guyue Huang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add unit test for aggregate=True

Signed-off-by: Guyue Huang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix unit test

Signed-off-by: Guyue Huang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Guyue Huang <[email protected]>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
…te (#1799)

* Use an empty torch tensor to indicate no fp8 information in extra_state

Signed-off-by: Peter St. John <[email protected]>

* Add huggingface from_pretrained / save_pretrained tests

Adds integration tests to ensure models containing TransformerLayer
objects can be saved and loaded using the from_pretrained and
save_pretrained methods.

Signed-off-by: Peter St. John <[email protected]>

---------

Signed-off-by: Peter St. John <[email protected]>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
…n (#1611)

* docs drop

Signed-off-by: Pawel Gadzinski <[email protected]>

* a

Signed-off-by: Pawel Gadzinski <[email protected]>

* fix

Signed-off-by: Pawel Gadzinski <[email protected]>

* Update docs/debug/1_getting_started.rst

Co-authored-by: Przemyslaw Tredak <[email protected]>
Signed-off-by: Paweł Gadziński <[email protected]>

* Update docs/debug/1_getting_started.rst

Co-authored-by: Przemyslaw Tredak <[email protected]>
Signed-off-by: Paweł Gadziński <[email protected]>

* fixes

Signed-off-by: Pawel Gadzinski <[email protected]>

* fix imgs

Signed-off-by: Pawel Gadzinski <[email protected]>

---------

Signed-off-by: Pawel Gadzinski <[email protected]>
Signed-off-by: Paweł Gadziński <[email protected]>
Co-authored-by: Przemyslaw Tredak <[email protected]>
add docstring for CP

Signed-off-by: Charlene Yang <[email protected]>
* Add missing docs for C API

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* Grammar, typos, copy-paste errors

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* remove contiguous word

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* Better wording

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

---------

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* fix model parallel encoder to be properly sharded

Signed-off-by: Sudhakar Singh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Sudhakar Singh <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
fix saved_tensors

Signed-off-by: Pawel Gadzinski <[email protected]>
Fix incorrectly skipped test_quantize_dbias tests

Signed-off-by: Jeremy Berchtold <[email protected]>
Remove comm_gemm_overlap docs

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* Build support for cuda 13

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* Fix build for cudnn 8.9*; cuda 12.1

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* readd include

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

---------

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
…rity (#1811)

Make primitive names more granular for better disabling granularity

Signed-off-by: Jeremy Berchtold <[email protected]>
Document all recipes

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
…#1804)

Activation ops support fusing backward pass with quantize

Signed-off-by: Tim Moon <[email protected]>
* Fix env variable name in test.sh scripts to properly test pure-JAX implementations

Signed-off-by: Jeremy Berchtold <[email protected]>

* Update test scripts to use pure-JAX impl in encoder

test_custom_call_compute.py already uses pure-JAX impl as
reference so testing the pure-JAX impl against itself would be
redundant. The encoder tests have their own implementation so
testing the pure-JAX impl of primitives is still useful.

Signed-off-by: Jeremy Berchtold <[email protected]>

* Update qa/L0_jax_unittest/test.sh

Co-authored-by: Phuong Nguyen <[email protected]>
Signed-off-by: jberchtold-nvidia <[email protected]>

---------

Signed-off-by: Jeremy Berchtold <[email protected]>
Signed-off-by: jberchtold-nvidia <[email protected]>
Co-authored-by: Phuong Nguyen <[email protected]>
* Modify the test cases

Signed-off-by: Przemek Tredak <[email protected]>

* Make the tests reproducible on different machines

Signed-off-by: Przemek Tredak <[email protected]>

* Fixed the cache of the gamma_in_weight_dtype setting

Signed-off-by: Przemek Tredak <[email protected]>

* Reinstate the tests

Signed-off-by: Przemek Tredak <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* More verbose code and comments

Signed-off-by: Przemek Tredak <[email protected]>

---------

Signed-off-by: Przemek Tredak <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* added conda installation

Signed-off-by: Santosh Bhavani <[email protected]>

* fix for pypi

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

---------

Signed-off-by: Santosh Bhavani <[email protected]>
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
* Fix single FW build with multi FW available

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* Some fixes

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* Fixes

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

* sug

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

---------

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
…ion (#1822)

Update jax_scaled_masked_softmax to match TE kernel implementation

Signed-off-by: Jeremy Berchtold <[email protected]>
* fp8 gemm with direct quant

Signed-off-by: Phuong Nguyen <[email protected]>

---------

Signed-off-by: Phuong Nguyen <[email protected]>
* removes unnecessary reshapes for FP8 GEMM

* use nn.jax.scaled_matmul

Signed-off-by: Phuong Nguyen <[email protected]>

---------

Signed-off-by: Phuong Nguyen <[email protected]>
…needed (#1817)

* Linear op avoids saving input tensor if weight grad is not needed

Signed-off-by: Tim Moon <[email protected]>

* Linear op forward avoids producing quantized tensors with unnecessary usages

Signed-off-by: Tim Moon <[email protected]>

* Fix linter warnings

Signed-off-by: Tim Moon <[email protected]>

* Avoid unnecessary usages in fused linear ops

Signed-off-by: Tim Moon <[email protected]>

---------

Signed-off-by: Tim Moon <[email protected]>
…#1813)

* Changed the Tensor allocation strategy

Signed-off-by: Przemek Tredak <[email protected]>

* Fixes

Signed-off-by: Przemek Tredak <[email protected]>

* Disable debug flag

Signed-off-by: Przemek Tredak <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix the double free error

Signed-off-by: Przemek Tredak <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix

Signed-off-by: Przemek Tredak <[email protected]>

* Fixed pyTorch recipe extension

Signed-off-by: Przemek Tredak <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix

Signed-off-by: Przemek Tredak <[email protected]>

* Fix

Signed-off-by: Przemek Tredak <[email protected]>

* Hide TensorAllocator and fix the usage in LayerNorm

Signed-off-by: Przemek Tredak <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Cleaning

Signed-off-by: Przemek Tredak <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix

Signed-off-by: Przemek Tredak <[email protected]>

* Fix permutation

Signed-off-by: Przemek Tredak <[email protected]>

---------

Signed-off-by: Przemek Tredak <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Support SWA in CP Ring Attn THD striped sharding

Signed-off-by: Hua Huang <[email protected]>

* Add some comments; move check to _FusedAttnCPWithP2PHelper.check_supported()

Signed-off-by: Hua Huang <[email protected]>

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Remove unused check

Signed-off-by: Hua Huang <[email protected]>

---------

Signed-off-by: Hua Huang <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
* Quantizer update

Signed-off-by: Evgeny Tsykunov <[email protected]>

* Update import

Signed-off-by: Evgeny <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Introduce _update_weight_quantizers and _get_weight_tensors/_get_weight_quantizers

Signed-off-by: Evgeny <[email protected]>

* Add test

Signed-off-by: Evgeny <[email protected]>

* Move _quantizer to the QuantizedTensorBase

Signed-off-by: Evgeny <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix import

Signed-off-by: Evgeny Tsykunov <[email protected]>

---------

Signed-off-by: Evgeny Tsykunov <[email protected]>
Signed-off-by: Evgeny <[email protected]>
Co-authored-by: Evgeny Tsykunov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Przemyslaw Tredak <[email protected]>
…talled (#1834)

* Add warning for multi framework case

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Co-authored-by: Alp Dener <[email protected]>

* fix

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

---------

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Co-authored-by: Alp Dener <[email protected]>
@alextmagro (Contributor)

LGTM -- only covered common dir and cpp tests.

@Micky774 (Contributor) left a comment

I mainly focused on the attention sections and the install/build. Overall looks good; just some minor nits/questions.

config.window_size = [2, 2]
config.window_size = check_set_window_size(config.attn_mask_type, config.window_size)

is_training = True
Contributor:

Why do we not use is_training = config.head_dim_qk <= 192 and config.head_dim_v <= 128, as at line 400 later in this function, for determining the available backend? Won't this potentially cause issues if the later is_training is False, since we could have had a certain backend enabled at this step but didn't because we assumed is_training=True? Or is that not a problem?
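
For reference, a minimal sketch of the alternative being suggested (illustrative only; config is the test configuration object from the snippet above):

# Illustrative only: derive is_training from the head dimensions, mirroring the
# expression used later in this function (around line 400), instead of assuming
# is_training = True when determining the available backends.
is_training = config.head_dim_qk <= 192 and config.head_dim_v <= 128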

Collaborator Author:

Missed this conflict during the git merge. Fixed. Thanks

#endif
// ROCm fused attn has two backends: aotriton and ck
// They both have the same shape and stride for softmax and rng aux tensors
// CK now support bias features
Contributor:

Indent to keep aligned

Collaborator Author:

Done. Thanks

pyproject.toml Outdated
# See LICENSE for license information.

[build-system]
requires = ["setuptools>=61.0", "cmake>=3.21", "wheel", "pybind11[global]", "ninja", "pip", "torch>=2.1", "jax[cuda12]", "flax>=0.7.1"]
Contributor:

To clarify, this means that when building TE one must have both JAX and PyTorch installed in order to build even just for a single framework, right?

Collaborator Author:

I'm not very familiar with this new pyproject.toml. Based on my experience with pip install --no-build-isolation, the source build does not interact with pyproject.toml, as far as I understand.

Please correct me if I'm wrong.

Contributor:

For lines 369-371

# Deallocate GEMM input tensor if no longer needed
if not weight.requires_grad and not return_layernorm_output:
    ln_out = ln_out_total = None
    clear_tensor_data(ln_out, ln_out_total)

we shouldn't set ln_out and ln_out_total to None first; we should clear the tensor data first and then set them to None.

Also, my commit ([Feat] Add transpose cache to LayerNorm kernel (#279)) had instead:

if not weight.requires_grad:
    if not return_layernorm_output:
        clear_tensor_data(ln_out, ln_out_total)
    ln_out = None
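
Putting the two points together, a minimal sketch of what the fixed ordering could look like (illustrative only; clear_tensor_data, ln_out, ln_out_total, weight, and return_layernorm_output are as in the snippets above):

# Deallocate the GEMM input tensors only when nothing downstream still needs them.
if not weight.requires_grad and not return_layernorm_output:
    # Release the underlying storage first, while the references are still valid...
    clear_tensor_data(ln_out, ln_out_total)
    # ...and only then drop the Python references.
    ln_out = ln_out_total = None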

Collaborator Author:

Done. Thanks

ci/pytorch.sh Outdated
#_WORKERS_COUNT=$TEST_WORKERS
mkdir -p ${TEST_DIR}/checkpoint
python ${TEST_DIR}/test_checkpoint.py --save-checkpoint all --checkpoint-dir ${TEST_DIR}/checkpoint
NVTE_TEST_CHECKPOINT_ARTIFACT_PATH=${TEST_DIR}/checkpoint run 1 test_checkpoint.py
Collaborator:

I think test_checkpoint does not involve calling different fused_attn backends, so this whole addition should be under if [ $_fus_attn = "$_DEFAULT_FUSED_ATTN" ].

Collaborator Author:

In fact, I did find that this test involves attention:

return te.TransformerLayer(1, 1, 1)

Collaborator:

It indeed creates a TransformerLayer, but it neither calls fwd nor bwd, so no FA backend is called. It tests saving/loading of the state of torch.nn.Module-derived classes. In fact, even flash vs. fused attention makes no difference in the state.
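
For illustration, a minimal sketch of what the checkpoint test exercises (the toy layer matches the snippet above; this is a sketch, not the actual test code):

# Illustrative sketch only: saving and restoring module state never runs a forward
# or backward pass, so the fused-attention backend selection does not matter here.
import transformer_engine.pytorch as te

layer = te.TransformerLayer(1, 1, 1)   # same toy layer as in the test
state = layer.state_dict()             # captures parameters plus TE extra_state
layer.load_state_dict(state)           # restores state without touching attention kernels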

Collaborator Author:

I see. Done. Thanks
