@@ -517,6 +517,9 @@ We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required
- Fix 3D tiled online softmax ([#162341](https://github.com/pytorch/pytorch/pull/162341))
- Fix unsafe collective reorder past wait in Inductor ([#157489](https://github.com/pytorch/pytorch/pull/157489))
- Fix `FallbackKernel` alias function to avoid incorrect aliasing for custom ops ([#163227](https://github.com/pytorch/pytorch/pull/163227))
+ - Fix silent correctness issue when backpropagating gradients for `FlexAttention` ([#163677](https://github.com/pytorch/pytorch/pull/163677))
+ - Fix `return_lse` warning message in `FlexAttention` ([#163578](https://github.com/pytorch/pytorch/pull/163578))
+ - Fix `FlexAttention` head broadcast ([#163426](https://github.com/pytorch/pytorch/pull/163426))

## Ahead-Of-Time Inductor (AOTI)
- Fix a bug from `load_constants` ([#161887](https://github.com/pytorch/pytorch/pull/161887))
@@ -554,6 +557,9 @@ We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required
- Fix lower opset version support in `dynamo=True` ([#161056](https://github.com/pytorch/pytorch/pull/161056))
- Fix `index_put_` usage ([#161263](https://github.com/pytorch/pytorch/pull/161263))

+ ## C++ Extensions
+ - Fix C++ extension warning for `TORCH_CUDA_ARCH_LIST` so that it is logged only in non-distributed runs or on rank 0 ([#162764](https://github.com/pytorch/pytorch/pull/162764))
+
## C++ Frontend
- Fix `torch.utils.cpp_extension` parser for clang version 20.1.7+libcxx ([#157666](https://github.com/pytorch/pytorch/pull/157666))
- Fix `MakeTensor::computeStorageSize()` calculation ([#158690](https://github.com/pytorch/pytorch/pull/158690))
@@ -591,6 +597,9 @@ We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required
- Fix empty input in posneg functions ([#161824](https://github.com/pytorch/pytorch/pull/161824))
- Migrate round unary op to Metal ([#161712](https://github.com/pytorch/pytorch/pull/161712))
- Type-promote tensor-iterator common dtype ([#160334](https://github.com/pytorch/pytorch/pull/160334))
+ - Fix regression in 2.8.0 for `scaled_dot_product_attention` using MPS ([#163598](https://github.com/pytorch/pytorch/pull/163598))
+ - Chunk `fillBuffer` into 4 GB slices to avoid regression on macOS 26 ([#164108](https://github.com/pytorch/pytorch/pull/164108))
+ - Fix latent bug that can result in segfault in C++ extensions ([#164093](https://github.com/pytorch/pytorch/pull/164093))

## ROCm
- Fix Inductor with cudagraph trees `hip:0` device error ([#161221](https://github.com/pytorch/pytorch/pull/161221))
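
The `fillBuffer` entry in the MPS hunk above describes splitting one large fill into slices of bounded size. The general idea can be sketched in plain Python as follows. This is only an illustration of the chunking pattern, not the code from the linked PR: the names `MAX_SLICE_BYTES`, `slice_ranges`, and `chunked_fill` are hypothetical, and the slice limit is assumed to be 4 GiB for the sake of the example.

```python
from typing import Iterator, Tuple

# Hypothetical slice limit, assumed here to be 4 GiB for illustration only.
MAX_SLICE_BYTES = 4 * 1024**3


def slice_ranges(total_bytes: int, max_slice: int = MAX_SLICE_BYTES) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs covering [0, total_bytes) in slices of at most max_slice bytes."""
    if total_bytes < 0 or max_slice <= 0:
        raise ValueError("total_bytes must be >= 0 and max_slice must be > 0")
    offset = 0
    while offset < total_bytes:
        # The last slice may be shorter than max_slice.
        length = min(max_slice, total_bytes - offset)
        yield offset, length
        offset += length


def chunked_fill(buf: bytearray, value: int, max_slice: int = MAX_SLICE_BYTES) -> None:
    """Fill buf with a single byte value, one bounded slice at a time."""
    for offset, length in slice_ranges(len(buf), max_slice):
        buf[offset:offset + length] = bytes([value]) * length
```

With a small `max_slice`, the behaviour is easy to check by hand: `list(slice_ranges(10, 4))` produces three slices of 4, 4, and 2 bytes, which together cover the whole 10-byte range without overlap.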