
Commit 93194b6

updates
1 parent ea086e8 commit 93194b6


2.9.0/final.md

Lines changed: 48 additions & 67 deletions
@@ -24,32 +24,23 @@ Below are the full release notes for this release.

The minimum version of Python required for PyTorch 2.9.0 is 3.10.

## Build metal kernels for MacOS-14+ and remove all pre-MacOS-14 specific logic, requires MacOS-14+ going forward ([#159733](https://github.com/pytorch/pytorch/pull/159733), [#159912](https://github.com/pytorch/pytorch/pull/159912))

PyTorch MPS is only supported on MacOS-14 or later. If you need to use MPS on MacOS Ventura, please avoid updating to PyTorch 2.9 or above.
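
A quick availability check before relying on MPS (a minimal sketch; the CPU fallback is illustrative):

```python
import torch

# MPS requires MacOS-14+ as of PyTorch 2.9; fall back to CPU elsewhere.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
x = torch.ones(3, device=device)
print(x.device)
```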

## Upgrade to DLPack 1.0 ([#145000](https://github.com/pytorch/pytorch/pull/145000))

This upgrade makes the same BC-breaking changes as the DLPack 1.0 release.
Objects in `torch.utils.dlpack` have been updated to reflect these changes, such as `DLDeviceType`.
See the PR for details on the exact changes and how to update your code.
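
Tensors still round-trip through the DLPack protocol as before; a minimal sanity check, assuming CPU tensors:

```python
import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

t = torch.arange(4)
# Round-trip through a DLPack capsule; both tensors share the same memory.
t2 = from_dlpack(to_dlpack(t))
assert t2.data_ptr() == t.data_ptr()
```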

## Raise appropriate errors in `torch.cat` ([#158249](https://github.com/pytorch/pytorch/pull/158249))

`torch.cat` now raises `ValueError`, `IndexError`, or `TypeError` where appropriate instead of the generic `RuntimeError`.
If your code was catching these errors, update it to catch the new error types.
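
For example, an out-of-range `dim` now surfaces as `IndexError` (a sketch; the exact type depends on the failure mode):

```python
import torch

try:
    torch.cat([torch.ones(2), torch.ones(2)], dim=5)  # dim is out of range
except IndexError as e:
    print(f"caught: {e}")
```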


## Default to `dynamo=True` for ONNX exporter ([#159646](https://github.com/pytorch/pytorch/pull/159646), [#162726](https://github.com/pytorch/pytorch/pull/162726))

Previously `torch.onnx.export(...)` used the legacy TorchScript exporter when the `dynamo` argument was not provided. The ONNX exporter now uses the newer `torch.export.export` pipeline by default (`dynamo=True`). This change improves graph fidelity and future-proofs exports, but may surface graph capture errors that were previously masked or handled differently.
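
A sketch of the new default and the escape hatches, using the documented `dynamo` and `fallback` arguments (the model is illustrative):

```python
import torch

model = torch.nn.Linear(4, 2)
args = (torch.randn(1, 4),)

# New default: the torch.export.export pipeline (dynamo=True).
torch.onnx.export(model, args, "model.onnx")

# Fall back to the TorchScript exporter automatically if capture fails...
torch.onnx.export(model, args, "model.onnx", dynamo=True, fallback=True)

# ...or pin the legacy TorchScript exporter explicitly.
torch.onnx.export(model, args, "model.onnx", dynamo=False)
```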

@@ -73,7 +64,15 @@ torch.onnx.export(...)
Recommendation: first try the new default; only fall back if you hit blocking issues and report them upstream.
Long term solution: fix the root cause instead of relying on the fallback or the TorchScript exporter.

## Switch off runtime asserts by default in favor of a shape guards function ([#160111](https://github.com/pytorch/pytorch/pull/160111), [#161178](https://github.com/pytorch/pytorch/pull/161178), [#161794](https://github.com/pytorch/pytorch/pull/161794))

To enable runtime asserts, use `export(..., prefer_deferred_runtime_asserts_over_guards=True)`. This also removes the `allow_complex_guards_as_runtime_asserts` flag, merging it into the former option.

Additionally, `exported_program.module()` will generate a call to a `_guards_fn` submodule that will run additional checks on inputs. Users who do not want this behavior can either remove this call from the graph or call `exported_program.module(check_guards=False)` to avoid generating it.
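
A minimal sketch of the new surface, using the options named above (module and inputs are illustrative):

```python
import torch
from torch.export import export

class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

ep = export(M(), (torch.randn(3),))

m = ep.module()                              # calls the generated _guards_fn on inputs
m_unchecked = ep.module(check_guards=False)  # skip guard generation

# Opt back into runtime asserts instead of shape guards:
ep2 = export(M(), (torch.randn(3),), prefer_deferred_runtime_asserts_over_guards=True)
```
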
## Set default opset to 20 ([#158802](https://github.com/pytorch/pytorch/pull/158802))

Opset 20 enables newer operator definitions. If your tooling or downstream runtime only supports opset 18, pin it explicitly. For the latest ONNX operators, you can experiment with opset 23.
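
For example, pinning the opset explicitly:

```python
import torch

model = torch.nn.Linear(4, 2)
args = (torch.randn(1, 4),)

torch.onnx.export(model, args, "model.onnx")                    # opset 20 by default
torch.onnx.export(model, args, "model.onnx", opset_version=18)  # pin for older tooling
```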

@@ -97,7 +96,7 @@ torch.onnx.export(...)
torch.onnx.export(..., opset_version=23)
```

## Drop `draft_export` in exporter API ([#161454](https://github.com/pytorch/pytorch/pull/161454), [#162225](https://github.com/pytorch/pytorch/pull/162225))

Remove implicit draft tracing from the default exporter path, achieving clearer behaviour and faster failures.
The expensive `torch.export.draft_export` diagnostic path is no longer auto-invoked (which could take hours on large models). You can still opt in for deep diagnostics:
@@ -125,45 +124,41 @@ Now in torch 2.9.0:
TORCH_ONNX_ENABLE_DRAFT_EXPORT=True python export_to_onnx.py
```

## Remove `torch.onnx.dynamo_export` and the `onnxrt` torch compile backend ([#158130](https://github.com/pytorch/pytorch/pull/158130), [#158258](https://github.com/pytorch/pytorch/pull/158258))

`torch.onnx.dynamo_export` is removed. Please use `torch.onnx.export` instead.
The experimental ONNX Runtime compile backend (`torch.compile(backend="onnxrt")`) is no longer supported.
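
A migration sketch (the model is illustrative; with `dynamo=True` the exporter returns an `ONNXProgram` that can be saved):

```python
import torch

model = torch.nn.Linear(4, 2)
args = (torch.randn(1, 4),)

# Before (removed): torch.onnx.dynamo_export(model, *args)
# After: the unified exporter.
onnx_program = torch.onnx.export(model, args, dynamo=True)
onnx_program.save("model.onnx")
```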

## Remove `torch.onnx.enable_fake_mode` ([#161222](https://github.com/pytorch/pytorch/pull/161222))

The `dynamo=True` mode uses `FakeTensor`s by default, which is memory efficient.

## Some public facing utility APIs for the TorchScript based exporter are now private ([#161323](https://github.com/pytorch/pytorch/pull/161323))

Deprecated members in `torch.onnx.verification` are removed. Previously private `torch.onnx.symbolic_opsets*` functions will no longer be accessible. Consider making a copy of the source code if you need to access any private functions for compatibility with the TorchScript based exporter.

## Remove `torch.onnx.symbolic_caffe2` ([#157102](https://github.com/pytorch/pytorch/pull/157102))

Support for `caffe2` in the ONNX exporter has ended and has been removed.

## Remove `/d2implyavx512upperregs` flag that slows build ([#159431](https://github.com/pytorch/pytorch/pull/159431))

Re-introduced AVX512 optimizations for Windows VS2022 builds; this may cause issues with specific versions of VS2022, see [#145702](https://github.com/pytorch/pytorch/issues/145702).

## Add `ScalarType` to shim conversion and `stable::Tensor.scalar_type` ([#160557](https://github.com/pytorch/pytorch/pull/160557))

Before, user extensions could only pass around obfuscated dtypes in the abstract, appearing as `int32_t`s. Now, users can confidently use `torch::headeronly::ScalarType` in their extensions for major scalar types. This PR enables ABI stability by adding a translation layer through the shim, so that even if the `ScalarType` enum values change in the future, user extensions need not fear breakage.

This change adds `ScalarType` support for user extensions and is only narrowly BC-breaking for unpopular dtypes (`quint*`s, `qint*`s, `Bits*`, `dummy_uint*`s, `dummy_int*`s, `Float8_e8m0fnu`, and `Float4_e2m1fn_x2`) in the use case where an extension retrieves a Tensor dtype of the above and passes it into `aoti_torch_call_dispatcher`.

# Deprecations

## Deprecate `pin_memory_device` param in `torch.utils.data.DataLoader` ([#158323](https://github.com/pytorch/pytorch/pull/158323))

We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required for `StatefulDataLoader`, which leveraged `BaseDataLoaderIter` directly rather than the `DataLoader` class init.
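
A sketch of the deprecated pattern and its replacement (the dataset is illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(8, 4))

# Deprecated: passing pin_memory_device explicitly.
dl_old = DataLoader(ds, batch_size=4, pin_memory=True, pin_memory_device="cuda")

# Preferred: pin_memory alone.
dl_new = DataLoader(ds, batch_size=4, pin_memory=True)
```
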
## Deprecate `torch.export.export_for_training` API in favor of equivalent `torch.export.export` API ([#158203](https://github.com/pytorch/pytorch/pull/158203))

`torch.export.export_for_training` exists because we couldn't migrate internal usages of export to the final IR. Now that we have completed the migration, we have deprecated and deleted this API.
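
A migration sketch (`M` is illustrative; the signatures match):

```python
import torch

class M(torch.nn.Module):
    def forward(self, x):
        return x * 2

# Before (deprecated):
# ep = torch.export.export_for_training(M(), (torch.randn(2),))

# After:
ep = torch.export.export(M(), (torch.randn(2),))
```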

# New Features
## AOTDispatcher
@@ -174,29 +169,12 @@ We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required
- Add `zero_()` and `empty_like(t)` to `torch/csrc/stable/ops.h` ([#158866](https://github.com/pytorch/pytorch/pull/158866))

## C++ Extensions
- Build out a stable set of ATen ops in `torch/csrc/stable/ops.h`: `amax`, `narrow`, `new_empty` and `new_zeros` dtype variants, and `pad` ([#159328](https://github.com/pytorch/pytorch/pull/159328), [#158974](https://github.com/pytorch/pytorch/pull/158974), [#159508](https://github.com/pytorch/pytorch/pull/159508), [#161597](https://github.com/pytorch/pytorch/pull/161597), [#160214](https://github.com/pytorch/pytorch/pull/160214))
- Add `torch::stable::Tensor()` default constructor, `is_cpu`, and `get_device_index` ([#159507](https://github.com/pytorch/pytorch/pull/159507), [#160212](https://github.com/pytorch/pytorch/pull/160212), [#160143](https://github.com/pytorch/pytorch/pull/160143))
- Add beginnings of `torch::stable::accelerator` with support for DeviceGuard and Stream ([#159679](https://github.com/pytorch/pytorch/pull/159679), [#160453](https://github.com/pytorch/pytorch/pull/160453))
- Start building out `torch/headeronly`: c10 Macros, STD_TORCH_CHECK, ScalarTypes (like BFloat16 and Half) ([#158035](https://github.com/pytorch/pytorch/pull/158035), [#158365](https://github.com/pytorch/pytorch/pull/158365), [#157912](https://github.com/pytorch/pytorch/pull/157912), [#158377](https://github.com/pytorch/pytorch/pull/158377), [#159302](https://github.com/pytorch/pytorch/pull/159302), [#159414](https://github.com/pytorch/pytorch/pull/159414), [#159412](https://github.com/pytorch/pytorch/pull/159412), [#159415](https://github.com/pytorch/pytorch/pull/159415), [#159411](https://github.com/pytorch/pytorch/pull/159411), [#159911](https://github.com/pytorch/pytorch/pull/159911))
- Remove cmake cache and reconfigure again if it is invalid ([#156958](https://github.com/pytorch/pytorch/pull/156958))
- Cut a version of `TORCH_ERROR_CODE_CHECK` in `headeronly` from AOTI ([#159604](https://github.com/pytorch/pytorch/pull/159604))
- Remove `wheel` from build requirements ([#158027](https://github.com/pytorch/pytorch/pull/158027))
- Error when `TORCH_STABLE_ONLY` is defined in `TensorBase.h` ([#161658](https://github.com/pytorch/pytorch/pull/161658))

@@ -248,8 +226,7 @@ We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required
- Add `torch.hash_tensor` reduction function ([#154149](https://github.com/pytorch/pytorch/pull/154149))

## Quantization
- Enable cpu fp8 qlinear and cpu fp8 qconv ([#155678](https://github.com/pytorch/pytorch/pull/155678), [#157076](https://github.com/pytorch/pytorch/pull/157076))

## Release Engineering
- Add support for CUDA 13.0 in CI/CD builds. Enable CUDA compression mode for binary size reduction for CUDA 13.0 builds ([#160956](https://github.com/pytorch/pytorch/pull/160956)) ([#161073](https://github.com/pytorch/pytorch/pull/161073)) ([#161257](https://github.com/pytorch/pytorch/pull/161257)) ([#161663](https://github.com/pytorch/pytorch/pull/161663)) ([#161316](https://github.com/pytorch/pytorch/pull/161316)) ([#160201](https://github.com/pytorch/pytorch/pull/160201)) ([#160770](https://github.com/pytorch/pytorch/pull/160770)) ([#161013](https://github.com/pytorch/pytorch/pull/161013)) ([#161916](https://github.com/pytorch/pytorch/pull/161916)) ([#162268](https://github.com/pytorch/pytorch/pull/162268)) ([#162322](https://github.com/pytorch/pytorch/pull/162322)) ([#162383](https://github.com/pytorch/pytorch/pull/162383)) ([#161833](https://github.com/pytorch/pytorch/pull/161833))
@@ -283,6 +260,8 @@ We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required
- Fix dev warning in `Dependencies.cmake` ([#159702](https://github.com/pytorch/pytorch/pull/159702))
- Fix building system gloo with CUDA/HIP ([#146637](https://github.com/pytorch/pytorch/pull/146637))
- Build `libtorch` without NVSHMEM ([#160910](https://github.com/pytorch/pytorch/pull/160910))
- Improve BLAS feature detection ([#143846](https://github.com/pytorch/pytorch/pull/143846))

## Composability
- Meta implementation for `aten.add.Scalar` ([#161332](https://github.com/pytorch/pytorch/pull/161332))
@@ -498,13 +477,15 @@ We move enabling `pin_memory` back inside `BaseDataLoaderIter`. This is required
- Implement workaround for `cudaErrorNotSupported` ([#162412](https://github.com/pytorch/pytorch/pull/162412))
- Fix missing `__syncthreads` in MultiMarginLoss backward ([#158994](https://github.com/pytorch/pytorch/pull/158994))
- Roll-back cuDNN frontend upgrade and update Meta registration due to compile issues ([#163104](https://github.com/pytorch/pytorch/pull/163104))
- Disable cuDNN for 3D convolutions with `kernel size != 1` for cuDNN 9.8+ ([#163581](https://github.com/pytorch/pytorch/pull/163581))

## Distributed
### c10d
- Fix slow init due to repeated dns resolution failure in socket ([#159596](https://github.com/pytorch/pytorch/pull/159596))
- Fix `setGroupName` and `setGroupDesc` in `group_split` and `merge_remote_group` ([#159429](https://github.com/pytorch/pytorch/pull/159429))
- Fix a bug of distributed 'gather' with noncontiguous tensors on the Gloo backend ([#158903](https://github.com/pytorch/pytorch/pull/158903))
- Fix a bug of distributed 'gather' with noncontiguous tensors on the NCCL backend ([#159549](https://github.com/pytorch/pytorch/pull/159549))
- Fix data inconsistencies when using `batch_isend_irecv` with 2D tensor views by making P2P tensors dense ([#163719](https://github.com/pytorch/pytorch/pull/163719))
### Device Mesh
- Fix a bug where each of the strings was incorrectly chained as an iterable ([#160709](https://github.com/pytorch/pytorch/pull/160709))
### DistributedDataParallel (DDP)
