Open
Labels
bug (Something isn't working)
Description
Version
0.11.0
On which installation method(s) does this occur?
uv
Describe the issue
With a fresh install of version 0.11.0 using torch 2.10.0+cu128, the example at examples/16_cbottle_super_resolution.py fails with an out-of-memory (OOM) error. This was attempted on an A100 40GB GPU on Derecho at NSF NCAR. The same example works on an H100 80GB.
I am not sure whether the latest updates to PyTorch or E2S broke this. Can anyone else reproduce it? I've tried inserting torch.cuda.empty_cache() calls as well as setting PYTORCH_ALLOC_CONF=expandable_segments:True, plus some other PYTORCH_ALLOC_CONF options, to no avail on the A100s with the latest versions installed. I can try testing against 0.10.0 and torch 2.9.1+cu129 when I have some time.
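For anyone trying to reproduce this, here is a minimal sketch of how the allocator option and a memory report could be wired up around the example. The helper names `set_alloc_conf` and `report_gpu_memory` are hypothetical and not part of Earth2Studio. The sketch assumes the environment variable is set before torch is imported, which is the safest ordering, and it only uses standard `torch.cuda` query functions. It does not fix the OOM; it only makes the 40GB and 80GB runs easier to compare.

```python
import os


def set_alloc_conf(**options):
    """Build and export PYTORCH_ALLOC_CONF (hypothetical helper).

    Call this before torch is imported so the caching allocator
    sees the setting when it initializes.
    """
    value = ",".join(f"{key}:{val}" for key, val in options.items())
    os.environ["PYTORCH_ALLOC_CONF"] = value
    return value


def report_gpu_memory():
    """Return current CUDA memory figures in GiB, or None without a GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    gib = 2**30
    return {
        "free_gib": free_bytes / gib,
        "total_gib": total_bytes / gib,
        "allocated_gib": torch.cuda.memory_allocated() / gib,
        "reserved_gib": torch.cuda.memory_reserved() / gib,
        "peak_allocated_gib": torch.cuda.max_memory_allocated() / gib,
    }


# Same option as tried in this report.
conf = set_alloc_conf(expandable_segments=True)
print("PYTORCH_ALLOC_CONF =", conf)

# Call again after each stage of the example to see where memory peaks.
print(report_gpu_memory())
```

Calling `report_gpu_memory()` right before the step that fails, on both the A100 and the H100, would show whether peak usage sits just above 40 GiB or well beyond it.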
My uv environment is below for reference:
dhoward@deg0073:/glade/work/dhoward/E2S> uv pip list
Package Version
---------------------------- -----------------
absl-py 2.3.1
affine 2.4.0
aiobotocore 3.1.1
aiofiles 25.1.0
aiohappyeyeballs 2.6.1
aiohttp 3.13.3
aioitertools 0.13.0
aiosignal 1.4.0
anemoi-inference 0.4.9
anemoi-models 0.3.1
anemoi-transform 0.1.20
anemoi-utils 0.4.43
aniso8601 10.0.1
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.12.1
anytree 2.13.0
array-api-compat 1.13.0
asttokens 3.0.1
attrs 25.4.0
azure-core 1.38.0
azure-storage-blob 12.28.0
bokeh 3.8.2
boto3 1.42.30
botocore 1.42.30
bracex 2.6
cartopy 0.25.0
cattrs 25.3.0
cbottle 2025.5.1
cdsapi 0.7.7
certifi 2026.1.4
cffi 2.0.0
cfgrib 0.9.15.1
cftime 1.6.5
cfunits 3.3.7
charset-normalizer 3.4.4
chex 0.1.91
click 8.3.1
cligj 0.7.2
cloudpickle 3.1.2
colabtools 0.0.1
coloredlogs 15.0.1
comm 0.2.3
contourpy 1.3.3
cryptography 46.0.3
cucim-cu12 25.12.0
cuda-bindings 12.9.4
cuda-pathfinder 1.3.3
cuda-python 12.9.4
cuda-toolkit 12.8.1
cupy-cuda12x 13.6.0
cuvs-cu12 25.12.0
cycler 0.12.1
dacite 1.9.2
dask 2026.1.1
debugpy 1.8.19
decorator 5.2.1
deprecation 2.1.0
dinosaur-dycore 1.2.1
distributed 2026.1.1
dm-haiku 0.0.16
dm-tree 0.1.9
donfig 0.8.1.post1
earth2grid 2025.7.1+torch210
earth2studio 0.11.0rc0
earthkit-data 0.18.4
earthkit-geo 0.4.0
earthkit-meteo 0.5.0
earthkit-regrid 0.4.0
earthkit-utils 0.1.2
eccodes 2.44.0
eccodeslib 2.44.1.8
eckitlib 1.32.4.8
ecmwf-datastores-client 0.4.2
ecmwf-opendata 0.3.26
editorconfig 0.17.1
einops 0.8.1
entrypoints 0.4
etils 1.13.0
executing 2.2.1
fastrlock 0.8.3
fckitlib 0.14.1.8
filelock 3.20.3
findlibs 0.1.2
flash-attn 2.8.3
flatbuffers 25.12.19
flax 0.12.0
flexcache 0.3
flexparser 0.4
fme 2025.10.0
fonttools 4.61.1
frozenlist 1.8.0
fsspec 2026.1.0
gcsfs 2026.1.0
gitdb 4.0.12
gitpython 3.1.46
globus-sdk 3.65.0
google-api-core 2.29.0
google-auth 2.47.0
google-auth-oauthlib 1.2.4
google-cloud-core 2.5.0
google-cloud-storage 3.8.0
google-cloud-storage-control 1.9.0
google-crc32c 1.8.0
google-resumable-media 2.8.0
googleapis-common-protos 1.72.0
graphcast 0.2.0.dev0
grpc-google-iam-v1 0.14.3
grpcio 1.76.0
grpcio-status 1.76.0
h11 0.16.0
h5netcdf 1.8.0
h5py 3.14.0
hf-xet 1.2.0
httpcore 1.0.9
httpx 0.28.1
huggingface-hub 1.3.3
humanfriendly 10.0
humanize 4.15.0
hydra-core 1.3.2
idna 3.11
imageio 2.37.2
imageio-ffmpeg 0.6.0
importlib-metadata 8.7.1
importlib-resources 6.5.2
intake-esgf 2026.1.8
iprogress 0.4
ipykernel 7.1.0
ipython 9.9.0
ipython-pygments-lexers 1.1.1
ipywidgets 8.1.8
isodate 0.7.2
jax 0.7.1
jax-cuda12-pjrt 0.7.1
jax-cuda12-plugin 0.7.1
jaxlib 0.7.1
jedi 0.19.2
jinja2 3.1.6
jmespath 1.1.0
jmp 0.0.4
joblib 1.5.3
jraph 0.0.6.dev0
jsbeautifier 1.15.4
jsonschema 4.26.0
jsonschema-specifications 2025.9.1
jupyter-client 8.8.0
jupyter-core 5.9.1
jupyterlab-widgets 3.0.16
kiwisolver 1.4.9
lark 1.3.1
lazy-loader 0.4
libcuvs-cu12 25.12.0
libraft-cu12 25.12.0
librmm-cu12 25.12.0
lightning-utilities 0.15.2
llvmlite 0.46.0
locket 1.0.0
loguru 0.7.3
lru-dict 1.4.1
lz4 4.4.5
makani 0.2.0
markdown 3.10.1
markdown-it-py 4.0.0
markupsafe 3.0.3
matplotlib 3.10.8
matplotlib-inline 0.2.1
mdurl 0.1.2
microsoft-aurora 1.8.0
ml-dtypes 0.5.4
more-itertools 10.8.0
moviepy 2.2.1
mpmath 1.3.0
msgpack 1.1.2
multi-storage-client 0.40.0
multidict 6.7.0
multiurl 0.3.7
narwhals 2.15.0
nest-asyncio 1.6.0
netcdf4 1.7.2
networkx 3.6.1
ninja 1.13.0
numba 0.63.1
numcodecs 0.14.1
numpy 2.3.5
nvidia-cublas-cu12 12.8.4.1
nvidia-cuda-cupti-cu12 12.8.90
nvidia-cuda-nvcc-cu12 12.9.86
nvidia-cuda-nvrtc-cu12 12.8.93
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12 9.10.2.21
nvidia-cufft-cu12 11.3.3.83
nvidia-cufile-cu12 1.13.1.3
nvidia-curand-cu12 10.3.9.90
nvidia-cusolver-cu12 11.7.3.90
nvidia-cusparse-cu12 12.5.8.93
nvidia-cusparselt-cu12 0.7.1
nvidia-ml-py 13.590.48
nvidia-nccl-cu12 2.27.5
nvidia-nvimgcodec-cu12 0.6.1.37
nvidia-nvjitlink-cu12 12.8.93
nvidia-nvshmem-cu12 3.4.5
nvidia-nvtx-cu12 12.8.90
nvidia-physicsnemo 1.3.0
nvsmi 0.4.2
nvtx 0.2.14
oauthlib 3.3.1
omegaconf 2.3.0
onnx 1.20.1
onnxruntime-gpu 1.23.2
opentelemetry-api 1.39.1
opt-einsum 3.4.0
optax 0.2.6
orbax-checkpoint 0.11.32
packaging 26.0
pandas 3.0.0
parso 0.8.5
partd 1.4.2
pdbufr 0.14.1
pexpect 4.9.0
pillow 11.3.0
pint 0.25.2
planetary-computer 1.0.0
platformdirs 4.5.1
plotly 6.5.2
prettytable 3.17.0
proglog 0.1.12
prompt-toolkit 3.0.52
propcache 0.4.1
proto-plus 1.27.0
protobuf 6.33.4
psutil 7.2.1
ptyprocess 0.7.0
pure-eval 0.2.3
pyarrow 23.0.0
pyasn1 0.6.2
pyasn1-modules 0.4.2
pycparser 3.0
pydantic 2.12.5
pydantic-core 2.41.5
pygments 2.19.2
pyjwt 2.10.1
pylibraft-cu12 25.12.0
pynvml 13.0.1
pyparsing 3.3.2
pyproj 3.7.2
pyshp 3.0.3
pystac 1.14.3
pystac-client 0.9.0
python-dateutil 2.9.0.post0
python-dotenv 1.2.1
pytz 2025.2
pyyaml 6.0.3
pyzmq 27.1.0
rapids-logger 0.2.3
rasterio 1.5.0
rdkit 2025.9.3
referencing 0.37.0
requests 2.32.5
requests-cache 1.2.1
requests-oauthlib 2.0.0
rich 14.2.0
rioxarray 0.20.0
rmm-cu12 25.12.0
rpds-py 0.30.0
rsa 4.9.1
rtree 1.4.1
ruamel-yaml 0.19.1
s3fs 2026.1.0
s3transfer 0.16.0
safetensors 0.7.0
scikit-image 0.25.2
scikit-learn 1.8.0
scipy 1.17.0
semantic-version 2.10.0
sentry-sdk 2.50.0
setuptools 80.10.1
shapely 2.1.2
shellingham 1.5.4
simplejson 3.20.2
six 1.17.0
smmap 5.0.2
sortedcontainers 2.4.0
soundfile 0.13.1
stack-data 0.6.3
sympy 1.14.0
tabulate 0.9.0
tblib 3.2.2
tensorboard 2.20.0
tensorboard-data-server 0.7.2
tensorly 0.9.0
tensorly-torch 0.5.0
tensorstore 0.1.80
termcolor 3.3.0
threadpoolctl 3.6.0
tifffile 2026.1.14
timm 1.0.24
toolz 1.1.0
torch 2.10.0+cu128
torch-geometric 2.4.0
torch-harmonics 0.8.0
torchmetrics 1.8.2
torchvision 0.25.0+cu128
tornado 6.5.4
tqdm 4.67.1
traitlets 5.14.3
tree-math 0.2.1
treelib 1.8.0
treescope 0.1.10
trimesh 4.11.1
triton 3.6.0
typer-slim 0.21.1
typing-extensions 4.15.0
typing-inspection 0.4.2
tzdata 2025.3
url-normalize 2.2.1
urllib3 2.6.3
wandb 0.24.0
wcmatch 10.1
wcwidth 0.3.1
werkzeug 3.1.5
widgetsnbextension 4.0.15
wrapt 2.0.1
xarray 2025.12.0
xarray-tensorstore 0.3.0
xattr 1.3.0
xyzservices 2025.11.0
yarl 1.22.0
zarr 3.1.5
zict 3.0.0
zipp 3.23.0