@christian-wojek
Contributor

@christian-wojek christian-wojek commented Oct 31, 2025

A malformed file with unfiltered chunks can record an on-disk chunk size that is inconsistent with the chunk size defined by the layout. As a result, too little heap memory may be allocated when the chunk is read from disk, ultimately causing an overflow in H5VM_memcpyvv.

This PR adds a check to ensure consistency and fixes CVE-2025-44904.
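
For illustration only, the consistency rule described here can be sketched as a small standalone predicate. This is not the patch itself: `chunk_sizes_consistent` and its parameters are hypothetical names, and the real check lives inside H5D__chunk_lock() and reports failure through HGOTO_ERROR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of HDF5. Returns true when the size recorded
 * on disk for a chunk is compatible with the size the layout expects in
 * memory. */
static bool chunk_sizes_consistent(uint64_t disk_size, uint64_t layout_size,
                                   bool has_filters)
{
    /* Filtered (e.g. compressed) chunks may legitimately differ in size on
     * disk, so this sketch only constrains unfiltered chunks. */
    if (has_filters)
        return true;

    /* An unfiltered chunk must not be smaller on disk than in memory;
     * otherwise a later copy could run past the allocated buffer. */
    return disk_size >= layout_size;
}
```

A caller would treat a `false` result as a read error rather than continuing with the undersized buffer.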

Closes #5892


Important

Fixes CVE-2025-44904 by adding a consistency check for chunk sizes in H5D__chunk_lock() in H5Dchunk.c.

  • Security Fix:
    • Adds a check in H5D__chunk_lock() in H5Dchunk.c to ensure unfiltered data chunks are not smaller on disk than in memory.
    • Fixes CVE-2025-44904 by preventing potential memory overflow in H5VM_memcpyvv due to inconsistent chunk sizes.
  • Error Handling:
    • Triggers HGOTO_ERROR with H5E_IO and H5E_READERROR if chunk size inconsistency is detected.

This description was created by Ellipsis for 153d7b1.

@christian-wojek christian-wojek marked this pull request as ready for review October 31, 2025 10:05
mattjala
mattjala previously approved these changes Oct 31, 2025
@jhendersonHDF
Collaborator

Hi @christian-wojek, @glennsong09 is currently looking into this issue, and we expect a proper fix to be a bit more involved: there are several other places where the buffer that is eventually passed to H5VM_memcpyvv could be allocated without the library retaining information about how big that buffer is.

fortnern and others added 10 commits November 8, 2025 14:46
Change default file format to 1.8 across various tests and examples, updating file creation and access logic accordingly.

Behavior:
Default file format version changed to 1.8 in H5Pfapl.c.
Updated file creation and access to use 1.8 format in h5ex_g_compact.c and test_file_image.c.
Set earliest file format in multiple test files including cache_tagging.c, dtypes.c, and links.c.
Tests:
Modified expected output in tools/test/misc/expected/*.ls files to reflect new file format locations.
Adjusted test logic in test_file_image.c and cache_tagging.c to accommodate format changes.
Misc:
Added comments and TODOs for future format testing in test_file_image.c.
Minor variable renaming for clarity in test_file_image.c.
)

* Change new chunk indexing methods to always encode chunk size as a 64-bit
(size of lengths) integer when using the 2.0 file format.

* Add CHANGELOG.md note

* Spelling

* Fix errors in parallel build

* Committing clang-format changes

* More parallel fixes.

* Committing clang-format changes

* Another parallel fix

* Fix parallel for real this time I hope

* Update function descriptions in dsets.c

* Fix spelling

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
* Combined INSTALL_parallel and README_HPC into a single README_HPC.md file.
* Removed all autotools references
* Updated All References
In order to reduce hash collisions and take advantage of modern memory capacity, the default hash table size for the chunk cache has been increased from 521 to 8191. This means the hash table will consume approximately 64 KiB per open dataset. This value can be changed with H5Pset_cache() or H5Pset_chunk_cache(). This value was chosen because it is a prime number close to 8K.
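
As a rough sanity check of the figures above, the following sketch verifies that both slot counts are prime and estimates the table footprint. The 8-byte slot size is an assumption (one pointer per slot on a 64-bit platform), and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Trial-division primality test; adequate for small table sizes. */
static bool is_prime(unsigned n)
{
    if (n < 2)
        return false;
    for (unsigned d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}

/* Approximate hash table footprint in bytes.
 * slot_size is an assumption: 8 bytes for one pointer per slot. */
static size_t hash_table_bytes(size_t nslots, size_t slot_size)
{
    return nslots * slot_size;
}
```

Under that assumption, 8191 slots occupy 65528 bytes, just under 64 KiB (65536 bytes).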
Also includes a new workflow to test FreeBSD, which has a different qsort signature and was previously untested
Resolves HDFGroup#5896
Bumps the github-actions group with 8 updates:

| Package | From | To |
| --- | --- | --- |
| [actions/checkout](https://github.com/actions/checkout) | `4.1.7` | `5.0.0` |
| [actions/download-artifact](https://github.com/actions/download-artifact) | `5.0.0` | `6.0.0` |
| [actions/upload-artifact](https://github.com/actions/upload-artifact) | `4` | `5` |
| [KyleMayes/install-llvm-action](https://github.com/kylemayes/install-llvm-action) | `2.0.7` | `2.0.8` |
| [azure/trusted-signing-action](https://github.com/azure/trusted-signing-action) | `0.5.9` | `0.5.10` |
| [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) | `2.6.1` | `2.7.0` |
| [softprops/action-gh-release](https://github.com/softprops/action-gh-release) | `2.3.3` | `2.4.1` |
| [github/codeql-action](https://github.com/github/codeql-action) | `3.30.5` | `4.31.2` |
An image size was corrupted and decoded as 0, resulting in a NULL image buffer,
which caused a NULL pointer dereference when the image was being copied to the buffer.
The invalid image size was caught in PR HDFGroup#5710. This change catches it right
before the copy.
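
A minimal sketch of this kind of guard, assuming a generic copy helper; `copy_image_checked` is a hypothetical name and does not correspond to the function changed in this commit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: refuse to copy when either buffer is NULL or the
 * decoded size is 0, instead of dereferencing a NULL destination.
 * Returns 0 on success and -1 when the copy is rejected. */
static int copy_image_checked(void *dst, const void *src, size_t len)
{
    if (dst == NULL || src == NULL || len == 0)
        return -1;

    memcpy(dst, src, len);
    return 0;
}
```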

Fixes GH issue HDFGroup#5384
Added \since release version to constants in
H5VLnative.h
H5Fpublic.h

Fixes HDFGroup#4408 (part 5)
Adds predefined datatypes for FP8 data in E4M3 and E5M2 formats

Does not add support for any native FP8 types; datatype conversions are performed in software
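
To illustrate what a software conversion involves, here is a sketch that decodes the two bit layouts to `float`. It assumes the OCP 8-bit floating point conventions (E5M2 with IEEE-style infinities and NaNs, bias 15; E4M3 with bias 7, no infinities, and a single NaN pattern); the commit message does not state which special-value rules the library follows, and these function names are hypothetical rather than HDF5 API.

```c
#include <math.h>   /* INFINITY and NAN macros */
#include <stdint.h>

/* Multiply v by 2^e using exact repeated scaling. */
static float scale_pow2(float v, int e)
{
    while (e > 0) { v *= 2.0f; e--; }
    while (e < 0) { v *= 0.5f; e++; }
    return v;
}

/* E5M2: 1 sign bit, 5 exponent bits (bias 15), 2 mantissa bits. */
static float fp8_e5m2_to_float(uint8_t b)
{
    int   sign = b >> 7;
    int   exp  = (b >> 2) & 0x1F;
    int   man  = b & 0x03;
    float v;

    if (exp == 0x1F)
        v = (man == 0) ? INFINITY : NAN;                  /* IEEE-style specials */
    else if (exp == 0)
        v = scale_pow2((float)man / 4.0f, 1 - 15);        /* zero and subnormals */
    else
        v = scale_pow2(1.0f + (float)man / 4.0f, exp - 15);

    return sign ? -v : v;
}

/* E4M3: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits.
 * Assumed OCP variant: no infinities, NaN only when all lower bits are set. */
static float fp8_e4m3_to_float(uint8_t b)
{
    int   sign = b >> 7;
    int   exp  = (b >> 3) & 0x0F;
    int   man  = b & 0x07;
    float v;

    if (exp == 0x0F && man == 0x07)
        v = NAN;
    else if (exp == 0)
        v = scale_pow2((float)man / 8.0f, 1 - 7);         /* zero and subnormals */
    else
        v = scale_pow2(1.0f + (float)man / 8.0f, exp - 7);

    return sign ? -v : v;
}
```

Under these assumptions the largest finite values are 57344 for E5M2 (bit pattern 0x7B) and 448 for E4M3 (bit pattern 0x7E).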
lrknox and others added 10 commits November 8, 2025 14:46
* Add instruction to verify library version compliance with semantic
versioning with link to semantic versioning wiki page.
Update H5.c and version tests for move of major and minor versions to
1st and 2nd version numbers.

* WILL_FAIL for the tcheck_version doesn't need changing for release
branches - removed that instruction from RELEASE_PROCESS.
Change release version instructions to use x.y.z.1 for pre-release
instead of x.y.z-1 as for develop snapshots.
…Group#5965)

Various related changes, including refactoring part of dataset creation, and reworking how layout versions are calculated.

Needs more testing of filters that create very large chunks, but that will need code to turn it off in cases where we can't allocate >4GiB buffers. I tested this manually by increasing the expansion ratio in the expand2 test in dsets, and it passed everything up until it tried to expand the datasets with H5Dset_extent() and my laptop ran out of disk space.

We should add code and testing to handle the case where the "size of size" is set less than 8 bytes. This is not a new issue, since it can be set to 2.

I will file issues for these, but I don't think they are necessary for the release.
…DFGroup#5924)

When two entries in the cache image have the same address, the library
did not fail and later crashed.

The cache reconstruction functions now detect duplicate addresses.  Previously,
when a failure occurred during the reconstruction, the cache was not cleaned up
properly.  H5C__reconstruct_cache_contents now expunges any prefetched entries
that were already added to the cache during the reconstruction.
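
One possible way to detect duplicate addresses, shown purely as a sketch: sort a copy of the addresses and compare neighbours. The commit message does not say how the library performs the check, so the approach and the name `has_duplicate_addr` are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Three-way comparison of two 64-bit addresses for qsort(). */
static int cmp_addr(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: returns 1 if any address occurs more than once,
 * 0 if all addresses are distinct, and -1 on allocation failure. */
static int has_duplicate_addr(const uint64_t *addrs, size_t n)
{
    uint64_t *sorted;
    int       dup = 0;

    if (n < 2)
        return 0;

    sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL)
        return -1;

    /* Work on a copy so the caller's ordering is preserved. */
    memcpy(sorted, addrs, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_addr);

    /* After sorting, equal addresses are adjacent. */
    for (size_t i = 1; i < n; i++) {
        if (sorted[i] == sorted[i - 1]) {
            dup = 1;
            break;
        }
    }

    free(sorted);
    return dup;
}
```

On a positive result, a caller would abort the reconstruction and remove any entries it had already inserted.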
FFM build requires Java 25, Jextract 25.
Generates FFM bindings during configure.
JNI is default when the requirements are not met or can be forced.
Presets added for maven and FFM - JNI is default selection.
Enhanced Maven options will work with either JNI or FFM
New Workflows for testing and maven uploads.
Extensive documentation changes for java.
@nbagha1 nbagha1 assigned fortnern and unassigned bmribler Nov 18, 2025
@nbagha1 nbagha1 moved this from To be triaged to In progress in HDF5 - TRIAGE & TRACK Nov 22, 2025
Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

a heap-based buffer overflow vulnerability in the H5VM_memcpyvv