Add human readable size for No. bytes stored to info_complete #3190

Merged
merged 5 commits into from
Jul 7, 2025

1 change: 1 addition & 0 deletions changes/3190.bugfix.rst
@@ -0,0 +1 @@
+Add human readable size for No. bytes stored to `info_complete`
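
For readers unfamiliar with the format, here is a minimal sketch of how output such as `3558573 (3.4M)` could be produced. `byte_info` is the name called in `src/zarr/core/_info.py`, but the body shown here and the `human_readable_size` helper are assumptions for illustration, not the library's actual code. The sketch assumes binary (1024-based) units, one decimal place, and no suffix below 1024 bytes, which is consistent with the values in this diff (`826`, `1614 (1.6K)`, `3558573 (3.4M)`).

```python
def human_readable_size(size: int) -> str:
    """Format a byte count with binary (1024-based) units and one decimal place.

    Hypothetical helper; the name and the unit suffixes are assumptions.
    """
    if size < 2**10:
        return str(size)
    for exponent, suffix in ((20, "K"), (30, "M"), (40, "G"), (50, "T")):
        if size < 2**exponent:
            # Divide by the unit just below the threshold (e.g. 2**10 for "K").
            return f"{size / 2 ** (exponent - 10):.1f}{suffix}"
    return f"{size / 2**50:.1f}P"


def byte_info(size: int) -> str:
    """Return the raw count, plus a readable form once it reaches 1024 bytes."""
    if size < 2**10:
        return str(size)
    return f"{size} ({human_readable_size(size)})"


print(byte_info(3558573))
print(byte_info(826))
```

Under these assumptions, counts below 1024 bytes are printed without a parenthesised suffix, which would explain why the `826` line in the docs diff has none.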
9 changes: 6 additions & 3 deletions docs/user-guide/arrays.rst
@@ -212,7 +212,7 @@ prints additional diagnostics, e.g.::
Serializer : BytesCodec(endian=<Endian.little: 'little'>)
Compressors : (BloscCodec(typesize=4, cname=<BloscCname.zstd: 'zstd'>, clevel=3, shuffle=<BloscShuffle.bitshuffle: 'bitshuffle'>, blocksize=0),)
No. bytes : 400000000 (381.5M)
-No. bytes stored : 3558573
+No. bytes stored : 3558573 (3.4M)
Storage ratio : 112.4
Chunks Initialized : 100

@@ -286,7 +286,7 @@ Here is an example using a delta filter with the Blosc compressor::
>>> compressors = zarr.codecs.BloscCodec(cname='zstd', clevel=1, shuffle=zarr.codecs.BloscShuffle.shuffle)
>>> data = np.arange(100000000, dtype='int32').reshape(10000, 10000)
>>> z = zarr.create_array(store='data/example-9.zarr', shape=data.shape, dtype=data.dtype, chunks=(1000, 1000), filters=filters, compressors=compressors)
->>> z.info
+>>> z.info_complete()
Contributor Author

This was the only real change to docs. Seems like if you are explaining about compression it's useful to show this extra information.

Type : Array
Zarr format : 3
Data type : Int32(endianness='little')
@@ -300,6 +300,9 @@ Here is an example using a delta filter with the Blosc compressor::
Serializer : BytesCodec(endian=<Endian.little: 'little'>)
Compressors : (BloscCodec(typesize=4, cname=<BloscCname.zstd: 'zstd'>, clevel=1, shuffle=<BloscShuffle.shuffle: 'shuffle'>, blocksize=0),)
No. bytes : 400000000 (381.5M)
+No. bytes stored : 826
+Storage ratio : 484261.5
+Chunks Initialized : 0

For more information about available filter codecs, see the `Numcodecs
<https://numcodecs.readthedocs.io/>`_ documentation.
@@ -616,7 +619,7 @@ Sharded arrays can be created by providing the ``shards`` parameter to :func:`za
Serializer : BytesCodec(endian=None)
Compressors : (ZstdCodec(level=0, checksum=False),)
No. bytes : 100000000 (95.4M)
-No. bytes stored : 3981473
+No. bytes stored : 3981473 (3.8M)
Storage ratio : 25.1
Shards Initialized : 100

2 changes: 1 addition & 1 deletion docs/user-guide/groups.rst
@@ -139,7 +139,7 @@ property. E.g.::
Serializer : BytesCodec(endian=<Endian.little: 'little'>)
Compressors : (ZstdCodec(level=0, checksum=False),)
No. bytes : 8000000 (7.6M)
-No. bytes stored : 1614
+No. bytes stored : 1614 (1.6K)
Storage ratio : 4956.6
Chunks Initialized : 10
>>> baz.info
4 changes: 2 additions & 2 deletions docs/user-guide/performance.rst
@@ -133,7 +133,7 @@ ratios, depending on the correlation structure within the data. E.g.::
Serializer : BytesCodec(endian=<Endian.little: 'little'>)
Compressors : (ZstdCodec(level=0, checksum=False),)
No. bytes : 400000000 (381.5M)
-No. bytes stored : 342588911
+No. bytes stored : 342588911 (326.7M)
Storage ratio : 1.2
Chunks Initialized : 100
>>> with zarr.config.set({'array.order': 'F'}):
@@ -153,7 +153,7 @@ ratios, depending on the correlation structure within the data. E.g.::
Serializer : BytesCodec(endian=<Endian.little: 'little'>)
Compressors : (ZstdCodec(level=0, checksum=False),)
No. bytes : 400000000 (381.5M)
-No. bytes stored : 342588911
+No. bytes stored : 342588911 (326.7M)
Storage ratio : 1.2
Chunks Initialized : 100

2 changes: 1 addition & 1 deletion src/zarr/core/_info.py
@@ -133,7 +133,7 @@ def __repr__(self) -> str:

if self._count_bytes_stored is not None:
template += "\nNo. bytes stored : {_count_bytes_stored}"
-kwargs["_count_stored"] = byte_info(self._count_bytes_stored)
+kwargs["_count_bytes_stored"] = byte_info(self._count_bytes_stored)
Contributor

we got bitten by kwargs here :(
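
A minimal sketch of why the misspelled key went unnoticed: `str.format` silently ignores keyword arguments that the template never references. The `template` and `kwargs` below are simplified stand-ins, and the sketch assumes (this is not shown in the diff) that `kwargs` already held the raw integer under `_count_bytes_stored` before the formatted string was assigned.

```python
# Simplified stand-ins for the template and kwargs used in __repr__.
template = "No. bytes stored : {_count_bytes_stored}"

# Assumption: the raw byte count is already present under the correct key.
kwargs = {"_count_bytes_stored": 3558573}

# Misspelled key: str.format ignores unused keyword arguments, so no error
# is raised and the raw integer is rendered instead of the formatted string.
kwargs["_count_stored"] = "3558573 (3.4M)"
before_fix = template.format(**kwargs)

# Correct key: the formatted string overwrites the raw value.
kwargs["_count_bytes_stored"] = "3558573 (3.4M)"
after_fix = template.format(**kwargs)

print(before_fix)
print(after_fix)
```

Under that assumption, the first result carries only the raw number and the second carries the human-readable suffix, matching the before and after lines in the docs changes.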

Contributor

for posterity, if we really want to use kwargs (and I think we should not), we should define some typeddicts to model what goes in kwargs. something for a later pr.
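
One possible shape for the TypedDict idea, as a rough sketch only. `ByteCountFields` and `render` are hypothetical names, and the two fields are simply the ones visible in this diff; the real set of keys in `kwargs` is not shown here.

```python
from typing import TypedDict


class ByteCountFields(TypedDict, total=False):
    """Hypothetical model of the byte-count entries passed to the template."""

    _count_bytes: str
    _count_bytes_stored: str


def render(fields: ByteCountFields) -> str:
    """Fill a simplified one-line template from the typed fields."""
    template = "No. bytes stored : {_count_bytes_stored}"
    return template.format(**fields)


fields: ByteCountFields = {"_count_bytes_stored": "3558573 (3.4M)"}
# fields["_count_stored"] = "..."  # a static type checker would flag this unknown key
print(render(fields))
```

The check happens only in a static type checker such as mypy; at runtime a TypedDict is a plain `dict`, so this would complement the test rather than replace it.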


if (
self._count_bytes is not None
4 changes: 2 additions & 2 deletions tests/test_info.py
@@ -79,7 +79,7 @@ def test_array_info(zarr_format: ZarrFormat) -> None:


@pytest.mark.parametrize("zarr_format", ZARR_FORMATS)
-@pytest.mark.parametrize("bytes_things", [(1_000_000, "976.6K", 500_000, "500000", "2.0", 5)])
+@pytest.mark.parametrize("bytes_things", [(1_000_000, "976.6K", 500_000, "488.3K", "2.0", 5)])
Contributor

i love "bytes_things"

def test_array_info_complete(
zarr_format: ZarrFormat, bytes_things: tuple[int, str, int, str, str, int]
) -> None:
@@ -120,7 +120,7 @@ def test_array_info_complete(
Serializer : BytesCodec(endian=<Endian.little: 'little'>)
Compressors : ()
No. bytes : {count_bytes} ({count_bytes_formatted})
-No. bytes stored : {count_bytes_stored_formatted}
+No. bytes stored : {count_bytes_stored} ({count_bytes_stored_formatted})
Storage ratio : {storage_ratio_formatted}
Chunks Initialized : 5""")
