Commit bb413af

docs: improve tarfile documentation
1 parent 16857eb commit bb413af

1 file changed

+49
-22
lines changed

docs/index.rst

Lines changed: 49 additions & 22 deletions
Original file line number | Diff line number | Diff line change
@@ -1470,42 +1470,69 @@ Use with tarfile module
14701470

14711471
Python's `tarfile <https://docs.python.org/3/library/tarfile.html>`_ module supports arbitrary compression algorithms by providing a file object.
14721472

1473-
This code encapsulates a ``ZstdTarFile`` class using :py:class:`ZstdFile`, it can be used like `tarfile.TarFile <https://docs.python.org/3/library/tarfile.html#tarfile.TarFile>`_ class:
1474-
14751473
.. sourcecode:: python
14761474

14771475
import tarfile
14781476

1479-
# when using read mode (decompression), the level_or_option parameter
1480-
# can only be a dict object, that represents decompression option. It
1481-
# doesn't support int type compression level in this case.
1477+
# compression
1478+
with ZstdFile('archive.tar.zst', mode='w') as _fileobj, tarfile.open(fileobj=_fileobj, mode='w') as tar:
1479+
# do something
1480+
1481+
# decompression
1482+
with ZstdFile('archive.tar.zst', mode='r') as _fileobj, tarfile.open(fileobj=_fileobj) as tar:
1483+
# do something
1484+
1485+
Alternatively, it is possible to extend the ``TarFile`` class so that it decompresses ``.tar.zst`` files automatically and supports the following additional modes: ``r:zst``, ``w:zst`` and ``x:zst``.
1486+
1487+
.. sourcecode:: python
1488+
1489+
from tarfile import TarFile, CompressionError, ReadError
1490+
1491+
class CustomTarFile(TarFile):
1492+
1493+
OPEN_METH = {
1494+
**TarFile.OPEN_METH,
1495+
'zst': 'zstopen'
1496+
}
1497+
1498+
@classmethod
1499+
def zstopen(cls, name, mode='r', fileobj=None, level_or_option=None, zstd_dict=None, **kwargs):
1500+
"""Open zstd compressed tar archive name for reading or writing.
1501+
Appending is not allowed.
1502+
"""
1503+
if mode not in ('r', 'w', 'x'):
1504+
raise ValueError("mode must be 'r', 'w' or 'x'")
1505+
1506+
try:
1507+
from pyzstd import ZstdFile, ZstdError
1508+
except ImportError:
1509+
raise CompressionError("pyzstd module is not available") from None
1510+
1511+
fileobj = ZstdFile(fileobj or name, mode, level_or_option=level_or_option, zstd_dict=zstd_dict)
14821512

1483-
class ZstdTarFile(tarfile.TarFile):
1484-
def __init__(self, name, mode='r', *, level_or_option=None, zstd_dict=None, **kwargs):
1485-
self.zstd_file = ZstdFile(name, mode,
1486-
level_or_option=level_or_option,
1487-
zstd_dict=zstd_dict)
14881513
try:
1489-
super().__init__(fileobj=self.zstd_file, mode=mode, **kwargs)
1514+
tar = cls.taropen(name, mode, fileobj, **kwargs)
1515+
except (ZstdError, EOFError) as exception:
1516+
fileobj.close()
1517+
if mode == 'r':
1518+
raise ReadError('not a zstd file') from exception
1519+
raise
14901520
except:
1491-
self.zstd_file.close()
1521+
fileobj.close()
14921522
raise
14931523
1494-
def close(self):
1495-
try:
1496-
super().close()
1497-
finally:
1498-
self.zstd_file.close()
1524+
tar._extfileobj = False
1525+
return tar
14991526

1500-
# write .tar.zst file (compression)
1501-
with ZstdTarFile('archive.tar.zst', mode='w', level_or_option=5) as tar:
1527+
# compression
1528+
with CustomTarFile.open('archive.tar.zst', mode='w:zst') as tar:
15021529
# do something
15031530

1504-
# read .tar.zst file (decompression)
1505-
with ZstdTarFile('archive.tar.zst', mode='r') as tar:
1531+
# decompression
1532+
with CustomTarFile.open('archive.tar.zst') as tar:
15061533
# do something
15071534

1508-
When the above code is in read mode (decompression), and selectively read files multiple times, it may seek to a position before the current position, then the decompression has to be restarted from zero. If this slows down the operations, you can:
1535+
In both implementations, when files are read selectively multiple times, the reader may seek to a position before the current one; decompression then has to restart from the beginning of the archive. If this slows down the operations, you can:
15091536

15101537
#. Use the :py:class:`SeekableZstdFile` class to create and read the ``.tar.zst`` file.
15111538
#. Decompress the archive to a temporary file, and read from it. This code encapsulates the process:
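
The documentation's own code for this step is not part of the diff shown here. Purely as an illustration, the second option might be sketched as follows. The helper name ``open_tar_via_tempfile`` and its ``opener`` parameter are hypothetical and not part of pyzstd or tarfile; the sketch assumes that ``opener`` is any callable that takes a path and returns a readable binary file object producing decompressed bytes.

```python
import contextlib
import shutil
import tarfile
import tempfile


@contextlib.contextmanager
def open_tar_via_tempfile(path, opener):
    """Decompress ``path`` into a temporary file and yield a TarFile over it.

    ``opener`` is a hypothetical parameter (an assumption of this sketch):
    a callable such as pyzstd's ZstdFile or gzip.open that, given a path,
    returns a readable binary file object yielding decompressed data.
    """
    with tempfile.TemporaryFile() as tmp:
        # Decompress once, sequentially, into the temporary file.
        with opener(path) as compressed:
            shutil.copyfileobj(compressed, tmp)
        tmp.seek(0)
        # 'r:' tells tarfile the stream is an uncompressed tar, so any
        # backward seek lands on plain data and needs no re-decompression.
        with tarfile.open(fileobj=tmp, mode='r:') as tar:
            yield tar


# Assumed usage with pyzstd installed:
#     from pyzstd import ZstdFile
#     with open_tar_via_tempfile('archive.tar.zst', ZstdFile) as tar:
#         ...  # read members in any order
```

The trade-off is disk space: the temporary file holds the whole uncompressed archive, in exchange for cheap random access to its members.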
