# Obspec Utils

Utilities for interacting with object storage, based on [obspec](https://github.com/developmentseed/obspec).

## Background

`obspec-utils` provides helpful utilities for working with object storage in Python, built on top of obspec and obstore. The library includes:

1. **ObjectStoreRegistry**: A registry for managing multiple object stores, allowing you to resolve URLs to the appropriate store and path. This is particularly useful when working with datasets that span multiple storage backends or buckets.

2. **File Handlers**: Wrappers around obstore's file reading capabilities that provide a familiar file-like interface, making it easy to integrate with libraries that expect standard Python file objects.

## Getting started

The library can be installed from PyPI:

```bash
python -m pip install obspec-utils
```

## Usage

### ObjectStoreRegistry

The `ObjectStoreRegistry` allows you to register object stores and resolve URLs to the appropriate store:

```python
from obstore.store import S3Store
from obspec_utils import ObjectStoreRegistry

# Create and register stores
s3store = S3Store(bucket="my-bucket", prefix="my-data/")
registry = ObjectStoreRegistry({"s3://my-bucket": s3store})

# Resolve a URL to get the store and path
store, path = registry.resolve("s3://my-bucket/my-data/file.nc")
# path == "file.nc"
```
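
To make the resolution step more concrete, here is a minimal standard-library sketch of how a registry of this kind could map a URL to a registered store and a relative path. It is an illustration only, not the `ObjectStoreRegistry` implementation: `ToyRegistry`, its longest-prefix rule, and the plain strings used as stand-in stores are all assumptions made for this example.

```python
from urllib.parse import urlparse


class ToyRegistry:
    """Hypothetical stand-in for a URL-to-store registry (not the obspec-utils API)."""

    def __init__(self, stores):
        # Normalise keys so "s3://bucket/" and "s3://bucket" compare equal.
        self._stores = {url.rstrip("/"): store for url, store in stores.items()}

    def resolve(self, url):
        if not urlparse(url).scheme:
            raise ValueError(f"expected an absolute URL, got {url!r}")
        # Try the longest registered prefixes first so nested entries win.
        for prefix in sorted(self._stores, key=len, reverse=True):
            if url == prefix or url.startswith(prefix + "/"):
                return self._stores[prefix], url[len(prefix):].lstrip("/")
        raise KeyError(f"no store registered for {url!r}")


# Plain strings stand in for real store objects.
registry = ToyRegistry(
    {
        "s3://my-bucket": "bucket-wide store",
        "s3://my-bucket/my-data": "my-data store",
    }
)
store, path = registry.resolve("s3://my-bucket/my-data/file.nc")
```

In this sketch the more specific `s3://my-bucket/my-data` entry is chosen over the bucket-wide one, and the returned path is whatever remains of the URL after the matched prefix. Matching on `prefix + "/"` keeps a bucket named `my-bucket-2` from being mistaken for `my-bucket`.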

### File Handlers

The file handlers provide file-like interfaces for reading from object stores:

```python
from obstore.store import S3Store
from obspec_utils import ObstoreReader, ObstoreMemCacheReader

store = S3Store(bucket="my-bucket")

# Standard reader with buffered reads
reader = ObstoreReader(store, "path/to/file.bin", buffer_size=1024 * 1024)
data = reader.read(100)  # Read 100 bytes

# Memory-cached reader for repeated access
cached_reader = ObstoreMemCacheReader(store, "path/to/file.bin")
data = cached_reader.readall()  # Read entire file from memory cache
```
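
The general idea of a file-like wrapper over range reads can be sketched with the standard library alone. The example below is not how `ObstoreReader` is implemented; `RangeReader`, the `fetch(start, length)` callable, and the in-memory `blob` are hypothetical names that stand in for a remote object and its range requests.

```python
import io


class RangeReader(io.RawIOBase):
    """Hypothetical file-like view over a `fetch(start, length) -> bytes` callable."""

    def __init__(self, fetch, size):
        super().__init__()
        self._fetch = fetch
        self._size = size
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # Map each whence value to the position the offset is relative to.
        bases = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos, io.SEEK_END: self._size}
        self._pos = max(0, bases[whence] + offset)
        return self._pos

    def readinto(self, buffer):
        # Ask the backend only for bytes that fit and that still exist.
        length = min(len(buffer), max(0, self._size - self._pos))
        if length == 0:
            return 0  # end of file
        chunk = self._fetch(self._pos, length)
        buffer[: len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


# An in-memory "object" stands in for a remote range request.
blob = bytes(range(256)) * 4
calls = []


def fetch(start, length):
    calls.append((start, length))
    return blob[start : start + length]


# BufferedReader adds buffering on top of the raw range reader.
reader = io.BufferedReader(RangeReader(fetch, len(blob)), buffer_size=512)
header = reader.read(100)
reader.seek(-16, io.SEEK_END)
tail = reader.read()
```

Because `RangeReader` implements the standard `io.RawIOBase` protocol, it can be passed to any code that expects a binary file object, and the `calls` list shows which ranges were actually requested from the backend.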

## Contributing

1. Clone the repository: `git clone https://github.com/virtual-zarr/obspec-utils.git`
2. Install development dependencies: `uv sync --all-groups`
3. Run the test suite: `uv run --all-groups pytest`

## License

`obspec-utils` is distributed under the terms of the [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) license.