Skip to content

WIP: Document current approach to permissions #3148

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 78 additions & 0 deletions docs/developers/store_permissions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
# Permissions in Zarr Python

Write/delete permissions in Zarr Python are confusing. This document aims to describe how they currently work.

# Data models

Zarr Python has two data models (`Array` and `Group`) and one storage model (`Store`). Only the store has a concept of write/delete permissions. Both `write` and `delete` permissions on `Store` instances are controlled by the `read_only` immutable property. Permissions on `Store` classes (i.e., implementations) are also influenced by the `supports_writes` and `supports_deletes` property, which should be the same for all instances of a class.

`Array` and `Group` do not have any permissions, instead they store a reference to a store that has permissions.

## Store properties related to permissions

### Instance properties

#### `read_only`

The `read_only` property indicates whether store *instances* should allow `set`, `delete` operations and their permutations. If `read_only` is `True`, then the store should reject any write or delete operations. If `read_only` is `False`, then the store should allow write and delete operations. The property is tested by `Store` methods by calling `self._check_writable()`, which raises a `ValueError` if the store's `read_only` property is true.

The `read_only` property is one of the most likely places to encounter a bug for a few reasons:

- `Store` implementations must remember to call `super.__init__(read_only=read_only)` in their `__init__` method to set the `read_only` property correctly.
- `Store` implementations must remember to call `self._check_writable()` in their `set` and `delete` methods to enforce the `read_only` property.
- `Array` and `Group` classes must remember to check alignment with the `read_only` property of the store with any `overwrite` arguments.
- Top level API functions must remember to check the `read_only` property of the store when creating new arrays or groups. This is complicated by the API functions using "mode" semantics like "w", "r", "a", etc., which are not directly related to the `read_only` property. Each function typically has its own logic for matching mode semntics to the `read_only` property of the store.

This is one of the most likely place to encounter a bug where a `read_only` property is not respected because implementations must remember to call `self._check_writable()` when implementing `store.set()`, which is not implemented in the `Store` abstract base class.

The Zarr spec does not seem to define how APIs should constrain write/delete permissions at the instance level.

### Class properties

The Zarr spec provides distinctions between readable, writeable, and listable stores, but does not define how to distinguish between these groups of store operations. The Zarr Python library has adopted the following properties to distinguish between these groups of operations at the *class* level, which are used by the `Store` abstract base class and the testing framework.

#### `supports_writes`

This is a property of the *class* that should indicate whether the store implements the following methods:

- `async def set(self, key: str, value: Buffer) -> None:`
- `async def set_if_not_exists(self, key: str, value: Buffer) -> None:`

`supports_writes` is primarily used by tests to determine the expected result of write operations. It is not used by the library to enforce permissions.

#### `supports_partial_writes`

The purpose of this property of the *class* is currently ambiguous.

One interpretation is that it indicates whether the store implements the following methods:

- `async def set_partial_values(self, key_start_values: Iterable[tuple[str, int, BytesLike]]) -> None:

But the `FsspecStore` class does not implement this method, but it does have `supports_partial_writes = True`.

Another interpretation is that it indicates whether the store supports a `byte_range` argument in the `set` method.

#### `supports_deletes`

This is a property of the *class* that should indicate whether the store implements the following methods:

- `async def delete(self, key: str) -> None:`

The `supports_deletes` property is used by the `Store` abstract base class to determine whether it should delete all keys under a given prefix using `store.delete_dir(prefix)`.

The `supports_deletes` property is used by the `Array` and `Group` classes before calling `store.delete_dir(prefix)` to determine whether the store supports deleting keys when those data classes are opened with `overwrite=True`. If a store does not support deletes, the `Array` and `Group` classes check if an array or group is identified in that location via a `.zarray`, `.zgroup`, or `zarr.json` key. If such a key exists, the `Array` or `Group` will raise an error without trying to delete it. If a store does support deletes, the `Array` and `Group` classes will attempt to recursively delete the keys in the store using `store.delete_dir(prefix)`.

The `supports_deletes` property is also used by the testing framework to determine the expected result of delete operations.

!!! note
Store implementations are agnostic to the Zarr data model. They will delete everything under the given prefix, regardless of whether it is an array, group, or unrelated to the Zarr store.

#### `supports_listing`

This is a property of the *class* that should indicate whether the store implements the following method:

- `async def list(self, prefix: str = '', delimiter: str = '') -> List[str]:`

This used to determine whether the `Store` abstract base classes `is_empty`, `clear`, and `delete_dir`methods should raise a `NotImplementedError`.

The `supports_listing` property is also used by the testing framework to determine the expected result of list operations.