29 commits
da86ea2
Add AWS S3 support for py-key-value
github-actions[bot] Oct 29, 2025
1b1c3d7
fix: exclude S3 from sync library and improve test configuration
github-actions[bot] Oct 29, 2025
a8a093f
fix: address CodeRabbit review feedback for S3 store
github-actions[bot] Oct 29, 2025
401d947
fix: handle S3 key length limits by hashing long collection/key names
github-actions[bot] Oct 30, 2025
82d3f6a
refactor: update S3Store to use new serialization adapter pattern
github-actions[bot] Oct 31, 2025
d9c1d7b
feat: add length_is_bytes parameter to sanitize_string and hash_exces…
github-actions[bot] Nov 2, 2025
731afa1
fix: address review feedback for S3 store
github-actions[bot] Nov 2, 2025
d528933
fix: address CodeRabbit review feedback for S3 store
github-actions[bot] Nov 2, 2025
e0f4529
simplify setup/teardown in s3
strawgate Nov 2, 2025
065b551
refactor: align S3 store with sanitization strategy pattern
github-actions[bot] Nov 7, 2025
06228f0
Merge branch 'main' into claude/issue-161-20251029-0140
strawgate Nov 7, 2025
dbab902
Longer wait for s3 spinup
strawgate Nov 7, 2025
e5125c2
Small test changes
strawgate Nov 7, 2025
f8b3a70
Merge branch 'main' into claude/issue-161-20251029-0140
strawgate Nov 7, 2025
2626137
Merge branch 'main' into claude/issue-161-20251029-0140
strawgate Nov 8, 2025
89ccb19
fix: resolve CodeRabbit feedback and failing tests
github-actions[bot] Nov 8, 2025
4a231c5
Merge branch 'main' into claude/issue-161-20251029-0140
strawgate Nov 9, 2025
462188c
Merge main into claude/issue-161-20251029-0140
github-actions[bot] Nov 9, 2025
8618596
Update lockfile and generate sync code for filetree store
github-actions[bot] Nov 9, 2025
f2721ae
Exclude FileTreeStore from sync code generation
github-actions[bot] Nov 9, 2025
f32a8ce
Apply suggestion from @coderabbitai[bot]
strawgate Nov 9, 2025
b230647
Fix Elasticsearch index already exists exception check
github-actions[bot] Nov 9, 2025
5df62e3
Extend Elasticsearch cluster health check timeout from 10s to 15s
github-actions[bot] Nov 9, 2025
ab6c55d
Merge main into claude/issue-161-20251029-0140
github-actions[bot] Nov 9, 2025
f14500a
Remove excessive code comments from S3 store
github-actions[bot] Nov 10, 2025
2b7dfe5
Revert Elasticsearch changes and fix S3 store client lifecycle manage…
github-actions[bot] Nov 10, 2025
5245ebe
Fix S3 and DynamoDB client lifecycle management
github-actions[bot] Nov 10, 2025
87366fb
Merge branch 'main' into claude/issue-161-20251029-0140
strawgate Nov 10, 2025
5624488
Merge branch 'main' into claude/issue-161-20251029-0140
strawgate Nov 11, 2025
5 changes: 3 additions & 2 deletions README.md
@@ -17,7 +17,7 @@ This monorepo contains two libraries:

## Why use this library?

-- **Multiple backends**: DynamoDB, Elasticsearch, Memcached, MongoDB, Redis,
+- **Multiple backends**: DynamoDB, S3, Elasticsearch, Memcached, MongoDB, Redis,
RocksDB, Valkey, and In-memory, Disk, etc
- **TTL support**: Automatic expiration handling across all store types
- **Type-safe**: Full type hints with Protocol-based interfaces
@@ -131,6 +131,7 @@ pip install py-key-value-aio
pip install py-key-value-aio[memory]
pip install py-key-value-aio[disk]
pip install py-key-value-aio[dynamodb]
+pip install py-key-value-aio[s3]
pip install py-key-value-aio[elasticsearch]
# or: redis, mongodb, memcached, valkey, vault, registry, rocksdb, see below for all options
```
@@ -191,7 +192,7 @@ categories:
- **Local stores**: In-memory and disk-based storage (Memory, Disk, RocksDB, etc.)
- **Secret stores**: Secure OS-level storage for sensitive data (Keyring, Vault)
- **Distributed stores**: Network-based storage for multi-node apps (Redis,
-DynamoDB, MongoDB, etc.)
+DynamoDB, S3, MongoDB, etc.)

Each store has a **stability rating** indicating likelihood of
backwards-incompatible changes. Stable stores (Redis, Valkey, Disk, Keyring)
10 changes: 10 additions & 0 deletions docs/api/stores.md
@@ -53,6 +53,16 @@ AWS DynamoDB-backed key-value store.
      members:
        - __init__

## S3 Store

AWS S3-backed key-value store.

::: key_value.aio.stores.s3.S3Store
    options:
      show_source: false
      members:
        - __init__

## Elasticsearch Store

Elasticsearch-backed key-value store.
37 changes: 37 additions & 0 deletions docs/stores.md
@@ -397,6 +397,7 @@ Distributed stores provide network-based storage for multi-node applications.
| Store | Stability | Async | Sync | Description |
|-------|:---------:|:-----:|:----:|:------------|
| DynamoDB | Unstable | ✅ | ✖️ | AWS DynamoDB key-value storage |
| S3 | Unstable | ✅ | ✖️ | AWS S3 object storage |
| Elasticsearch | Unstable | ✅ | ✅ | Full-text search with key-value capabilities |
| Memcached | Unstable | ✅ | ✖️ | High-performance distributed memory cache |
| MongoDB | Unstable | ✅ | ✅ | Document database used as key-value store |
@@ -503,6 +504,42 @@ pip install py-key-value-aio[dynamodb]

---

### S3Store

AWS S3 object storage for durable, scalable key-value storage.

```python
from key_value.aio.stores.s3 import S3Store

store = S3Store(
    bucket_name="my-kv-bucket",
    region_name="us-east-1",
)
```

**Installation:**

```bash
pip install py-key-value-aio[s3]
```

**Use Cases:**

- Large value storage (up to 5TB per object)
- Durable, long-term storage
- Cost-effective archival
- Multi-region replication

**Characteristics:**

- 99.999999999% durability
- Automatic key sanitization for S3 path limits
- Supports lifecycle policies
- Pay-per-use pricing
- Stable storage format: **Unstable**
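
The key sanitization listed above has to respect S3's 1024-byte limit on object key names, which is measured in UTF-8 bytes rather than characters. The following is an illustrative sketch only, not the `S3KeySanitizationStrategy` shipped in this PR: the function names and the 500-byte per-component budget are assumptions chosen for the example.

```python
import hashlib


def hash_if_too_long(value: str, max_bytes: int) -> str:
    """Return value unchanged if its UTF-8 form fits in max_bytes, else a SHA-256 hex digest."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # A SHA-256 hex digest is always 64 ASCII characters, so 64 bytes.
    return hashlib.sha256(encoded).hexdigest()


def object_key(collection: str, key: str, max_component_bytes: int = 500) -> str:
    """Join collection and key as '<collection>/<key>', bounding each part in bytes."""
    safe_collection = hash_if_too_long(collection, max_component_bytes)
    safe_key = hash_if_too_long(key, max_component_bytes)
    return f"{safe_collection}/{safe_key}"


short_name = object_key("users", "alice")       # fits, left as-is
long_name = object_key("users", "k" * 2000)     # key part is hashed
accented_name = object_key("users", "é" * 300)  # 300 characters but 600 UTF-8 bytes
```

With these budgets the joined name is at most 500 + 1 + 500 = 1001 bytes, which stays under the limit. Hashing is one-way, so the original name cannot be recovered from the object key alone.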

---

### ElasticsearchStore

Full-text search engine used as a key-value store.
3 changes: 2 additions & 1 deletion key-value/key-value-aio/pyproject.toml
@@ -42,6 +42,7 @@ vault = ["hvac>=2.3.0", "types-hvac>=2.3.0"]
memcached = ["aiomcache>=0.8.0"]
elasticsearch = ["elasticsearch>=8.0.0", "aiohttp>=3.12"]
dynamodb = ["aioboto3>=13.3.0", "types-aiobotocore-dynamodb>=2.16.0"]
s3 = ["aioboto3>=13.3.0", "types-aiobotocore-s3>=2.16.0"]
keyring = ["keyring>=25.6.0"]
keyring-linux = ["keyring>=25.6.0", "dbus-python>=1.4.0"]
pydantic = ["pydantic>=2.11.9"]
@@ -70,7 +71,7 @@ env_files = [".env"]

[dependency-groups]
dev = [
-"py-key-value-aio[memory,disk,filetree,redis,elasticsearch,memcached,mongodb,vault,dynamodb,rocksdb,duckdb]",
+"py-key-value-aio[memory,disk,filetree,redis,elasticsearch,memcached,mongodb,vault,dynamodb,s3,rocksdb,duckdb]",
"py-key-value-aio[valkey]; platform_system != 'Windows'",
"py-key-value-aio[keyring]",
"py-key-value-aio[pydantic]",
key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py
@@ -40,6 +40,7 @@ class DynamoDBStore(BaseContextManagerStore, BaseStore):
     _endpoint_url: str | None
     _raw_client: Any  # DynamoDB client from aioboto3
     _client: DynamoDBClient | None
+    _owns_client: bool

     @overload
     def __init__(self, *, client: DynamoDBClient, table_name: str, default_collection: str | None = None) -> None:
@@ -101,6 +102,8 @@ def __init__(
         self._table_name = table_name
         if client:
             self._client = client
+            self._raw_client = None
+            self._owns_client = False
         else:
             session: Session = aioboto3.Session(
                 region_name=region_name,
@@ -112,6 +115,7 @@ def __init__(
             self._raw_client = session.client(service_name="dynamodb", endpoint_url=endpoint_url)  # pyright: ignore[reportUnknownMemberType]

             self._client = None
+            self._owns_client = True

         super().__init__(default_collection=default_collection)

@@ -127,8 +131,8 @@ async def __aexit__(
         self, exc_type: type[BaseException] | None, exc_value: BaseException | None, traceback: TracebackType | None
     ) -> None:
         await super().__aexit__(exc_type, exc_value, traceback)
-        if self._client:
-            await self._client.__aexit__(exc_type, exc_value, traceback)
+        if self._owns_client and self._raw_client:
+            await self._raw_client.__aexit__(exc_type, exc_value, traceback)
Comment on lines +134 to +135

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Potential double-exit of the raw client.

Both `__aexit__` and `_close` contain identical cleanup logic that exits `_raw_client`. If `BaseContextManagerStore.__aexit__` calls `_close()` (which is a common pattern), the raw client would be exited twice:

  1. First exit when `super().__aexit__()` calls `_close()` (line 133)
  2. Second exit at line 135

Calling `__aexit__` twice on an aioboto3 client could cause errors or undefined behavior.

Verify whether `BaseContextManagerStore.__aexit__` calls `_close`:

If the verification confirms that `_close` is called from the parent's `__aexit__`, apply this fix to move cleanup logic to only `_close`:

     async def __aexit__(
         self, exc_type: type[BaseException] | None, exc_value: BaseException | None, traceback: TracebackType | None
     ) -> None:
         await super().__aexit__(exc_type, exc_value, traceback)
-        if self._owns_client and self._raw_client:
-            await self._raw_client.__aexit__(exc_type, exc_value, traceback)

Also applies to: 263-264


🏁 Script executed:

```shell
#!/bin/bash
# Check if BaseContextManagerStore.__aexit__ calls _close
ast-grep --pattern $'class BaseContextManagerStore($$$):
  $$$
  async def __aexit__($$$):
    $$$
    _close
    $$$'
```

Length of output: 0

---

🏁 Script executed:

```shell
#!/bin/bash
# Find where BaseContextManagerStore is defined
rg -n "class BaseContextManagerStore" --type=py
```

Length of output: 226

🏁 Script executed:

```shell
#!/bin/bash
# Search for __aexit__ in BaseContextManagerStore and _close pattern
rg -n -A 20 "class BaseContextManagerStore" --type=py | head -100
```

Length of output: 3980

🏁 Script executed:

```shell
#!/bin/bash
# Search for _close method to understand the cleanup pattern
rg -n "async def _close" --type=py -B 2 -A 10
```

Length of output: 8306

---

🏁 Script executed:

```shell
#!/bin/bash
# Examine the DynamoDB store __aexit__ method around lines 130-140
sed -n '130,140p' key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py
```

Length of output: 484

🏁 Script executed:

```shell
#!/bin/bash
# Get more context around __aexit__ in DynamoDB store
rg -n -B 5 -A 10 "async def __aexit__" key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py
```

Length of output: 681


Potential double-exit of the raw client is confirmed.

The raw client's `__aexit__` is called twice:

  1. First when `super().__aexit__()` calls `_close()` (line 133 → base.py:437 → line 264)
  2. Then again directly at lines 134-135

Remove the duplicate cleanup from `__aexit__` and keep it only in `_close`:

     async def __aexit__(
         self, exc_type: type[BaseException] | None, exc_value: BaseException | None, traceback: TracebackType | None
     ) -> None:
         await super().__aexit__(exc_type, exc_value, traceback)
-        if self._owns_client and self._raw_client:
-            await self._raw_client.__aexit__(exc_type, exc_value, traceback)

Also applies to the S3 store (`key-value/key-value-aio/src/key_value/aio/stores/s3/store.py:444-447`).
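
A minimal sketch of the double exit described above. The classes here (`FakeRawClient`, `BaseStore`, `DoubleExitStore`, `SingleExitStore`) are hypothetical stand-ins, not code from this repository, and the sketch assumes the base class's `__aexit__` delegates to `_close()` as the analysis reports:

```python
import asyncio


class FakeRawClient:
    """Stand-in for an aioboto3 client context manager; counts how often it is exited."""

    def __init__(self) -> None:
        self.exit_calls = 0

    async def __aexit__(self, exc_type, exc_value, traceback) -> None:
        self.exit_calls += 1


class BaseStore:
    """Hypothetical base class whose __aexit__ delegates to _close()."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc_value, traceback) -> None:
        await self._close()

    async def _close(self) -> None:
        pass


class DoubleExitStore(BaseStore):
    """Cleanup in both __aexit__ and _close: the pattern flagged in the review."""

    def __init__(self, raw_client: FakeRawClient) -> None:
        self._raw_client = raw_client

    async def __aexit__(self, exc_type, exc_value, traceback) -> None:
        await super().__aexit__(exc_type, exc_value, traceback)  # runs _close(): first exit
        await self._raw_client.__aexit__(exc_type, exc_value, traceback)  # second exit

    async def _close(self) -> None:
        await self._raw_client.__aexit__(None, None, None)


class SingleExitStore(BaseStore):
    """Cleanup only in _close: the shape of the suggested fix."""

    def __init__(self, raw_client: FakeRawClient) -> None:
        self._raw_client = raw_client

    async def _close(self) -> None:
        await self._raw_client.__aexit__(None, None, None)


async def count_exits(store_cls) -> int:
    """Enter and leave the store once, then report how often the client was exited."""
    client = FakeRawClient()
    async with store_cls(client):
        pass
    return client.exit_calls


double_exits = asyncio.run(count_exits(DoubleExitStore))
single_exits = asyncio.run(count_exits(SingleExitStore))
print(double_exits, single_exits)  # prints "2 1"
```

The fake client only counts calls; whether a real aioboto3 client tolerates a second `__aexit__` is not something this sketch demonstrates.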

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if self._owns_client and self._raw_client:
await self._raw_client.__aexit__(exc_type, exc_value, traceback)
async def __aexit__(
self, exc_type: type[BaseException] | None, exc_value: BaseException | None, traceback: TracebackType | None
) -> None:
await super().__aexit__(exc_type, exc_value, traceback)
🤖 Prompt for AI Agents
In key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py around
lines 134-135, the raw client's __aexit__ is being invoked twice (once via
super().__aexit__ -> _close() and again directly), so remove the direct await
self._raw_client.__aexit__(exc_type, exc_value, traceback) from the
DynamoDB.__aexit__ method and let _close() handle closing the raw client
(preserving the existing self._owns_client conditional inside _close); apply the
same fix to the S3 store at
key-value/key-value-aio/src/key_value/aio/stores/s3/store.py lines 444-447 to
avoid double-exiting the raw client.


     @property
     def _connected_client(self) -> DynamoDBClient:
@@ -256,5 +260,5 @@ async def _delete_managed_entry(self, *, key: str, collection: str) -> bool:
     @override
     async def _close(self) -> None:
         """Close the DynamoDB client."""
-        if self._client:
-            await self._client.__aexit__(None, None, None)  # pyright: ignore[reportUnknownMemberType]
+        if self._owns_client and self._raw_client:
+            await self._raw_client.__aexit__(None, None, None)  # pyright: ignore[reportUnknownMemberType]
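
The `_owns_client` flag in this diff separates a client the store created itself from one the caller passed in. A simplified sketch of that ownership rule follows, under stated assumptions: `FakeClient` and `OwnedClientStore` are hypothetical names, and a plain `close()` method stands in for the store's real lifecycle hooks.

```python
import asyncio
from typing import Optional


class FakeClient:
    """Stand-in for a client context manager; records whether it was exited."""

    def __init__(self) -> None:
        self.closed = False

    async def __aexit__(self, exc_type, exc_value, traceback) -> None:
        self.closed = True


class OwnedClientStore:
    """Hypothetical store that closes its client only if it created that client."""

    def __init__(self, client: Optional[FakeClient] = None) -> None:
        # A caller-supplied client stays under the caller's control.
        self._owns_client = client is None
        self._raw_client = FakeClient() if client is None else client

    async def close(self) -> None:
        if self._owns_client:
            await self._raw_client.__aexit__(None, None, None)


shared_client = FakeClient()
borrowing_store = OwnedClientStore(client=shared_client)
owning_store = OwnedClientStore()

asyncio.run(borrowing_store.close())
asyncio.run(owning_store.close())
```

After both calls, `shared_client` is still open for its original owner to close, while the client that `owning_store` created has been exited.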
13 changes: 13 additions & 0 deletions key-value/key-value-aio/src/key_value/aio/stores/s3/__init__.py
@@ -0,0 +1,13 @@
"""AWS S3-based key-value store."""

from key_value.aio.stores.s3.store import (
    S3CollectionSanitizationStrategy,
    S3KeySanitizationStrategy,
    S3Store,
)

__all__ = [
    "S3CollectionSanitizationStrategy",
    "S3KeySanitizationStrategy",
    "S3Store",
]