Skip to content

Support storage-credentials in REST catalog LoadTableResult#3042

Merged
kevinjqliu merged 3 commits intoapache:mainfrom
rcjverhoef:rcjverhoef/storage-creds-support
Mar 6, 2026
Merged

Support storage-credentials in REST catalog LoadTableResult#3042
kevinjqliu merged 3 commits intoapache:mainfrom
rcjverhoef:rcjverhoef/storage-creds-support

Conversation

@rcjverhoef
Copy link
Contributor

Rationale for this change

Vended credentials don't work with REST catalogs following latests specs. Per latest spec:

Clients must first check whether the respective credentials exist in the storage-credentials field before checking the config for credentials.

This is missing in pyiceberg. REST catalogs that don't expose a config field are hence not compatible.

Are these changes tested?

Yes, ran locally against my REST catalog implementation, added tests.

Are there any user-facing changes?

No API changes. REST catalogs that return vended credentials via storage-credentials will now work correctly.

@rcjverhoef rcjverhoef force-pushed the rcjverhoef/storage-creds-support branch from 2864ab7 to b2cd702 Compare February 13, 2026 14:47
Fokko pushed a commit that referenced this pull request Feb 15, 2026
# Rationale for this change

While reviewing some open PRs #3041, #3042, I noticed CI kept failing on
the Python 3.13 job in the hive tests. Turns out [CPython 3.13.12
](https://docs.python.org/3.13/whatsnew/changelog.html#id3)was just
released and included a change
python/cpython#142651 which made
`Mock.call_count` thread-safe by deriving it from `len(call_args_list)`.

This broke our hive test, which was resetting the counter with
`mock.call_count = 0` directly. This switches to use reset_mock(), which
properly clears all the internal call tracking state.

## Are these changes tested?

`make test` passes

## Are there any user-facing changes?

no
rcjverhoef and others added 3 commits February 23, 2026 20:43
The Iceberg REST spec's LoadTableResult includes a storage-credentials
field for vended credentials (prefix-scoped temporary STS tokens).
PyIceberg was only reading the config field and silently dropping
storage-credentials, so vended credentials never reached the FileIO.

Per the spec: "Clients must first check whether the respective
credentials exist in the storage-credentials field before checking
the config for credentials."

This adds:
- storage_credentials field to TableResponse
- Longest-prefix credential resolution (mirroring Java's S3FileIO)
- Merging resolved credentials into FileIO with highest precedence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rcjverhoef rcjverhoef force-pushed the rcjverhoef/storage-creds-support branch from b3dab54 to bc9bacb Compare February 23, 2026 19:46
Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rcjverhoef for working on this, actually suprised that this isn't in yet 😆

metadata_location: str | None = Field(alias="metadata-location", default=None)
metadata: TableMetadata
config: Properties = Field(default_factory=dict)
storage_credentials: list[StorageCredential] = Field(alias="storage-credentials", default_factory=list)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Fokko, would you be able to merge this one as well?

@kevinjqliu
Copy link
Contributor

Thanks for the PR! This lgtm, i want to test it against a catalog integration before merging

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for Iceberg REST catalog storage-credentials in LoadTableResult, ensuring vended credentials are preferred over config credentials per the latest REST spec.

Changes:

  • Extends TableResponse to parse storage-credentials from REST load table responses.
  • Adds resolution logic to pick the best-matching credential via longest-prefix match and merge it with existing IO properties.
  • Adds unit tests covering prefix resolution and load-table credential precedence behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
pyiceberg/catalog/rest/__init__.py Parses storage-credentials and applies the resolved credential config with higher precedence than config when constructing FileIO.
tests/catalog/test_rest.py Adds tests for longest-prefix credential selection and for load_table preferring storage-credentials over config.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, i tested this locally with Polaris, which supports credential vending (using #3106)

I could not test the longest prefix match logic since polaris currently only supports returning a single storage credential. The logic matches java so lgtm

best_match: StorageCredential | None = None
for cred in storage_credentials:
if location.startswith(cred.prefix):
if best_match is None or len(cred.prefix) > len(best_match.prefix):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinjqliu kevinjqliu merged commit ca807ae into apache:main Mar 6, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants