Not specific feedback for the RFC, just sharing some context links and previous discussion that were helpful for apache/datafusion#16779. Also tagging @ggershinsky in case he has any cycles to read this. His guidance was instrumental on the DataFusion and Arrow-rs PME work, as well as on the Iceberg Java encryption implementation.
Thanks, I'll be glad to have a look.
docs/rfcs/0003_table_encryption.md
Outdated
```
Master Key (in KMS)
└── wraps → KEK (Key Encryption Key) — stored in table metadata as EncryptedKey
    └── wraps → DEK (Data Encryption Key) — stored in StandardKeyMetadata per file
```
Some DEKs (those for manifest list files) are also stored in table metadata as EncryptedKey. These DEKs are packaged in a StandardKeyMetadata (along with the AAD prefix and file length). The serialized StandardKeyMetadata is encrypted/wrapped by the KEK and stored in the table metadata / encrypted_keys structure.
The manifest file DEKs are packaged in StandardKeyMetadata and stored as-is (without encryption) in manifest list files, which are themselves encrypted.
The data file DEKs are packaged in StandardKeyMetadata and stored as-is (without encryption) in manifest files, which are themselves encrypted.
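To make the layout concrete, here is a rough sketch of where each file's key metadata ends up and what protects it. All type and function names (`FileKind`, `KeyLocation`, `Protection`, `key_storage`) are hypothetical and only illustrate the description above; they are not the actual API.

```rust
/// Kinds of encrypted files in a table (hypothetical enum, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FileKind {
    ManifestList,
    Manifest,
    DataFile,
}

/// Where the serialized StandardKeyMetadata of a file is stored.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeyLocation {
    /// An EncryptedKey entry in table metadata.
    TableMetadata,
    /// A field inside the parent manifest list file.
    ManifestList,
    /// A field inside the parent manifest file.
    Manifest,
}

/// What protects the stored key metadata.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Protection {
    /// The serialized key metadata itself is wrapped by a KEK.
    WrappedByKek,
    /// Stored as-is; protected only because the parent file is encrypted.
    ParentFileEncryption,
}

/// Maps a file kind to the storage location and protection of its key metadata.
fn key_storage(kind: FileKind) -> (KeyLocation, Protection) {
    match kind {
        FileKind::ManifestList => (KeyLocation::TableMetadata, Protection::WrappedByKek),
        FileKind::Manifest => (KeyLocation::ManifestList, Protection::ParentFileEncryption),
        FileKind::DataFile => (KeyLocation::Manifest, Protection::ParentFileEncryption),
    }
}
```

The point of the mapping is that a KEK is involved only at the top level; below that, each file's key metadata is protected by the encryption of its parent file.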
docs/rfcs/0003_table_encryption.md
Outdated
- **Master keys** live in the KMS and never leave it
- **KEKs** are wrapped by the master key and stored in `TableMetadata.encryption_keys`
- **DEKs** are wrapped by a KEK and stored per-file in `StandardKeyMetadata`
Only manifest list DEKs are wrapped by a KEK. Other DEKs are encrypted as part of their parent files, by the parent files' DEKs.
│
▼
load_manifest_list(file_io, table_metadata)
1. Look up encryption_key_id in table_metadata.encryption_keys
This step also needs to unwrap the KEK (via a KMS client).
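A minimal sketch of the read path with the KEK unwrap included. The `KmsClient` trait, `InMemoryKms`, the simplified `EncryptedKey` struct, and `manifest_list_key_metadata` are hypothetical names chosen for illustration, and `toy_xor` is a placeholder that stands in for real key wrapping only so the sketch has no dependencies; it is not encryption.

```rust
use std::collections::HashMap;

/// Hypothetical KMS client interface; the real trait may look different.
trait KmsClient {
    fn unwrap_key(&self, wrapped: &[u8], master_key_id: &str) -> Result<Vec<u8>, String>;
}

/// Placeholder for real key wrapping. XOR is NOT encryption; it is used here
/// only because it is its own inverse and needs no external crates.
fn toy_xor(data: &[u8], key: &[u8]) -> Vec<u8> {
    data.iter().zip(key.iter().cycle()).map(|(d, k)| d ^ k).collect()
}

/// In-memory stand-in for a KMS that holds the master keys.
struct InMemoryKms {
    master_keys: HashMap<String, Vec<u8>>,
}

impl KmsClient for InMemoryKms {
    fn unwrap_key(&self, wrapped: &[u8], master_key_id: &str) -> Result<Vec<u8>, String> {
        let master_key = self
            .master_keys
            .get(master_key_id)
            .ok_or_else(|| format!("unknown master key: {}", master_key_id))?;
        Ok(toy_xor(wrapped, master_key))
    }
}

/// Simplified stand-in for an EncryptedKey entry in table metadata.
struct EncryptedKey {
    /// Wrapped KEK bytes, or wrapped serialized key metadata of a manifest list.
    encrypted_key_metadata: Vec<u8>,
    /// Id of the wrapping key: a master key id for a KEK entry,
    /// a KEK id for a manifest list key entry.
    encrypted_by_id: String,
}

/// Recovers the plaintext key metadata of a manifest list.
fn manifest_list_key_metadata(
    keys: &HashMap<String, EncryptedKey>,
    snapshot_key_id: &str,
    kms: &dyn KmsClient,
) -> Result<Vec<u8>, String> {
    // 1. Look up the manifest list key entry referenced by the snapshot.
    let entry = keys
        .get(snapshot_key_id)
        .ok_or_else(|| format!("unknown key id: {}", snapshot_key_id))?;
    // 2. Find the KEK entry that wrapped it.
    let kek_entry = keys
        .get(&entry.encrypted_by_id)
        .ok_or_else(|| format!("unknown KEK id: {}", entry.encrypted_by_id))?;
    // 3. Unwrap the KEK via the KMS client (the step missing from the diagram).
    let kek = kms.unwrap_key(&kek_entry.encrypted_key_metadata, &kek_entry.encrypted_by_id)?;
    // 4. Unwrap the manifest list key metadata with the plaintext KEK.
    Ok(toy_xor(&entry.encrypted_key_metadata, &kek))
}
```

In a real implementation the unwrapped KEK would probably be cached, so that the KMS is not called for every snapshot.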
docs/rfcs/0003_table_encryption.md
Outdated
   a. file_io.new_encrypted_output(path) → AGS1-encrypting OutputFile
   b. em.wrap_key_metadata() → EncryptedKey for table metadata
   c. Store key_id on Snapshot.encryption_key_id
3. Table updates include AddEncryptionKey for new KEKs
This step also needs to wrap the KEK (via a KMS client).
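A rough sketch of the write side, where a new KEK is wrapped through the KMS before its entry is added to table metadata. `WrappingKms`, `FakeKms`, `KeyEntry`, and `entries_for_new_manifest_list` are hypothetical names, and `xor_placeholder` stands in for real key wrapping only to keep the sketch dependency-free; it is not encryption.

```rust
/// Hypothetical KMS interface for the write path; the real trait may differ.
trait WrappingKms {
    fn wrap_key(&self, key: &[u8], master_key_id: &str) -> Result<Vec<u8>, String>;
}

/// Placeholder for real key wrapping. XOR is NOT encryption.
fn xor_placeholder(data: &[u8], key: &[u8]) -> Vec<u8> {
    data.iter().zip(key.iter().cycle()).map(|(d, k)| d ^ k).collect()
}

/// Stand-in KMS that knows a single master key.
struct FakeKms {
    master_key_id: String,
    master_key: Vec<u8>,
}

impl WrappingKms for FakeKms {
    fn wrap_key(&self, key: &[u8], master_key_id: &str) -> Result<Vec<u8>, String> {
        if master_key_id != self.master_key_id {
            return Err(format!("unknown master key: {}", master_key_id));
        }
        Ok(xor_placeholder(key, &self.master_key))
    }
}

/// Simplified stand-in for an EncryptedKey entry added to table metadata.
#[derive(Debug)]
struct KeyEntry {
    key_id: String,
    encrypted_key_metadata: Vec<u8>,
    encrypted_by_id: String,
}

/// Builds the entries a commit would add for a new KEK and a new manifest list.
fn entries_for_new_manifest_list(
    kms: &dyn WrappingKms,
    master_key_id: &str,
    kek_id: &str,
    kek: &[u8],
    manifest_list_key_id: &str,
    key_metadata: &[u8],
) -> Result<Vec<KeyEntry>, String> {
    // Wrap the plaintext KEK with the master key via the KMS client.
    let wrapped_kek = kms.wrap_key(kek, master_key_id)?;
    // Wrap the serialized manifest list key metadata with the plaintext KEK.
    let wrapped_metadata = xor_placeholder(key_metadata, kek);
    Ok(vec![
        KeyEntry {
            key_id: kek_id.to_string(),
            encrypted_key_metadata: wrapped_kek,
            encrypted_by_id: master_key_id.to_string(),
        },
        KeyEntry {
            key_id: manifest_list_key_id.to_string(),
            encrypted_key_metadata: wrapped_metadata,
            encrypted_by_id: kek_id.to_string(),
        },
    ])
}
```

Only the wrapped forms are returned, so the plaintext KEK never needs to appear in table metadata.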
Thanks for taking a look, @ggershinsky. I've tried to fill in some of the details here.
Which issue does this PR close?
RFC for table encryption
Part of: #2034
Rough draft with some of the key parts: #2042
What changes are included in this PR?
Are these changes tested?