
GH-531: Add parquet flatbuf schema#544

Open
alkis wants to merge 3 commits into apache:master from alkis:patch-2

@alkis (Contributor) commented Dec 12, 2025

Rationale for this change

Improve wide table support.

What changes are included in this PR?

Add parquet flatbuf schema.

Do these changes have PoC implementations?

apache/arrow#48431

@@ -0,0 +1,224 @@
namespace parquet.format3;
Contributor

Let's just name this parquet.format for now?

// 1. Statistics are stored in integral types if their size is fixed, otherwise prefix + suffix
// 2. ColumnMetaData.encoding_stats are removed, they are replaced with
// ColumnMetaData.is_fully_dict_encoded.
// 3. RowGroups are limited to 2GB in size, so we can use int for sizes.
Contributor

I think this and the item below are out of date (we are using long now) and can keep things absolute?

@@ -0,0 +1,224 @@
namespace parquet.format3;

// Optimization notes
Contributor

Can we expand this comment to be explicit about the relationship between this FBS and parquet.thrift?

// Note: Match the thrift enum values so that we can cast between them.
enum Encoding : byte {
PLAIN = 0,
// GROUP_VAR_INT = 1,
Contributor

Call out commented-out entries as deprecated to make it clear why they are commented out?
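
To illustrate the cast that the schema note relies on, here is a minimal Python sketch. Both enum classes are hypothetical stand-ins for generated code, not part of this PR. The numeric values on the Thrift side are those of `Encoding` in parquet.thrift; the flatbuf side is assumed to mirror them, with 1 left unused for the commented-out `GROUP_VAR_INT`.

```python
from enum import IntEnum


class ThriftEncoding(IntEnum):
    """Hypothetical stand-in for the Thrift-generated Encoding enum."""
    PLAIN = 0
    # 1 was GROUP_VAR_INT, which is commented out in the schema
    PLAIN_DICTIONARY = 2
    RLE = 3


class FlatbufEncoding(IntEnum):
    """Hypothetical stand-in for the flatbuf enum, assumed to use the same values."""
    PLAIN = 0
    PLAIN_DICTIONARY = 2
    RLE = 3


def to_flatbuf(encoding: ThriftEncoding) -> FlatbufEncoding:
    # Because the numeric values match, the conversion is a plain cast.
    return FlatbufEncoding(int(encoding))


def to_thrift(encoding: FlatbufEncoding) -> ThriftEncoding:
    return ThriftEncoding(int(encoding))
```

If a value were ever renumbered on one side only, the cast would raise `ValueError` or silently map to a different member, which is one reason to document the gaps explicitly.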

GZIP = 2,
LZO = 3,
BROTLI = 4,
// LZ4 = 5,
Contributor

Same comment on deprecation.

scale: int;
}
enum TimeUnit : byte {
MS = 0,
Contributor

Can we please make these match parquet.thrift for names (Millisecond, Microsecond, Nanosecond)?

// Logical types.
///////////////////////////////////////////////////////////////////////////////////////////////////

table Empty {}
Contributor

I think we want detailed docs (same level as parquet.thrift if we intend this to be the new footer)?

///////////////////////////////////////////////////////////////////////////////////////////////////

table Empty {}
table DecimalOpts {
Contributor

Suggested change
- table DecimalOpts {
+ table DecimalOptions {

Should we spell out type names to make it easier on the reader?

// - BYTE_ARRAY:
// prefix: the longest common prefix of min/max
// lo8+hi8 zero padded 16 bytes (big-endian) of the suffix
// len: the length for the suffix of the value after removing the prefix. If > 16 then the
Contributor

Suggested change
- // len: the length for the suffix of the value after removing the prefix. If > 16 then the
+ // min_len/max_len: the length for the suffix of the value after removing the prefix. If > 16 then the

// prefix: the longest common prefix of min/max
// lo8+hi8 zero padded 16 bytes (big-endian) of the suffix
// len: the length for the suffix of the value after removing the prefix. If > 16 then the
// value is inexact
Contributor

Suggested change
- // value is inexact
+ // value is inexact (it is exact otherwise).
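
As a rough illustration of the BYTE_ARRAY layout described in the schema comment, here is a minimal Python sketch. The function names are hypothetical. It assumes the 16 zero-padded suffix bytes are read big-endian, with `hi8` taken from the first 8 bytes and `lo8` from the last 8; the schema comment does not spell out that split, so treat it as a guess. It also does not address how a reader should interpret a truncated (inexact) bound.

```python
SUFFIX_BYTES = 16  # lo8 + hi8 together hold at most 16 suffix bytes


def common_prefix(a: bytes, b: bytes) -> bytes:
    """Longest common prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def encode_suffix(value: bytes, prefix: bytes) -> dict:
    """Encode what remains of `value` after stripping `prefix`."""
    suffix = value[len(prefix):]
    # Truncate to 16 bytes, then zero-pad on the right.
    padded = suffix[:SUFFIX_BYTES].ljust(SUFFIX_BYTES, b"\x00")
    return {
        "hi8": int.from_bytes(padded[:8], "big"),  # assumed: first 8 bytes
        "lo8": int.from_bytes(padded[8:], "big"),  # assumed: last 8 bytes
        "len": len(suffix),                        # full suffix length
        "exact": len(suffix) <= SUFFIX_BYTES,      # > 16 means inexact
    }


def encode_byte_array_stats(min_value: bytes, max_value: bytes) -> dict:
    prefix = common_prefix(min_value, max_value)
    return {
        "prefix": prefix,
        "min": encode_suffix(min_value, prefix),
        "max": encode_suffix(max_value, prefix),
    }
```

For example, `encode_byte_array_stats(b"apple", b"apricot")` yields the prefix `b"ap"`, a min suffix `b"ple"` of length 3, and a max suffix `b"ricot"` of length 5, both exact.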

// - BOOLEAN: none
// - INT32/FLOAT: lo4 (little-endian)
// - INT64/DOUBLE: lo8 (little-endian)
// - INT96: lo4+lo8 (little-endian)
Contributor

For composite values, I think this is complicated enough that providing concrete examples would be helpful for implementors?
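
One possible shape for such examples, as a minimal Python sketch of the fixed-width cases. The helper names are hypothetical. For INT96 it assumes `lo4` holds the low 4 bytes and `lo8` the high 8 bytes of the 12-byte little-endian value; that split is an interpretation of "lo4+lo8" and would need confirming against the schema.

```python
import struct


def int32_to_lo4(value: int) -> int:
    """Reinterpret a signed 32-bit integer as the unsigned lo4 field."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


def float_to_lo4(value: float) -> int:
    """Store the IEEE 754 single-precision bit pattern in lo4."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def int64_to_lo8(value: int) -> int:
    """Reinterpret a signed 64-bit integer as the unsigned lo8 field."""
    return struct.unpack("<Q", struct.pack("<q", value))[0]


def double_to_lo8(value: float) -> int:
    """Store the IEEE 754 double-precision bit pattern in lo8."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def int96_to_lo4_lo8(raw: bytes) -> tuple:
    """Split a 12-byte little-endian INT96 into (lo4, lo8).

    Assumption: lo4 = low 4 bytes, lo8 = high 8 bytes.
    """
    if len(raw) != 12:
        raise ValueError("INT96 values are exactly 12 bytes")
    lo4 = int.from_bytes(raw[:4], "little")
    lo8 = int.from_bytes(raw[4:], "little")
    return lo4, lo8
```

With these helpers, an INT32 minimum of -1 is stored as `0xFFFFFFFF`, and a FLOAT minimum of 1.0 as `0x3F800000`.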

DATA_PAGE_V2 = 3,
}

table KV {
Contributor

nit

Suggested change
- table KV {
+ table KeyValue {

Let's keep names consistent if possible?

codec: CompressionCodec;
num_values: long = null; // only present if not equal to rg.num_rows
total_uncompressed_size: long;
total_compressed_size: long;
Contributor

It would be nice to keep total unencoded size here which I think is generally useful? But I suppose it can be added after?

dictionary_page_offset: long = null;
statistics: Statistics;
is_fully_dict_encoded: bool;
bloom_filter_offset: long = null;
Contributor

Should this be made a struct/value to make the bloom filter info more self-contained?

row_groups: [RowGroup];
kv: [KV];
created_by: string;
// column_orders: [ColumnOrder]; // moved to SchemaElement
Contributor

Remove this row for now?

Contributor

@emkornfield left a comment

I think we also need to add an Apache header here, and CI to make sure this compiles?

@Jiayi-Wang-db

Hi @rok and @emkornfield, could you take another look at this PR?

Comment on lines +25 to +26
// It can currently be attached as a footer extension, and may fully replace the
// Thrift footer in the future.
Member

Maybe we should explicitly state that the Thrift footer is still required as of now?


/** Optional Bloom filter information for this column chunk */
bloom_filter: BloomFilterInfo;
}
Member

Currently this proposal omits the optional GeospatialStatistics, which, with its bounding boxes (8 doubles), makes for nice pruning metadata and would likely be worth keeping.

17: optional GeospatialStatistics geospatial_statistics;
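
To show why bounding boxes make useful pruning metadata, here is a minimal Python sketch. The `BoundingBox` class and both functions are hypothetical, not part of this PR or of any Parquet library. It uses only the four x/y bounds out of the eight doubles and ignores wraparound boxes, where xmin may exceed xmax.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical 2D subset of a column chunk's geospatial bounding box."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float


def may_intersect(chunk: BoundingBox, query: BoundingBox) -> bool:
    """True if the chunk's box overlaps the query box on both axes."""
    return (
        chunk.xmin <= query.xmax
        and query.xmin <= chunk.xmax
        and chunk.ymin <= query.ymax
        and query.ymin <= chunk.ymax
    )


def chunks_to_read(
    chunk_boxes: List[Optional[BoundingBox]], query: BoundingBox
) -> List[int]:
    """Indices of chunks that cannot be skipped.

    A chunk without statistics (None) must always be read.
    """
    return [
        i
        for i, box in enumerate(chunk_boxes)
        if box is None or may_intersect(box, query)
    ]
```

Without these statistics in the footer, every chunk falls into the `None` case and nothing can be skipped.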

min_lo4: uint;
min_lo8: ulong;
min_hi8: ulong;
min_len: byte = null;
Member

Suggested change
- min_len: byte = null;
+ min_len: int = null;

The original suffix length could exceed the int8 range of the byte type.
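
A quick way to see the range problem, as a minimal Python sketch. FlatBuffers `byte` is a signed 8-bit integer, so the largest length it can hold is 127; the helper name below is hypothetical.

```python
import struct


def fits_in_fbs_byte(length: int) -> bool:
    """True if `length` can be stored in a signed 8-bit field."""
    try:
        struct.pack("<b", length)  # "b" is a signed char, range -128..127
        return True
    except struct.error:
        return False
```

A 200-byte suffix, for instance, gives `fits_in_fbs_byte(200) == False`, while the same length fits comfortably in a 32-bit `int`.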

max_lo4: uint;
max_lo8: ulong;
max_hi8: ulong;
max_len: byte = null;
Member

As above:

Suggested change
- max_len: byte = null;
+ max_len: int = null;

}

table Statistics {
null_count: int = null;
Member

Suggested change
- null_count: int = null;
+ null_count: long = null;

To match row group size range.

Comment on lines +473 to +477

union ColumnCryptoMetadata {
EncryptionWithFooterKey:Empty,
EncryptionWithColumnKey:Empty,
}
Member

Used by readers to recover the column key from KMS services.

Suggested change
- union ColumnCryptoMetadata {
- EncryptionWithFooterKey:Empty,
- EncryptionWithColumnKey:Empty,
- }
+ table EncryptionWithColumnKey {
+ /** Column path in schema **/
+ path_in_schema: [string];
+ /** Retrieval metadata of column encryption key **/
+ key_metadata: [byte];
+ }
+ union ColumnCryptoMetadata {
+ EncryptionWithFooterKey:Empty,
+ EncryptionWithColumnKey:EncryptionWithColumnKey,
+ }


/** repetition of the field. The root of the schema does not have a repetition_type.
* All other nodes must have one */
repetition_type: FieldRepetitionType;
Member

To allow for root to not have repetition type. In thrift we have optional:

3: optional FieldRepetitionType repetition_type;

Suggested change
- repetition_type: FieldRepetitionType;
+ repetition_type: FieldRepetitionType = null;

// - INT32/FLOAT: min_lo4/max_lo4 (little-endian, 4 bytes)
// - INT64/DOUBLE: min_lo8/max_lo8 (little-endian, 8 bytes)
// - INT96: lo4 contains the low 4 bytes, lo8 contains the high 8 bytes (little-endian, 12 bytes total)
// - FIXED_LEN_BYTE_ARRAY:
Member

Perhaps note it's the same as for BYTE_ARRAY?

Suggested change
- // - FIXED_LEN_BYTE_ARRAY:
+ // - FIXED_LEN_BYTE_ARRAY: Encoded the same way as BYTE_ARRAY below

max_lo8: ulong;
max_hi8: ulong;
max_len: byte = null;
prefix: string;
Member

Shouldn't this be binary?

Comment on lines +182 to +185
table DecimalOptions {
precision: int;
scale: int;
}
Member

In thrift these are in reverse order. Any reason to change that here?

struct DecimalType {
1: required i32 scale
2: required i32 precision
}

Comment on lines +29 to +30
// 1. Statistics use fixed-width integral types when possible; otherwise they are
// encoded as prefix + suffix.
Member

SizeStatistics and Statistics.distinct_count are removed.

Member

@rok commented Mar 11, 2026

@Jiayi-Wang-db Feel free to resolve comments you feel were addressed to make this more readable?
