@dantengsky commented Nov 21, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  • Fuse-table parquet deserialization now loads decimal columns with precision ≤18 as Decimal64 instead of Decimal128.

    Parquet already picks the physical type based on precision during serialization, so only the read path needed to change. This remains fully backward compatible and safe for rolling upgrades.

  • Fuse-table native serialization and deserialization now store Decimal64 columns as true 64-bit values per row instead of widening them to 128 bits.

    This change is not backward compatible: native-format tables written before this PR cannot be read by a build that includes it and will fail during Decimal64 decoding.

    Given the clear performance gain, and because the native format is still in the research phase, this compatibility trade-off seems acceptable.

  • To keep rolling upgrades viable, we leave the table-column metadata protobuf in meta untouched even though it still lacks Decimal64; changing it would break older nodes in mixed-version deployments.

    Technically, we could push part of the Decimal conversion logic into schema_from_to_protobuf_impl.rs, but that would force edits to existing proto_conv unit tests, breaking that module's append-only testing style, which feels like a poor trade-off.
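
The first two points can be illustrated with a minimal sketch. It is not Databend's actual code: `DecimalWidth`, `width_for_precision`, `encode_narrow`, `encode_widened`, and `decode_narrow` are hypothetical names, and the little-endian row layout is an assumption made only for illustration. It shows why precision ≤18 is safe for a 64-bit value, and why a reader that expects 8 bytes per row cannot decode data that was widened to 16 bytes per row.

```rust
/// Largest precision whose unscaled values always fit in an i64:
/// 10^18 - 1 is below i64::MAX (about 9.22 * 10^18).
const MAX_DECIMAL64_PRECISION: u8 = 18;

/// Hypothetical storage-width tag (an assumption, not Databend's type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum DecimalWidth {
    Bits64,
    Bits128,
}

/// Pick the in-memory width from the declared decimal precision.
fn width_for_precision(precision: u8) -> DecimalWidth {
    if precision <= MAX_DECIMAL64_PRECISION {
        DecimalWidth::Bits64
    } else {
        DecimalWidth::Bits128
    }
}

/// Encode unscaled values as 8 little-endian bytes per row.
fn encode_narrow(values: &[i64]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Encode the same values widened to 16 little-endian bytes per row.
fn encode_widened(values: &[i64]) -> Vec<u8> {
    values
        .iter()
        .flat_map(|v| (*v as i128).to_le_bytes())
        .collect()
}

/// Decode a buffer that is assumed to hold exactly 8 bytes per row.
fn decode_narrow(bytes: &[u8]) -> Vec<i64> {
    bytes
        .chunks_exact(8)
        .map(|chunk| {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(chunk);
            i64::from_le_bytes(buf)
        })
        .collect()
}
```

In this sketch, `decode_narrow(&encode_narrow(&values))` round-trips, while `decode_narrow(&encode_widened(&values))` yields twice as many rows with the wrong values. The real reader fails during Decimal64 decoding rather than returning garbage, but the root cause is the same mismatch in bytes per row.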

Performance

  • Environment: r7i.8xlarge (single node), disk cache on, TPC-H SF1000 lineitem fully cached on disk.
  • Builds: This PR vs v1.2.848-nightly.
  • Queries
    • q1: select l_quantity from lineitem ignore_result
    • q2: select l_extendedprice from lineitem ignore_result

Three hot runs per query/version (all values in seconds).

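| Query | Version          | Run 1 | Run 2 | Run 3 | Avg  |
|-------|------------------|-------|-------|-------|------|
| q1    | This PR          | 1.04  | 1.02  | 1.00  | 1.02 |
| q1    | v1.2.848-nightly | 3.28  | 3.47  | 3.32  | 3.36 |
| q2    | This PR          | 4.43  | 4.37  | 4.38  | 4.39 |
| q2    | v1.2.848-nightly | 6.29  | 6.23  | 6.29  | 6.27 |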
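
For readers who want the relative gain, the timings above can be turned into a speedup ratio. The helpers below are a small sketch; `mean` and `speedup` are illustrative names and are not part of the PR.

```rust
/// Arithmetic mean of the timing runs, in seconds.
fn mean(runs: &[f64]) -> f64 {
    runs.iter().sum::<f64>() / runs.len() as f64
}

/// Speedup of the new build over the baseline: mean(baseline) / mean(new).
fn speedup(baseline: &[f64], new: &[f64]) -> f64 {
    mean(baseline) / mean(new)
}
```

Applied to the three runs per build, `speedup(&[3.28, 3.47, 3.32], &[1.04, 1.02, 1.00])` gives roughly 3.3x for q1, and `speedup(&[6.29, 6.23, 6.29], &[4.43, 4.37, 4.38])` gives roughly 1.4x for q2.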

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions (bot) added the pr-feature label ("this PR introduces a new feature to the codebase") on Nov 21, 2025
@dantengsky changed the title from "feat: enable Decimal64 handling throughout the Fuse-table Parquet deserialization pipeline" to "feat: enable Decimal64 handling in fuse table parquet deserialization" on Nov 21, 2025
@dantengsky changed the title from "feat: enable Decimal64 handling in fuse table parquet deserialization" to "feat: enable Decimal64 handling in fuse table deserialization" on Nov 24, 2025
@dantengsky force-pushed the feat/enable-decimal64 branch from 045bd9a to 2bdea1a on November 26, 2025 02:28
@dantengsky force-pushed the feat/enable-decimal64 branch from 2bdea1a to 9cfa359 on November 26, 2025 05:41
@dantengsky marked this pull request as ready for review on November 26, 2025 06:40
@dantengsky commented:

@codex review

@chatgpt-codex-connector (bot) left a comment:

💡 Codex Review

Here are some automated review suggestions for this pull request.

@dantengsky merged commit bc5b593 into databendlabs:main on Nov 26, 2025
171 of 173 checks passed