Allow caching LayoutReader in VortexFile#8244
Conversation
204794d to
a9411ec
Compare
a9411ec to
81ca290
Compare
Merging this PR will degrade performance by 20.42%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_bool_canonical_into[(1000, 10)] |
31.7 µs | 46.8 µs | -32.12% |
| ❌ | Simulation | chunked_varbinview_canonical_into[(1000, 10)] |
161.8 µs | 198 µs | -18.3% |
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] |
176.8 µs | 213.2 µs | -17.07% |
| ❌ | Simulation | encode_varbin[(1000, 2)] |
143 µs | 164 µs | -12.78% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing myrrc/random-access-layout-reader (cba659b) with develop (d2d2e29)
3e959d2 to
96e9c3c
Compare
Polar Signals Profiling ResultsLatest Run
Previous Runs (2)
Powered by Polar Signals Cloud |
Benchmarks: PolarSignals ProfilingVortex (geomean): 0.994x ➖ How to read Verdict and Engines
datafusion / vortex-file-compressed (0.994x ➖, 1↑ 0↓)
No file size changes detected. |
Benchmarks: FineWeb NVMeVerdict: No clear signal (low confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (0.997x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.992x ➖, 0↑ 0↓)
datafusion / parquet (1.016x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.001x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.034x ➖, 0↑ 1↓)
duckdb / parquet (1.030x ➖, 0↑ 1↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
|
Benchmarks: TPC-H SF=1 on NVMEVerdict: No clear signal (environment too noisy confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (1.185x ❌, 0↑ 21↓)
datafusion / vortex-compact (1.168x ❌, 0↑ 22↓)
datafusion / parquet (1.118x ❌, 0↑ 14↓)
datafusion / arrow (1.291x ❌, 0↑ 21↓)
duckdb / vortex-file-compressed (1.170x ❌, 0↑ 21↓)
duckdb / vortex-compact (1.158x ❌, 0↑ 21↓)
duckdb / parquet (1.085x ➖, 0↑ 11↓)
duckdb / duckdb (1.121x ❌, 0↑ 14↓)
File Size Changes (10 files changed, -0.0% overall, 5↑ 5↓)
Totals:
|
Benchmarks: TPC-DS SF=1 on NVMEVerdict: No clear signal (low confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (0.991x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.990x ➖, 2↑ 4↓)
datafusion / parquet (0.988x ➖, 2↑ 2↓)
duckdb / vortex-file-compressed (0.990x ➖, 0↑ 1↓)
duckdb / vortex-compact (0.990x ➖, 1↑ 1↓)
duckdb / parquet (0.998x ➖, 1↑ 1↓)
duckdb / duckdb (0.998x ➖, 0↑ 1↓)
File Size Changes (6 files changed, -0.0% overall, 3↑ 3↓)
Totals:
|
Benchmarks: FineWeb S3Verdict: No clear signal (environment too noisy confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (1.035x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.149x ➖, 0↑ 2↓)
datafusion / parquet (1.331x ❌, 0↑ 5↓)
duckdb / vortex-file-compressed (1.109x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.051x ➖, 0↑ 0↓)
duckdb / parquet (1.055x ➖, 0↑ 0↓)
|
Benchmarks: Statistical and Population GeneticsVerdict: No clear signal (low confidence) How to read Verdict and Engines
duckdb / vortex-file-compressed (1.003x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.006x ➖, 0↑ 0↓)
duckdb / parquet (1.015x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
|
Benchmarks: Random AccessVortex (geomean): 0.909x ➖ How to read Verdict and Engines
unknown / unknown (0.988x ➖, 11↑ 2↓)
|
Benchmarks: TPC-H SF=10 on NVMEVerdict: No clear signal (low confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (0.994x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.991x ➖, 0↑ 0↓)
datafusion / parquet (0.989x ➖, 0↑ 0↓)
datafusion / arrow (0.991x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.988x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.991x ➖, 0↑ 0↓)
duckdb / parquet (0.991x ➖, 0↑ 0↓)
duckdb / duckdb (0.994x ➖, 0↑ 0↓)
File Size Changes (26 files changed, +0.0% overall, 17↑ 9↓)
Totals:
|
Benchmarks: Clickbench on NVMEVerdict: No clear signal (environment too noisy confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (0.993x ➖, 2↑ 1↓)
datafusion / parquet (0.993x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (1.009x ➖, 1↑ 2↓)
duckdb / parquet (0.990x ➖, 2↑ 0↓)
duckdb / duckdb (1.005x ➖, 1↑ 2↓)
File Size Changes (106 files changed, +0.0% overall, 56↑ 50↓)
Totals:
|
Benchmarks: TPC-H SF=1 on S3Verdict: No clear signal (environment too noisy confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (1.224x ➖, 0↑ 7↓)
datafusion / vortex-compact (1.233x ➖, 0↑ 8↓)
datafusion / parquet (1.126x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.125x ➖, 0↑ 2↓)
duckdb / vortex-compact (1.151x ➖, 0↑ 3↓)
duckdb / parquet (1.060x ➖, 0↑ 0↓)
|
Benchmarks: Appian on NVMEVerdict: No clear signal (low confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (1.007x ➖, 0↑ 0↓)
datafusion / parquet (0.997x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.999x ➖, 0↑ 0↓)
duckdb / parquet (0.996x ➖, 0↑ 0↓)
duckdb / duckdb (0.995x ➖, 0↑ 0↓)
File Size Changes (4 files changed, -0.0% overall, 1↑ 3↓)
Totals:
|
Benchmarks: CompressionVortex (geomean): 0.995x ➖ How to read Verdict and Engines
unknown / unknown (1.011x ➖, 1↑ 2↓)
|
Benchmarks: TPC-H SF=10 on S3Verdict: No clear signal (environment too noisy confidence) How to read Verdict and Engines
datafusion / vortex-file-compressed (1.141x ➖, 0↑ 4↓)
datafusion / vortex-compact (1.165x ➖, 0↑ 5↓)
datafusion / parquet (1.163x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.059x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.088x ➖, 0↑ 1↓)
duckdb / parquet (1.123x ➖, 0↑ 2↓)
|
robert3005
left a comment
There was a problem hiding this comment.
I think you want to make it a oncelock on VortexFile
96e9c3c to
4c4e6c4
Compare
4c4e6c4 to
7fafb90
Compare
Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
7fafb90 to
cba659b
Compare
| pub fn new( | ||
| footer: Footer, | ||
| segment_source: Arc<dyn SegmentSource>, | ||
| session: VortexSession, | ||
| ) -> Self { | ||
| Self { | ||
| ) -> VortexResult<Self> { | ||
| Ok(Self { | ||
| footer, | ||
| segment_source, | ||
| session, | ||
| } | ||
| layout_reader_cache: None, | ||
| }) | ||
| } | ||
|
|
||
| /// Create a new VortexFile with layout reader caching | ||
| pub fn new_caching( | ||
| footer: Footer, | ||
| segment_source: Arc<dyn SegmentSource>, | ||
| session: VortexSession, | ||
| ) -> VortexResult<Self> { | ||
| Ok(Self { | ||
| footer, | ||
| segment_source, | ||
| session, | ||
| layout_reader_cache: Some(OnceLock::new()), | ||
| }) |
There was a problem hiding this comment.
new(..).with_caching()?
Not-reinitializing LayoutReader saves us some time.
As a related change, make root() static throughout the program. Before ScanBuilder::new() created a new root()
which didn't compare with the old one pointer-eq-wise which was an issue