feat: Pytest benchmark for comparing against other engines locally #10
jiayuasu merged 9 commits into apache:main
Conversation
Just pushing this somewhere for now. There are other benchmark library options too, so we don't need to commit to this one. I just found this one very easy to set up and use.
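As a rough illustration of how low the setup cost is, a minimal pytest-benchmark test could look like the sketch below. This assumes the pytest-benchmark plugin is installed; the DuckDB engine and the specific query are my illustration, not the PR's actual suite.

```python
# Minimal pytest-benchmark sketch; assumes the pytest-benchmark plugin is
# installed so the `benchmark` fixture is available. The engine and query
# here are illustrative, not the PR's actual benchmark suite.
import duckdb


def test_st_area_points(benchmark):
    con = duckdb.connect()
    con.install_extension("spatial")
    con.load_extension("spatial")
    con.execute(
        "CREATE TABLE points AS SELECT ST_Point(0, 0) AS geom FROM range(10000)"
    )
    # benchmark() calls the function repeatedly and records timing statistics.
    benchmark(
        lambda: con.execute("SELECT SUM(ST_Area(geom)) FROM points").fetchall()
    )
```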
paleolimbot left a comment
Thanks!
Just a few suggestions to get started. The real work here is writing the actual queries and I'm happy to run them however works for you!
```python
# Set up tables
num_rows = 10000
create_points_query = f"CREATE TABLE points AS SELECT ST_GeomFromText('POINT(0 0)') AS geom FROM range({num_rows})"
```
The DBEngine subclass has this abstracted already such that you can create a table from a GeoParquet file or GeoPandas data frame. You can use the geoarrow_data fixture to write benchmarks against actual data, or you can use the sd_random_geometry() table function to generate it (Kristin's join integration tests are a great example).
Probably synthetic data makes sense here: points, segments (linestrings with a vertex count of 2), polygon, complex_linestring, complex_polygon. The number of batches could be configurable so that you can run tiny benchmarks or big benchmarks (this is what we do in Rust, too).
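For concreteness, table setup with the `sd_random_geometry()` table function might look roughly like the sketch below. The parameter names are guesses for illustration only, not the function's confirmed signature; Kristin's join integration tests show real usage.

```python
# Hypothetical sketch only: sd_random_geometry() is the project's table
# function mentioned above, but these parameter names are assumed, not
# confirmed. Adjust to match the actual signature.
num_batches = 10  # configurable, so the same suite can run tiny or big benchmarks

create_complex_polygon_query = f"""
CREATE TABLE complex_polygon AS
SELECT geom FROM sd_random_geometry('polygon', num_batches := {num_batches})
"""
```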
How exactly would you want to quantify complex vs non-complex?
It looks like you found vertices_per_linestring_range. I use the numbers 10 ("simple") and 500 ("complex") in the Rust benchmarks, which is sort of arbitrary but did the trick of weeding out predicate implementations that weren't using a prepared geometry (particularly when one side was a scalar). Totally optional!
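To illustrate why a 500-vertex geometry separates prepared from non-prepared predicate paths, here is a small shapely sketch. Using shapely here is my choice for illustration; the PR itself benchmarks SQL engines.

```python
# Illustration of the prepared-geometry effect described above, using shapely
# (my choice for the sketch; not part of the PR). With 500 vertices, repeated
# predicate calls get much cheaper once the geometry is prepared.
import random

import shapely
from shapely.geometry import LineString, Point

complex_line = LineString(
    [(random.random(), random.random()) for _ in range(500)]
)
shapely.prepare(complex_line)  # builds a cached index in place

# Repeated predicate against a scalar: the case where a non-prepared
# implementation shows up clearly in benchmarks.
hits = sum(complex_line.intersects(Point(0.5, 0.5)) for _ in range(1_000))
```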
Awesome!
Not that far! The tests are all about correctness and corner cases...here we can stick to the most common cases. Most functions shouldn't need more than one or two benchmarks (one on a simple geometry, which is a benchmark of our per-geometry overhead, and one for complex geometry, which is more a test of the underlying implementation).
No need to make it configurable now, but maybe rename the tables to
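One way to keep it to one simple and one complex benchmark per function is a parametrized test along these lines. This is a sketch: the `engine` fixture and the table names are hypothetical placeholders.

```python
# Sketch of the one-simple-plus-one-complex pattern suggested above.
# The `engine` fixture and table names are hypothetical placeholders.
import pytest


@pytest.mark.parametrize("table", ["linestring", "complex_linestring"])
def test_st_length(benchmark, engine, table):
    benchmark(engine.execute, f"SELECT SUM(ST_Length(geom)) FROM {table}")
```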
force-pushed from c58c4db to 3057d84
paleolimbot left a comment
This is a great start...thank you!
Can you add `benchmarks/README.md` (with the license header in a comment, because Apache) with a brief description of the benchmarks and how to run them?
Does this mean all the performance numbers we saw yesterday are wrong?
I don't dispute the diagnostics here, but I would be surprised if DuckDB's Python package was always configured to use one thread by default, and that this wasn't caught for the entire lifecycle of the 1.3 release. There are a number of things we need to consider on top of yesterday's numbers, including this!
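One quick way to settle this rather than guess is to inspect and pin the setting directly from Python. The duckdb calls below are standard API; pinning the thread count explicitly is just good benchmark hygiene.

```python
# Inspect DuckDB's thread setting from Python, then pin it explicitly so the
# benchmark is not at the mercy of the default.
import duckdb

con = duckdb.connect()
print(con.sql("SELECT current_setting('threads')").fetchone())

con.execute("SET threads TO 8")  # make the benchmark configuration explicit
```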
very good catch 😬




See the new `benchmarks/README.md` for how to run the benchmarks and what the output looks like.