Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/source/contributor-guide/howtos.md
Original file line number Diff line number Diff line change
Expand Up @@ -187,4 +187,4 @@ valid installation of [protoc] (see [installation instructions] for details).
```

[protoc]: https://github.com/protocolbuffers/protobuf#protocol-compiler-installation
[installation instructions]: https://datafusion.apache.org/contributor-guide/getting_started.html#protoc-installation
[installation instructions]: https://datafusion.apache.org/contributor-guide/development_environment.html#protoc-installation
2 changes: 1 addition & 1 deletion docs/source/library-user-guide/extending-sql.md
Original file line number Diff line number Diff line change
Expand Up @@ -334,7 +334,7 @@ SELECT * FROM sales
[`executionplan`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
[`sessioncontext`]: https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html
[`sessionstatebuilder`]: https://docs.rs/datafusion/latest/datafusion/execution/session_state/struct.SessionStateBuilder.html
[`relationplannercontext`]: https://docs.rs/datafusion/latest/datafusion/sql/planner/trait.RelationPlannerContext.html
[`relationplannercontext`]: https://docs.rs/datafusion/latest/datafusion/logical_expr/planner/trait.RelationPlannerContext.html
[exprplanner api documentation]: https://docs.rs/datafusion/latest/datafusion/logical_expr/planner/trait.ExprPlanner.html
[typeplanner api documentation]: https://docs.rs/datafusion/latest/datafusion/logical_expr/planner/trait.TypePlanner.html
[relationplanner api documentation]: https://docs.rs/datafusion/latest/datafusion/logical_expr/planner/trait.RelationPlanner.html
Expand Down
1 change: 0 additions & 1 deletion docs/source/library-user-guide/functions/adding-udfs.md
Original file line number Diff line number Diff line change
Expand Up @@ -583,7 +583,6 @@ For async UDF implementation details, see [`async_udf.rs`](https://github.com/ap

[`scalarudf`]: https://docs.rs/datafusion/latest/datafusion/logical_expr/struct.ScalarUDF.html
[`create_udf`]: https://docs.rs/datafusion/latest/datafusion/logical_expr/fn.create_udf.html
[`process_scalar_func_inputs`]: https://docs.rs/datafusion/latest/datafusion/physical_expr/functions/fn.process_scalar_func_inputs.html
[`advanced_udf.rs`]: https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/udf/advanced_udf.rs

## Named Arguments
Expand Down
6 changes: 3 additions & 3 deletions docs/source/library-user-guide/table-constraints.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,6 @@ They are provided for informational purposes and can be used by custom
- **Foreign keys and check constraints**: These constraints are parsed
but are not validated or used during query planning.

[`tableconstraint`]: https://docs.rs/datafusion/latest/datafusion/sql/planner/enum.TableConstraint.html
[`constraints`]: https://docs.rs/datafusion/latest/datafusion/common/functional_dependencies/struct.Constraints.html
[`field`]: https://docs.rs/arrow/latest/arrow/datatype/struct.Field.html
[`tableconstraint`]: https://docs.rs/datafusion/latest/datafusion/logical_expr/sqlparser/ast/enum.TableConstraint.html
[`constraints`]: https://docs.rs/datafusion/latest/datafusion/common/struct.Constraints.html
[`field`]: https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html
8 changes: 4 additions & 4 deletions docs/source/user-guide/arrow-introduction.md
Original file line number Diff line number Diff line change
Expand Up @@ -226,8 +226,8 @@ When working with Arrow and RecordBatches, watch out for these common issues:
[`field`]: https://docs.rs/arrow-schema/latest/arrow_schema/struct.Field.html
[`schema`]: https://docs.rs/arrow-schema/latest/arrow_schema/struct.Schema.html
[`datatype`]: https://docs.rs/arrow-schema/latest/arrow_schema/enum.DataType.html
[`int32array`]: https://docs.rs/arrow-array/latest/arrow_array/array/struct.Int32Array.html
[`stringarray`]: https://docs.rs/arrow-array/latest/arrow_array/array/struct.StringArray.html
[`int32array`]: https://docs.rs/arrow/latest/arrow/array/type.Int32Array.html
[`stringarray`]: https://docs.rs/arrow/latest/arrow/array/type.StringArray.html
[`int32`]: https://docs.rs/arrow-schema/latest/arrow_schema/enum.DataType.html#variant.Int32
[`int64`]: https://docs.rs/arrow-schema/latest/arrow_schema/enum.DataType.html#variant.Int64
[extension points]: ../library-user-guide/extensions.md
Expand All @@ -241,8 +241,8 @@ When working with Arrow and RecordBatches, watch out for these common issues:
[`.show()`]: https://docs.rs/datafusion/latest/datafusion/dataframe/struct.DataFrame.html#method.show
[`memtable`]: https://docs.rs/datafusion/latest/datafusion/datasource/struct.MemTable.html
[`sessioncontext`]: https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html
[`csvreadoptions`]: https://docs.rs/datafusion/latest/datafusion/execution/options/struct.CsvReadOptions.html
[`parquetreadoptions`]: https://docs.rs/datafusion/latest/datafusion/execution/options/struct.ParquetReadOptions.html
[`csvreadoptions`]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/options/struct.CsvReadOptions.html
[`parquetreadoptions`]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/options/struct.ParquetReadOptions.html
[`recordbatch`]: https://docs.rs/arrow-array/latest/arrow_array/struct.RecordBatch.html
[`read_csv`]: https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html#method.read_csv
[`read_parquet`]: https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html#method.read_parquet
Expand Down
8 changes: 3 additions & 5 deletions docs/source/user-guide/concepts-readings-events.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@

## 🧭 Background Concepts

- **2024-06-13**: [2024 ACM SIGMOD International Conference on Management of Data: Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine](https://dl.acm.org/doi/10.1145/3626246.3653368) - [Download](http://andrew.nerdnetworks.org/other/SIGMOD-2024-lamb.pdf), [Talk](https://youtu.be/-DpKcPfnNms), [Slides](https://docs.google.com/presentation/d/1gqcxSNLGVwaqN0_yJtCbNm19-w5pqPuktII5_EDA6_k/edit#slide=id.p), [Recording ](https://youtu.be/-DpKcPfnNms)
- **2024-06-13**: [2024 ACM SIGMOD International Conference on Management of Data: Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine](https://dl.acm.org/doi/10.1145/3626246.3653368) - [Download](https://andrew.nerdnetworks.org/pdf/SIGMOD-2024-lamb.pdf), [Talk](https://youtu.be/-DpKcPfnNms), [Slides](https://docs.google.com/presentation/d/1gqcxSNLGVwaqN0_yJtCbNm19-w5pqPuktII5_EDA6_k/edit#slide=id.p), [Recording ](https://youtu.be/-DpKcPfnNms)

- **2024-06-07**: [Video: SIGMOD 2024 Practice: Apache Arrow DataFusion A Fast, Embeddable, Modular Analytic Query Engine](https://www.youtube.com/watch?v=-DpKcPfnNms&t=5s) - [Slides](https://docs.google.com/presentation/d/1gqcxSNLGVwaqN0_yJtCbNm19-w5pqPuktII5_EDA6_k/edit#slide=id.p)

Expand Down Expand Up @@ -87,16 +87,14 @@ This is a list of DataFusion related blog posts, articles, and other resources.

- **2024-10-29** [Video: MiDAS Seminar Fall 2024 on "Apache DataFusion" by Andrew Lamb](https://www.youtube.com/watch?v=CpnxuBwHbUc)

- **2024-10-27** [Blog: Caching in DataFusion: Don't read twice](https://blog.haoxp.xyz/posts/caching-datafusion)
- **2024-10-27** [Blog: Caching in DataFusion: Don't read twice](https://blog.xiangpeng.systems/posts/caching-datafusion/)

- **2024-10-24** [Blog: Parquet pruning in DataFusion: Read no more than you need](https://blog.haoxp.xyz/posts/parquet-to-arrow/)
- **2024-10-24** [Blog: Parquet pruning in DataFusion: Read no more than you need](https://blog.xiangpeng.systems/posts/parquet-to-arrow/)

- **2024-09-13** [Blog: Using StringView / German Style Strings to make Queries Faster: Part 2 - String Operations](https://www.influxdata.com/blog/faster-queries-with-stringview-part-two-influxdb/) | [Reposted on DataFusion Blog](https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-2/)

- **2024-09-13** [Blog: Using StringView / German Style Strings to Make Queries Faster: Part 1- Reading Parquet](https://www.influxdata.com/blog/faster-queries-with-stringview-part-one-influxdb/) | [Reposted on Datafusion Blog](https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-1/)

- **2024-10-16** [Blog: Candle Image Segmentation](https://www.letsql.com/posts/candle-image-segmentation/)

- **2024-09-23 → 2024-12-02** [Talks: Carnegie Mellon University: Database Building Blocks Seminar Series - Fall 2024](https://db.cs.cmu.edu/seminar2024/)

- **2024-11-12** [Video: Building InfluxDB 3.0 with the FDAP Stack: Apache Flight, DataFusion, Arrow and Parquet (Paul Dix)](https://www.youtube.com/watch?v=AGS4GNGDK_4)
Expand Down
2 changes: 1 addition & 1 deletion docs/source/user-guide/example-usage.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ exported by DataFusion, for example:
use datafusion::arrow::datatypes::Schema;
```

For example, [DataFusion `25.0.0` dependencies] require `arrow`
For example, [DataFusion `26.0.0` dependencies] require `arrow`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While updating the DataFusion version to 26.0.0 is a good step, this change makes the following line about the required arrow version incorrect. According to crates.io, DataFusion 26.0.0 depends on arrow 40.0.0, not 39.0.0 as stated on the next line.

Please update the arrow versions on line 107 to maintain the example's accuracy. A correct statement would be:

For example, [DataFusion `26.0.0` dependencies] require `arrow`
`40.0.0`. If instead you used `arrow` `41.0.0` in your project you may
see errors such as:

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:documentation; feedback:The Gemini AI reviewer is correct! If the example needs to use DataFusion 26.0.0 then the version of its arrow dependency should be increased too. DataFusion 26.0.0 depends on 40.0.0, not on 39.0.0.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example says DataFusion 26.0.0 requires arrow 39.0.0, but docs.rs for datafusion 26.0.0 lists arrow ^40.0.0. This mismatch could mislead users trying to align dependency versions.

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:documentation; feedback:The Augment AI reviewer is correct! If the example needs to use DataFusion 26.0.0 then the version of its arrow dependency should be increased too. DataFusion 26.0.0 depends on 40.0.0, not on 39.0.0.

`39.0.0`. If instead you used `arrow` `40.0.0` in your project you may
see errors such as:

Expand Down