-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Use Display
formatting of DataType
:s in error messages
#17565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @emilk -- this is looking great. Once we get the CI to pass I think we can merge it in
Should be green now
|
3812ed5
to
6ef4be0
Compare
My personal favorite is |
I did, unfortunately …and others |
|
Reverted some error messages, and |
OMG IS GREEN |
Display
formatting of DataType
:s in error messagweDisplay
formatting of DataType
:s in error messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, Emil!
gogogogo! |
Thank you very much @emilk and @timsaucer |
This is part of an attempt to improve the error reporting of `arrow-rs`, `datafusion`, and any other 3rd party crates. I believe that error messages should be as readable as possible. Aim for `rustc` more than `gcc`. Here's an example of how this PR improves some existing error messages: Before: > Casting from Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Interval(DayTime), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false) to Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Duration(Second), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, true) not supported After: > Casting from Map(Field { "entries": Struct(key Utf8, value nullable Interval(DayTime)) }, false) to Map(Field { "entries": Struct(key Utf8, value Duration(Second)) }, true) not supported # Which issue does this PR close? - Closes #7048 - Continues and closes #7051 - Continues #7469 - More improvements coming in #8291 - Sibling PR: apache/datafusion#17565 - Part of #8351 # Rationale for this change DataType:s are often shown in error messages. Making these error messages readable is _very important_. # What changes are included in this PR? ## ~Unify `Debug` and `Display`~ ~The `Display` and `Debug` of `DataType` are now the SAME.~ ~Why? Both are frequently used in error messages (both in arrow, and `datafusion`), and both benefit from being readable yet reversible.~ Reverted based on PR feedback. I will try to improve the `Debug` formatting in a future PR, with clever use of https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_struct ## Improve `Display` of lists Improve the `Display` formatting of * `DataType::List` * `DataType::LargeList` * `DataType::FixedSizeList` **Before**: `List(Field { name: \"item\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })` **After**: `List(nullable Int32)` **Before**: `FixedSizeList(Field { name: \"item\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, 5)` **After**: `FixedSizeList(5 x Int32)` ### Better formatting of `DataType::Struct` The formatting of `Struct` is now reversible, including nullability and metadata. - Continues #7469 ### ~Improve `Debug` format of `Field`~ ~Best understood with this diff for an existing test:~ <img width="1140" height="499" alt="Screenshot 2025-09-07 at 18 30 44" src="https://github.com/user-attachments/assets/794b4de9-8459-4ee7-82d2-8f5ae248614c" /> **EDIT**: reverted # Are these changes tested? Yes - new tests cover them # Are there any user-facing changes? `Display/to_string` has changed, and so this is a **BREAKING CHANGE**. Care has been taken that the formatting contains all necessary information (i.e. is reversible), though the actual `FromStr` implementation is still not written (it is missing on `main`, and missing in this PR - so no change). ---- Let me know if I went to far… or not far enough 😆 --------- Co-authored-by: irenjj <[email protected]>
# Rationale for this change Despite us having `Display` implementations for `DataType`, a lot of error messages still use `Debug`. See for instance: * apache/datafusion#17565 * #8290 Therefor I want to make sure the `Debug` formatting of `Field` (and, by extension, `DataType`) is not _utterly horrible_. This PR makes things… slightly better. # What changes are included in this PR? Omits fields of `Field` that have their "default" values. # Are these changes tested? Yes, there are new tests. # Are there any user-facing changes? Though this changes the `Debug` formatting, I would NOT consider this a breaking change, because nobody should rely on consistent `Debug` formatting. See for instance https://doc.rust-lang.org/std/fmt/trait.Debug.html#stability
Display
forDataType
andField
arrow-rs#8290Rationale for this change
I believe that error messages should be as readable as possible. Aim for
rustc
more thangcc
.Display
is the nice, user-facing formatter.Debug
is for… well, debugging.What changes are included in this PR?
Change a bunch of
{:?}
format string to{}
. I'm sure I missed a lot of them, because I know of no way to enforce this withoutdisallowed_methods
on trait impls rust-lang/rust-clippy#8581Are these changes tested?
I assume CI runs
cargo test
:)Are there any user-facing changes?
Yes! Error messages should be a bit more readable now, especially once this lands:
Display
forDataType
andField
arrow-rs#8290