@Jolanrensen Jolanrensen commented Sep 1, 2025

Fixes #998 and supersedes #999.

(Can either be for Beta3 or Beta4, whatever suits us best)

We again have three parts, but slightly modified from #999:

String-fallback for Char in convert

Char columns now fall back to String conversion in df.convert().to<>()... and col.convertTo<X>(). This means you can now do things like:

enum class EnumClass { A, B }

columnOf('A', 'B').convertTo<EnumClass>()
// previously you could only do this
columnOf('A', 'B').convertToString().convertTo<EnumClass>()
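
As a rough illustration of what the String-fallback amounts to per value, here is a plain-Kotlin sketch without the DataFrame library. `charToEnum` is a hypothetical helper made up for this example, not part of the library; it only assumes that the fallback behaves like "turn the Char into a one-character String, then look the enum constant up by name".

```kotlin
enum class EnumClass { A, B }

// Hypothetical helper (not DataFrame API): treat the Char as a
// one-character String and resolve the enum constant by name.
fun charToEnum(c: Char): EnumClass = enumValueOf<EnumClass>(c.toString())

fun main() {
    println(listOf('A', 'B').map(::charToEnum)) // prints [A, B]
}
```

A Char that does not match any constant name makes `enumValueOf` throw `IllegalArgumentException` in this sketch; how the library reports that case is not covered here.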

Note: We do keep the old convert behavior of Char -> Int by ASCII code. This is in line with the JVM world. We have parsing now if you want the Char '1' to turn into the Int 1.
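
To make the difference between the two interpretations concrete, here is a small standard-library-only sketch (no DataFrame calls). Note that `Char.code` returns the UTF-16 code unit, which coincides with the ASCII code for characters like '1'.

```kotlin
fun main() {
    val c = '1'

    // Conversion by character code: what Char -> Int conversion keeps doing.
    println(c.code) // prints 49

    // Reading the Char as a digit: what parsing is meant to give you.
    println(c.digitToInt()) // prints 1

    // Same result via the String route.
    println(c.toString().toInt()) // prints 1
}
```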

Converting a String column can only produce enum instances (as above), value class instances, or the result of parse(), so Char columns now follow the same behavior.

Char parsing

df.parse() now not only parses string columns, but also char ones (this can be changed if you don't agree, of course).
We also gain:

columnOf('1', '2', '3').parse() // results in DataColumn<Int>
columnOf('T', 'F', 'T').parse() // results in DataColumn<Boolean>

Here I deviate from #999, as

columnOf('a', 'b', 'c').parse() // throws IllegalStateException

// just like
columnOf("aa", "bb", "cc").parse() // throws IllegalStateException

because parsing did not change anything. However, as with String columns, there is tryParse(), which can return the same type as the input.
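
The parse()/tryParse() distinction can be modelled with a toy sketch in plain Kotlin. Everything here is hypothetical (`toyParsers`, `firstFullParse`, `toyParse`, `toyTryParse`); the real parser list and its order live in the library. The sketch only assumes that parsers are tried in order and that one is accepted only if it parses every value.

```kotlin
// Hypothetical, simplified stand-ins for the library's parsers.
val toyParsers: List<(String) -> Any?> = listOf(
    { s -> s.toIntOrNull() },
    { s ->
        when (s) {
            "T", "true" -> true
            "F", "false" -> false
            else -> null
        }
    },
)

// Result of the first parser that accepts every value, or null if none does.
fun firstFullParse(chars: List<Char>): List<Any>? {
    val strings = chars.map { it.toString() }
    for (parser in toyParsers) {
        val parsed = strings.map(parser)
        if (parsed.all { it != null }) return parsed.filterNotNull()
    }
    return null
}

// Like parse(): fails when no parser applies (error() throws IllegalStateException).
fun toyParse(chars: List<Char>): List<Any> =
    firstFullParse(chars) ?: error("Values could not be parsed to another type")

// Like tryParse(): falls back to the unchanged input.
fun toyTryParse(chars: List<Char>): List<Any> = firstFullParse(chars) ?: chars

fun main() {
    println(toyParse(listOf('1', '2', '3')))    // prints [1, 2, 3]
    println(toyParse(listOf('T', 'F', 'T')))    // prints [true, false, true]
    println(toyTryParse(listOf('a', 'b', 'c'))) // prints [a, b, c]
}
```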

convertTo<> { parser {} } Char support

Fixes #998 by trying out all provided String converters if Chars are encountered without a converter.

df.convertTo<Schema>() {
    // can now be used for both String and chars
    parser { /* it: String -> */ MyCustomClass(it) }

    // will be used instead of parser {} if present
    // convert<Char>().with { /* it: Char -> */ ... }
}
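
The lookup order described above can be sketched as a toy model in plain Kotlin. `ToyConverters` is hypothetical and exists only for this illustration; it assumes a single String parser, whereas the PR tries all provided String converters.

```kotlin
data class MyCustomClass(val raw: String)

// Hypothetical model of the lookup order, not the library's implementation.
class ToyConverters(
    val stringParser: ((String) -> Any)? = null,
    val charConverter: ((Char) -> Any)? = null,
) {
    fun convert(value: Char): Any =
        // An explicit Char converter wins; otherwise fall back to the String parser.
        charConverter?.invoke(value)
            ?: stringParser?.invoke(value.toString())
            ?: error("No converter available for Char")
}

fun main() {
    val onlyParser = ToyConverters(stringParser = { MyCustomClass(it) })
    println(onlyParser.convert('x')) // prints MyCustomClass(raw=x)

    val both = ToyConverters(
        stringParser = { MyCustomClass(it) },
        charConverter = { it.code },
    )
    println(both.convert('x')) // prints 120
}
```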

I also updated the docs and expanded the convertTo section considerably to prevent confusion like that seen in #998.

@Jolanrensen Jolanrensen force-pushed the implicit-char-toString-v2 branch from c0d17a6 to 3211749 Compare September 1, 2025 11:02
…generally a better explanation of how convertTo works
@Jolanrensen Jolanrensen force-pushed the implicit-char-toString-v2 branch from f785179 to 6d8d40a Compare September 1, 2025 19:22
@Jolanrensen Jolanrensen force-pushed the implicit-char-toString-v2 branch from 6d8d40a to a29f5c6 Compare September 1, 2025 19:37
@Jolanrensen Jolanrensen marked this pull request as ready for review September 1, 2025 19:37
@Jolanrensen Jolanrensen requested review from koperagen and zaleslaw and removed request for koperagen September 1, 2025 19:37
@Jolanrensen Jolanrensen modified the milestones: 1.0.0-Beta3, 1.0.0-Beta4 Sep 12, 2025
@Jolanrensen Jolanrensen requested review from zaleslaw, koperagen and AndreiKingsley and removed request for zaleslaw and koperagen September 18, 2025 09:56
Collaborator

@zaleslaw zaleslaw left a comment


Mostly clear to me. Great idea to provide the PR not only with tests but also with doc changes; it really helps to jump into the logic.

* `Instant` (kotlinx.datetime, kotlin.time, and java.time)
* `enum` classes (by name)

Note that converting between `Char` and `Int` is done by ASCII character code.
Collaborator


Let's make it a Warning or Note; it's a very important fact.

}

@Test
fun `parse to Char`() {
Collaborator


Should there also be an opposite test with columnOf('a', 'b'), where
col.parse().type() shouldBe typeOf<String>()?

Collaborator Author


No, that no longer happens. columnOf('a', 'b').parse() throws IllegalStateException because it cannot be parsed to any type other than Char. You could use tryParse(), but then the result would still be of type Char.

/**
* Tries to parse a column of chars as strings into a column of a different type.
* Each parser in [Parsers] is run in order until a valid parser is found,
* a.k.a. that parser was able to parse all values in the column successfully. If a parser
Collaborator


Great explanation!

Development

Successfully merging this pull request may close these issues.

Failed to convert single character to Enum using parser
2 participants