-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Introduce commentCharacter
property in @Csv{File}Source
#5031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Introduce commentCharacter
property in @Csv{File}Source
#5031
Conversation
.fieldSeparator(delimiter) | ||
.quoteCharacter(csvSource.quoteCharacter()) | ||
.commentStrategy(csvSource.textBlock().isEmpty() ? NONE : SKIP); | ||
.commentStrategy(csvSource.textBlock().isEmpty() ? NONE : SKIP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m concerned that using CsvSource#value
results in an exception with delimiter = '#'
, even though the comment strategy for value
is NONE
, meaning that all comments ignored:
@ParameterizedTest
@CsvSource(value = "test", delimiter = '#')
void test(String str) {
}
@osiegmar, could we have FastCSV ignore a commentCharacter
that duplicates the fieldSeparator
or quoteCharacter
when CommentStrategy.NONE
is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vdmitrienko I think so, yes! I can take a look at it tomorrow or the day after and let you know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possibly a dumb question and maybe not the right place to discuss this, but I'm going to ask anyway... 🙂
@osiegmar Even if CommentStrategy
isn't NONE
, couldn't FastCSV allow using the same char as commentCharacter
and quoteCharacter
since lines must start with commentCharacter
in order to be treated as comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a dumb question at all, because... I was thinking the exact same thing. 🤓
Is there a technical reason that FastCSV does not allow the comment character and delimiter (field separator) to be the same?
Or is that rather a potential enhancement for FastCSV to support that like the Univocity parser did?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supporting the same character for both field separator and comment character can lead to ambiguity.
Consider content like this:
foo,bar
foo,
,bar
Now, imagine the same content, but using #
as both the field separator and the comment character:
foo#bar
foo#
#bar
The meaning in the last line is ambiguous. Is it a comment or a record with an empty first field and "bar" as the second field?
For the sake of clarity and reliability, FastCSV does not allow to use the same character for both purposes at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the example, @osiegmar.
I agree that the latter is ambiguous, even if it was supported previously in JUnit Jupiter.
And I admit: I personally find it a bit odd to use the same character for both (even if it previously worked).
So, perhaps it's best to:
- Document the change in behavior as a breaking change.
- Introduce support for configuring a custom comment character (as in this PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vdmitrienko I’ve created the branch here: https://github.com/osiegmar/FastCSV/tree/feat/ignore-same-comment-char. Like you suggested, the comment character is ignored for CommentStrategy.NONE
.
If everything looks good on your end, I can go ahead and release it quickly.
I hereby agree to the terms of the JUnit Contributor License Agreement.
#5028
Definition of Done
@API
annotations