@vdmitrienko (Contributor) commented Oct 6, 2025


I hereby agree to the terms of the JUnit Contributor License Agreement.


#5028

Definition of Done

  .fieldSeparator(delimiter)
  .quoteCharacter(csvSource.quoteCharacter())
- .commentStrategy(csvSource.textBlock().isEmpty() ? NONE : SKIP);
+ .commentStrategy(csvSource.textBlock().isEmpty() ? NONE : SKIP)
Contributor Author

I’m concerned that using CsvSource#value results in an exception with delimiter = '#', even though the comment strategy for value is NONE, meaning that comments are not processed at all:

@ParameterizedTest
@CsvSource(value = "test", delimiter = '#')
void test(String str) {
}

@osiegmar, could we have FastCSV ignore a commentCharacter that duplicates the fieldSeparator or quoteCharacter when CommentStrategy.NONE is used?

@vdmitrienko I think so, yes! I can take a look at it tomorrow or the day after and let you know.

Member

This is possibly a dumb question and maybe not the right place to discuss this, but I'm going to ask anyway... 🙂

@osiegmar Even if CommentStrategy isn't NONE, couldn't FastCSV allow using the same char as commentCharacter and quoteCharacter since lines must start with commentCharacter in order to be treated as comments?

Member

Not a dumb question at all, because... I was thinking the exact same thing. 🤓

Is there a technical reason that FastCSV does not allow the comment character and delimiter (field separator) to be the same?

Or would supporting that, as the Univocity parser did, rather be a potential enhancement for FastCSV?

Supporting the same character for both field separator and comment character can lead to ambiguity.

Consider content like this:

foo,bar
foo,
,bar

Now, imagine the same content, but using # as both the field separator and the comment character:

foo#bar
foo#
#bar

The meaning of the last line is ambiguous. Is it a comment or a record with an empty first field and "bar" as the second field?

For the sake of clarity and reliability, FastCSV does not allow the same character to be used for both purposes at the same time.
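
To make the ambiguity concrete, here is a minimal, self-contained Java sketch. It is not FastCSV code: `CommentAmbiguityDemo`, `asRecord`, and `isComment` are hypothetical names chosen for illustration, and the splitter deliberately ignores quoting. It only shows that a line starting with `#` satisfies both readings when `#` is the field separator and the comment character at once.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a naive line reader, NOT FastCSV's implementation.
final class CommentAmbiguityDemo {

    // Reading 1: treat the line as a record and split on every separator,
    // keeping empty fields. Quoting is intentionally not handled.
    static List<String> asRecord(String line, char fieldSeparator) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == fieldSeparator) {
                fields.add(line.substring(start, i));
                start = i + 1;
            }
        }
        fields.add(line.substring(start));
        return fields;
    }

    // Reading 2: treat the line as a comment if it starts with the comment character.
    static boolean isComment(String line, char commentCharacter) {
        return !line.isEmpty() && line.charAt(0) == commentCharacter;
    }

    public static void main(String[] args) {
        // Same three lines as in the example above, with '#' in both roles.
        for (String line : List.of("foo#bar", "foo#", "#bar")) {
            System.out.println(line + " -> record " + asRecord(line, '#')
                    + ", comment? " + isComment(line, '#'));
        }
    }
}
```

For `#bar`, `asRecord` yields an empty first field followed by `bar`, while `isComment` also returns `true`. A parser would have to pick one reading by convention, which is the ambiguity described above.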

Member

Thanks for the example, @osiegmar.

I agree that the latter is ambiguous, even if it was supported previously in JUnit Jupiter.

And I admit: I personally find it a bit odd to use the same character for both (even if it previously worked).

So, perhaps it's best to:

  1. Document the change in behavior as a breaking change.
  2. Introduce support for configuring a custom comment character (as in this PR).

@vdmitrienko I’ve created the branch here: https://github.com/osiegmar/FastCSV/tree/feat/ignore-same-comment-char. Like you suggested, the comment character is ignored for CommentStrategy.NONE.

If everything looks good on your end, I can go ahead and release it quickly.
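
For readers following along, the rule discussed in this thread can be sketched roughly as below. This is an assumption-laden illustration, not the code from the linked branch: `ControlCharacterCheck`, `validate`, and the nested `Strategy` enum are hypothetical stand-ins, and the real library may structure its checks differently.

```java
// Illustration only: hypothetical names, NOT the FastCSV implementation.
final class ControlCharacterCheck {

    // Stand-in for a comment strategy; only the NONE case is special here.
    enum Strategy { NONE, SKIP, READ }

    // Rejects a comment character that clashes with another control character,
    // unless comments are disabled and the comment character is never consulted.
    static void validate(char fieldSeparator, char quoteCharacter,
                         char commentCharacter, Strategy strategy) {
        if (strategy == Strategy.NONE) {
            return; // comment character is unused, so a clash is harmless
        }
        if (commentCharacter == fieldSeparator || commentCharacter == quoteCharacter) {
            throw new IllegalArgumentException(
                    "comment character must differ from field separator and quote character");
        }
    }

    public static void main(String[] args) {
        // delimiter = '#' with comments disabled: accepted.
        validate('#', '"', '#', Strategy.NONE);
        System.out.println("NONE: accepted");

        // Same characters with comments enabled: rejected as ambiguous.
        try {
            validate('#', '"', '#', Strategy.SKIP);
        } catch (IllegalArgumentException e) {
            System.out.println("SKIP: rejected (" + e.getMessage() + ")");
        }
    }
}
```

Under this sketch, the `@CsvSource(value = "test", delimiter = '#')` case from earlier in the thread would pass validation, because the strategy for `value` is NONE.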
