Add ssv output format #3306

Open
wants to merge 7 commits into
base: master

HenrikHolst

Currently jq supports output in CSV format. However, in many parts of Europe "CSV" files are in fact semicolon-separated, not comma-separated. So I've added a small patch that adds support for semicolon-separated CSV files (SSV), and that also clarifies in the doc and man page that the CSV format is the comma-separated one.

Added support to write @ssv files (semicolon separated CSV files)
Added info in the man page about the @ssv format and also clarified that @csv is comma separated.
Added info about the @ssv format in the manual and also clarified that @csv is comma separated.
Added a test case for the @ssv format
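
To make the intended output concrete, here is a minimal Python sketch of the row rendering. It is not the C code from the patch. It assumes @ssv follows the same rules jq documents for @csv (strings quoted, embedded quotes doubled, numbers written bare, null as an empty field) with only the separator changed; `format_ssv_row` is a hypothetical name used for illustration.

```python
def format_ssv_row(row):
    """Render one row as semicolon-separated values (hypothetical helper)."""
    fields = []
    for value in row:
        if value is None:
            fields.append("")  # null becomes an empty field
        elif isinstance(value, bool):
            # Checked before numbers, because bool is a subclass of int.
            fields.append("true" if value else "false")
        elif isinstance(value, (int, float)):
            fields.append(str(value))  # numbers are written bare
        elif isinstance(value, str):
            # Strings are always quoted; embedded quotes are doubled.
            fields.append('"' + value.replace('"', '""') + '"')
        else:
            # Nested arrays and objects have no flat representation.
            raise TypeError(f"{type(value).__name__} is not valid in an SSV row")
    return ";".join(fields)


print(format_ssv_row(["Widget; large", 3, None, True, 'say "hi"']))
# prints: "Widget; large";3;;true;"say ""hi"""
```

Because every string is quoted, a semicolon inside a value needs no special treatment.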
@HenrikHolst
Author

I don't fully understand what went wrong with the man page check. If anyone could point me in the right direction, I would be happy to fix it.

@itchyny
Contributor

itchyny commented Apr 7, 2025

I'm afraid ssv is not a common technical term for this format. I think it's a kind of DSV (delimiter-separated values) (oh, I've never heard of this, but the term seems to appear in The Art of Unix Programming). Quick googling of ssv leads me to space-separated values or separator-separated values, or even something-separated values. CSV is very common and has an RFC specification (RFC 4180); we'd like to avoid uncommon terms and specifications.

@HenrikHolst
Author

> I'm afraid ssv is not a common technical term for this format. I think it's a kind of DSV (delimiter-separated values) (oh, I've never heard of this, but the term seems to appear in The Art of Unix Programming). Quick googling of ssv leads me to space-separated values or separator-separated values, or even something-separated values. CSV is very common and has an RFC specification (RFC 4180); we'd like to avoid uncommon terms and specifications.

I was looking for a term to use, since here in Europe the semicolon variant is usually just called CSV. Despite CSV having an RFC, Microsoft, in their magical thinking, made Excel use the semicolon as the CSV separator over here, because we use the comma as the decimal separator.

At first I was thinking about something like "csv-semi", or perhaps a setting for the delimiter, but then I stumbled across sites like this: https://www.dell.com/support/manuals/sv-se/dell-opnmang-srvr-admin-v7.4/omsa_cli-v3/semicolon-separated-values-ssv?guid=guid-7ae0aea0-ff58-49aa-b2f5-3931eed1b033&lang=en-us as well as numerous GitHub issues asking various projects to support the SSV format. So I concluded that SSV, although ugly, seemed to be a well-known name for the format.

DSV is "something else". SSV is exactly CSV with the sole difference that ; is used instead of ,. All other rules, such as escaping only with ", are the same (because the source was MS Excel once upon a time).
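
As a rough illustration of that point, the Python sketch below parses semicolon-separated text with the standard `csv` module, changing nothing but the delimiter. The sample data is made up to resemble what Excel might export in a decimal-comma locale, and `read_ssv` is a hypothetical helper name.

```python
import csv
import io

# Made-up sample export; note the decimal commas in the price column.
data = 'name;price\r\n"Widget; large";3,14\r\n"Say ""hi""";0,5\r\n'


def read_ssv(text):
    """Parse semicolon-separated text into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=";"))


rows = read_ssv(data)
header, records = rows[0], rows[1:]

# Decimal commas arrive as plain text, so convert them explicitly.
prices = [float(price.replace(",", ".")) for _, price in records]

print(records)  # [['Widget; large', '3,14'], ['Say "hi"', '0,5']]
print(prices)   # [3.14, 0.5]
```

The decimal commas need no quoting here, whereas in a comma-separated file a value such as 3,14 would have to be quoted to stay in a single field.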

But the name is for me unimportant and I will happily change it to anything, it's the functionality that I'm after.

@pkoppstein
Contributor

Would it be feasible to support @dsv(delim), so that one could write, for example, @dsv(";")
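
The shape of such a parameterized format can be sketched in Python (this is not jq syntax, and jq has no such format today). The sketch assumes a hypothetical `dsv(delim)` factory that returns a row formatter and relies on the standard `csv` module for quoting.

```python
import csv
import io


def dsv(delim):
    """Return a row formatter for a one-character delimiter (hypothetical)."""
    # Reject delimiters that would collide with quoting or line endings.
    if len(delim) != 1 or delim in '"\r\n':
        raise ValueError(f"unsupported delimiter: {delim!r}")

    def format_row(row):
        buffer = io.StringIO()
        # QUOTE_NONNUMERIC quotes every non-numeric field and doubles quotes.
        writer = csv.writer(buffer, delimiter=delim, quoting=csv.QUOTE_NONNUMERIC)
        writer.writerow(row)
        return buffer.getvalue().rstrip("\r\n")  # drop the line terminator

    return format_row


print(dsv(";")(["a;b", 1, 'x"y']))  # "a;b";1;"x""y"
print(dsv(",")(["a;b", 1, 'x"y']))  # "a;b",1,"x""y"
```

With this design a single implementation covers comma, semicolon, and tab output, and only the delimiter needs validating.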

@HenrikHolst
Author

> Would it be feasible to support @dsv(delim), so that one could write, for example, @dsv(";")

I took a quick look at this and there are some issues. For one, bash is not happy with (), so the format has to go inside single quotes, like '@dsv()'. Once you do that, jq complains about a syntax error, since it does not accept the () at all. jq is also not happy with any other separator, such as ;, :, - or ,. So jq appears to have a very strict syntax here (which is of course good), and that makes it very hard to pass arguments to any of the output formats.

This is why I still think @ssv is the better option, or perhaps @csv_semi or @csv_ssv. I'm open to all suggestions here.

@pkoppstein
Contributor

Since @ssv is ambiguous (between space and semicolon) and @dsv(delim) presents implementation issues, the obvious choice is @scsv. This is confirmed by ChatGPT's advice:

The most consistent and intuitive abbreviation for "semicolon-separated values" in the context of DSV (Delimiter-Separated Values) is:

SCSV

This follows the same pattern as:

CSV – Comma-Separated Values

TSV – Tab-Separated Values

SSV – Space-Separated Values

So:

SCSV – Semicolon-Separated Values

This abbreviation is already in informal use in some data communities and tools, even if it's not as standardized or widespread as CSV/TSV.

3 participants