CSV.parse - Support Headers #15170

Open
stellarpower opened this issue Nov 6, 2024 · 3 comments
Labels
good first issue This is an issue suited for newcomers to become acquainted with working on the codebase. kind:feature topic:stdlib:text

Comments

@stellarpower
Contributor

Feature Request

Whilst it's good that CSV is implemented as an efficient parser that iterates over data as it's needed, sometimes it's convenient to read the lot in as one big array of arrays.

The constructor for CSV has an optional headers argument for handling a header row automatically, whilst the parse static method returns an array of arrays of strings for grabbing the thing as a table. Often I'd like to get a table in one line, but also use a header row.

So I'd like to propose adding this to the parse method, and I think the most suitable option would be, if headers is true in parse, to return an array of hashes, so that I can index by row and then extract the cell through its column key from the hash.
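For reference, a minimal sketch of what `CSV.parse` returns today and how the requested array of hashes can be built from it by hand. The sample data and the variable names are illustrative assumptions, not taken from any real project.

```crystal
require "csv"

# Illustrative sample data (an assumption for this sketch).
data = "Name,Age\nJohn,20\nPeter,30"

# Current behaviour: every row is an Array(String), and the header row
# is simply the first element of the result.
rows = CSV.parse(data)

# Manual workaround: pair each data row with the header row.
header = rows.first
table = rows[1..].map { |row| header.zip(row).to_h }

# Index by row, then extract the cell by its column key.
puts table[1]["Age"] # prints 30
```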

Thanks

@Blacksmoke16 Blacksmoke16 added good first issue This is an issue suited for newcomers to become acquainted with working on the codebase. topic:stdlib:text labels Nov 6, 2024
@Blacksmoke16
Member

Would probably have to be a new overload, as adding it to the existing .parse method would change its return type and be a breaking change.

@AgileIndustrialComplex

@stellarpower you should be able to achieve this with code like

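require "csv"

# strip: true trims the space after each comma, so values are "20" rather than " 20"
csv = CSV.new("Name, Age\nJohn, 20\nPeter, 30", headers: true, strip: true)
rows = [] of Hash(String, String)
# Collect each data row as a hash keyed by the header row
while csv.next
  rows << csv.row.to_h
end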
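The loop above can be wrapped in a small helper to get the one-line call the issue asks for without changing the return type of `CSV.parse`. This is only a sketch: `CSVTable` and its `parse` method are hypothetical names and are not part of the standard library.

```crystal
require "csv"

# Hypothetical helper module (not in the standard library).
module CSVTable
  # Parses the whole input and returns one Hash per data row,
  # keyed by the values of the header row.
  def self.parse(string_or_io : String | IO, strip = false) : Array(Hash(String, String))
    rows = [] of Hash(String, String)
    csv = CSV.new(string_or_io, headers: true, strip: strip)
    while csv.next
      rows << csv.row.to_h
    end
    rows
  end
end

table = CSVTable.parse("Name, Age\nJohn, 20\nPeter, 30", strip: true)
puts table[0]["Name"] # prints John
```

Keeping this as a separate method leaves the existing `Array(Array(String))` return type of `CSV.parse` untouched.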

@stellarpower
Contributor Author

I can, thanks, and in my project I extended the base class to do it; it's just nice to have a one-liner in the standard library. I have to jump through multiple languages in a day, and I'd say 80+% of the Crystal I write involves CSV, JSON/YAML, or less often XML, and it's nice to be able to read in things like that with minimal overhead. If I've just been working in Python, say, either I need to flip my brain and remember it, or it's a bit of extra typing that adds noise (File.read is vastly nicer than with open(...) as ..., and don't even get me started on simply flattening a list!).

I guess similarly a file keyword argument would be nice, so I don't have to bother opening it myself either. And it'd also be convenient to take an array of arrays of (things that are safe to serialise) or an array of hashes to (things that are safe to serialise) and just output that to file without needing to use the builder and iterate. Perhaps there's a more general discussion here on CSV. I know there have been efforts to keep the standard library slim, but also these one-line solutions to things that mean there's less you need to hold in your brain are part of what always made Crystal and Ruby nice to work in.
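As a rough sketch of what such file-level conveniences could look like with the existing `CSV.parse` and `CSV.build` methods: `CSVFile`, `read`, and `write` are hypothetical names, and restricting values to `String` is a simplifying assumption made for this example.

```crystal
require "csv"

# Hypothetical helper module (not in the standard library).
module CSVFile
  # Reads a whole CSV file into an array of rows.
  def self.read(path : String) : Array(Array(String))
    File.open(path) { |io| CSV.parse(io) }
  end

  # Writes an array of hashes, using the keys of the first hash as the
  # header row. Writes nothing if rows is empty; raises KeyError if a
  # later hash lacks one of those keys.
  def self.write(path : String, rows : Array(Hash(String, String))) : Nil
    return if rows.empty?
    headers = rows.first.keys
    File.open(path, "w") do |io|
      CSV.build(io) do |csv|
        csv.row(headers)
        rows.each { |row| csv.row(headers.map { |key| row[key] }) }
      end
    end
  end
end

# Round trip through a temporary file.
path = File.tempname("people", ".csv")
CSVFile.write(path, [{"Name" => "John", "Age" => "20"}])
round_trip = CSVFile.read(path) # the first row is the header row
File.delete(path)
```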
