Documentation Needed: Explore an xml file with xml2 #282

@CerebralMastication

I hardly ever use XML files, but I'm no stranger to R or nested hierarchies (like R's lists). Today I ended up with a big XML file in my lap that I wanted to explore, and I thought, "Oh, I should try xml2 for understanding this!"

Um... I struggled with this for a few hours and found the package and its documentation completely impenetrable because I don't speak the XML jargon. I'm confident that {xml2} is super useful for a user who understands XML conceptually and has used other XML tools. That's not me.

So here's my proposal: a vignette on exploring an arbitrary XML file using {xml2}, with examples of how to tell what's in the file, how to pull elements out, how to put those elements into a data frame, etc. A brief introduction to XML extraction for folks who are used to dealing with data frames, if you will.
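
A minimal sketch of the kind of walkthrough such a vignette could contain. It uses a tiny made-up document: the element names (`artists`, `artist`, `members`) are assumptions for illustration and are not taken from the Discogs dump. It looks at what the file contains, pulls elements out with XPath, and builds a data frame with one row per element.

```r
library(xml2)

# A tiny made-up document; the element names are illustrative only.
doc <- read_xml(
  '<artists>
     <artist id="1">
       <name>Alpha</name>
       <members><name>Ann</name><name>Bob</name></members>
     </artist>
     <artist id="2"><name>Beta</name></artist>
   </artists>'
)

# What is in the file?
xml_name(doc)       # name of the root element
xml_length(doc)     # number of child elements directly under the root
xml_structure(doc)  # prints the nesting of element names

# Pull elements out: every <artist> directly under the root
artists <- xml_find_all(doc, "/artists/artist")

# Put them in a data frame, one row per <artist>
df <- data.frame(
  id   = xml_attr(artists, "id"),                       # attribute value
  name = xml_text(xml_find_first(artists, "./name")),   # first <name> child
  # count nested <members>/<name> elements; 0 when <members> is absent
  n_members = vapply(
    artists,
    function(a) length(xml_find_all(a, "./members/name")),
    integer(1)
  ),
  stringsAsFactors = FALSE
)
df
```

`xml_find_first()` keeps one result per input node, so the columns stay aligned even when an element is missing. `read_xml()` parses the whole document into memory, so for a very large file it is probably easier to try these steps on a small extract first.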

FWIW, the data I was wrestling with was this 1.5 GB XML file of music artists from discogs.com: https://discogs-data.s3-us-west-2.amazonaws.com/data/2019/discogs_20191201_artists.xml.gz
