Description
I hardly ever use XML files, but I'm no stranger to R or to nested hierarchies (like R's lists). Today a big XML file landed in my lap and I wanted to explore it. I thought, "Oh, I should try xml2 for understanding this!"
Um... I struggled with this for a few hours and found both the package and its documentation completely impenetrable, because I don't speak the XML jargon. I'm really confident that {xml2} is super useful for a user who understands XML conceptually and has used other XML tools. That's not me.
So here's my proposal: a vignette on exploring an arbitrary XML file using {xml2}, with examples of how to tell what's in the file, how to pull elements out, how to pull elements out and put them in a data frame, etc. A brief introduction to XML extraction for folks who are used to dealing with data frames, if you will.
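To make the request concrete, here is a rough sketch of the kind of walkthrough I have in mind. It assumes a tiny made-up document: the element names (`artists`, `artist`, `name`, `profile`) and the `artist_df` variable are hypothetical and not taken from the real discogs schema. It only uses {xml2} functions I could find in the reference (`read_xml()`, `xml_name()`, `xml_structure()`, `xml_find_all()`, `xml_find_first()`, `xml_attr()`, `xml_text()`), and it may well not be the recommended way to do this.

```r
library(xml2)

# Hypothetical toy document; element names are invented for illustration.
doc <- read_xml(paste0(
  "<artists>",
  "<artist id='1'><name>Alpha</name><profile>First artist</profile></artist>",
  "<artist id='2'><name>Beta</name></artist>",
  "</artists>"
))

# What's in the file? Check the root element, then print the tree of names.
xml_name(doc)
xml_structure(doc)

# Pull elements out: every <artist> node, found with an XPath expression.
artists <- xml_find_all(doc, ".//artist")

# Put them in a data frame, one row per <artist>.
# xml_find_first() yields a missing node when a child element is absent,
# and xml_text() returns NA for it, so the columns stay the same length.
artist_df <- data.frame(
  id      = xml_attr(artists, "id"),
  name    = xml_text(xml_find_first(artists, "./name")),
  profile = xml_text(xml_find_first(artists, "./profile")),
  stringsAsFactors = FALSE
)

print(artist_df)
```

The point of using `xml_find_first()` per column, rather than `xml_find_all()` over the whole document, is that it keeps one value per row even when some records lack a field.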
FWIW, the data I was wrestling with was this 1.5 GB XML file of music artists from discogs.com: https://discogs-data.s3-us-west-2.amazonaws.com/data/2019/discogs_20191201_artists.xml.gz