Song Scraper 🎶🎵🎸🎹📄

A chord chart scraper for Ultimate Guitar that creates a new Google Doc and applies formatting.

NodeJS Puppeteer Google Drive Google Cloud

GitHub license made-with-javascript

Features

  • Accepts an Ultimate Guitar URL, scrapes the song data, creates a new Google Docs document from a template, named after the song, and inserts the formatted text into it.
  • Removes '[ ]' brackets from section titles.
  • Recognizes various musical chords and associated notation.
  • Filters and updates formatting (bold/unbold) for section titles, chords, lyrics, and more.
  • Automatically renames the document as: 'song name - artist'.
  • Automatically opens and closes the browser.
  • Works with a variety of song formats from Ultimate Guitar, removes comments, and splits the text to fit into two columns.
  • Lets you choose the song key/capo. This works, but you have to know ahead of time how many steps up or down to transpose.
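
The text-processing features above (stripping brackets from section titles, recognizing chord lines, and shifting the key by a known number of steps) can be sketched in plain JavaScript. This is a minimal illustration, not the project's actual code: the function names, the chord regex, and the sharp-only note table are all assumptions, and the regex covers only common chord spellings.

```javascript
// Hypothetical helpers; names and regexes are illustrative, not taken from the project.

// Chromatic scale spelled with sharps; other spellings are normalized first.
const NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];
const NORMALIZE = {
  Db: 'C#', Eb: 'D#', Gb: 'F#', Ab: 'G#', Bb: 'A#',
  Cb: 'B', Fb: 'E', 'E#': 'F', 'B#': 'C',
};

// Matches one chord token such as C, F#m7, Bbmaj7, Dsus4, or G/B.
const CHORD_RE = /^[A-G][#b]?(?:maj|min|m|dim|aug|sus|add|\d)*(?:\/[A-G][#b]?)?$/;

// "[Verse 1]" -> "Verse 1"; returns null when the line is not a section title.
function parseSectionTitle(line) {
  const match = line.trim().match(/^\[(.+)\]$/);
  return match ? match[1].trim() : null;
}

// A line counts as a chord line when every whitespace-separated token is a chord.
function isChordLine(line) {
  const tokens = line.trim().split(/\s+/).filter(Boolean);
  return tokens.length > 0 && tokens.every((token) => CHORD_RE.test(token));
}

// Shift one note name by a number of semitones (negative values move down).
function transposeNote(note, steps) {
  const index = NOTES.indexOf(NORMALIZE[note] || note);
  return NOTES[(((index + steps) % 12) + 12) % 12];
}

// Transpose every root and bass note in a chord line, leaving suffixes intact.
// Only call this on lines for which isChordLine() returned true.
function transposeChordLine(line, steps) {
  return line.replace(/\S+/g, (token) =>
    token.replace(/[A-G][#b]?/g, (note) => transposeNote(note, steps))
  );
}
```

Transposed chords are always spelled with sharps in this sketch, and a chord that gains or loses a character (for example A becoming A#) shifts the spacing of the rest of the line, so chords positioned above specific lyrics may need realignment.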

Room for Improvement

  • Opening the newly created document in another browser tab/window after completion.
  • A proxy for Puppeteer, so that each scrape appears to come from a different IP address.
  • Adding human-like actions such as clicking, moving the mouse randomly, selecting, etc. to appear as a normal user.
  • Logic to recognize repeating chord patterns within sections, delete duplicates, and move the chord progression next to the section title.
  • Automatically export to PDF and download after doc creation and formatting completes.
  • Deploy to a live URL; not necessary, but might be useful in the future.
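
For the repeating-chord-pattern idea above, one possible starting point is to look for the shortest chord sequence that, repeated, reproduces the whole section. The sketch below is only an assumption about how this could work; `findRepeatingPattern` is a hypothetical name, and the function only handles sections that consist of exact, whole repetitions.

```javascript
// Hypothetical helper: returns the shortest chord pattern that, repeated,
// reproduces the whole sequence. If nothing repeats, it returns a copy of
// the input sequence.
function findRepeatingPattern(chords) {
  for (let size = 1; size <= chords.length / 2; size++) {
    // The pattern length must divide the sequence length evenly.
    if (chords.length % size !== 0) continue;
    const repeats = chords.every((chord, i) => chord === chords[i % size]);
    if (repeats) return chords.slice(0, size);
  }
  return chords.slice();
}
```

For a section whose chords are Am F C G played twice, this returns the four-chord progression once, which could then be placed next to the section title.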

Tech/frameworks used

  • @google-cloud/local-auth
  • @googleapis/docs
  • @types/node
  • axios
  • cheerio
  • googleapis
  • node-fetch
  • puppeteer
  • request
  • request-promise
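
As an illustration of how the Google Docs packages listed above relate to the bold/unbold formatting feature, the sketch below builds the request objects that the Docs API `documents.batchUpdate` method accepts (`insertText` and `updateTextStyle`). It is a hedged example, not the project's code: `buildDocRequests`, the `{ text, bold }` line shape, and the default insertion index of 1 are assumptions, and a document created from a template may need a different starting index.

```javascript
// Hypothetical helper: builds Google Docs API batchUpdate requests for a song.
// `lines` is an array of { text, bold } objects. `startIndex` is where the text
// is inserted (1 is the start of an empty document body; a template may differ).
function buildDocRequests(lines, startIndex = 1) {
  // Insert all text in one request, one line per paragraph.
  const fullText = lines.map((line) => line.text).join('\n') + '\n';
  const requests = [
    { insertText: { location: { index: startIndex }, text: fullText } },
  ];

  // Then set or clear bold on each non-empty line by its character range.
  let cursor = startIndex;
  for (const line of lines) {
    const end = cursor + line.text.length;
    if (line.text.length > 0) {
      requests.push({
        updateTextStyle: {
          range: { startIndex: cursor, endIndex: end },
          textStyle: { bold: Boolean(line.bold) },
          fields: 'bold',
        },
      });
    }
    cursor = end + 1; // skip the newline that follows each line
  }
  return requests;
}
```

With an authenticated Docs client, the resulting array would be sent as `docs.documents.batchUpdate({ documentId, requestBody: { requests } })`.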

Motivation

Built to automate a manual process: copying and pasting chord charts from Ultimate Guitar and manually formatting them as Google Docs song charts.

Screenshots

[Screenshots not reproduced here; image alt text: "je", "Middle", "ssje"]

License

MIT © Konjo Tech - Wesley Scholl