This repository shows how to use tools such as Selenium and Puppeteer to crawl data from websites.
The Puppeteer directory contains three files:
- get_links.js: gets the links to the different product categories on Amazon.
- scrape.js: scrapes data from one of those links.
- cluster.js: runs multiple pages at a time to get all products in the categories.
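
To illustrate the idea of running multiple pages at a time, here is a minimal sketch of a bounded-concurrency runner in plain JavaScript. It is not the code in cluster.js: the name `runWithConcurrency` and the `worker` callback are hypothetical, and the worker stands in for whatever per-link scraping logic is used.

```javascript
// Hypothetical helper (not taken from this repository).
// Runs `worker` over `items`, keeping at most `limit` calls in flight.
// Results are returned in the same order as `items`.
async function runWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each lane keeps pulling the next unclaimed item until none are left.
  async function lane() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(items[i]) };
      } catch (err) {
        // Record the failure and continue, so one bad link does not stop the run.
        results[i] = { ok: false, error: String(err) };
      }
    }
  }

  // Start at least one lane, and never more lanes than there are items.
  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, lane));
  return results;
}
```

In a Puppeteer script, the worker would typically open a page with `browser.newPage()`, load the link with `page.goto(url)`, extract the data, and close the page. Capping the number of lanes keeps the number of open pages, and therefore memory use, bounded.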