The crawler takes two parameters on start:
- URL to start the web crawler from
- Configuration file (optional)
Run "make run http://example.com" to start the web crawler.
Run "make build" to build and run the artifact from the ./build directory.
The default configuration file is located at ./configs/config.yaml:
parallelism: 10 # number of parallel requests
acceptable_mime_types: # acceptable mime types on resolving response
- text/html
- application/json
- application/xml
- text/css
database_file: ./crawler.db # path for the database file
api_addr: localhost:8080 # address for the API
The statistics API is available at http://localhost:8080/ (just a few counters which are barely useful).
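To illustrate what the acceptable_mime_types setting in the configuration above is for, here is a minimal Python sketch of how a response's Content-Type header could be checked against that list. It is not the crawler's actual code: the function names media_type and is_acceptable are hypothetical, and the exact matching rules the crawler uses (for example, how it treats parameters such as charset) are an assumption.

```python
# Hypothetical sketch, not taken from the crawler's source.
# The list mirrors acceptable_mime_types from the default config.
ACCEPTABLE_MIME_TYPES = [
    "text/html",
    "application/json",
    "application/xml",
    "text/css",
]


def media_type(content_type: str) -> str:
    """Return the bare media type from a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    return content_type.split(";", 1)[0].strip().lower()


def is_acceptable(content_type: str, allowed=ACCEPTABLE_MIME_TYPES) -> bool:
    """Return True if the response's media type is in the allowed list."""
    return media_type(content_type) in allowed


if __name__ == "__main__":
    print(is_acceptable("text/html; charset=utf-8"))  # True
    print(is_acceptable("image/png"))                 # False
```

Under this assumption, a response served as "text/html; charset=utf-8" would be processed, while an image or binary download would be skipped.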
Run "make test" to run the tests (TWO tests).