cc.py

Extracting URLs of a specific target based on the results of "commoncrawl.org".
Updated to v.0.2 | Whats new:

65% faster proceeding
Specify a year via -y/--year, e.g.: -y 18
Specify an output file via -o/--out, e.g.: -o whatever.txt

ToDo

Implementation of multithreading
Allowing a range of years as input
Implementing direct-grep
Temporary file-writing

Usage

cc.py [-h] [-y YEAR] [-o OUT] domain

positional arguments:
  domain                domain which will be crawled for

optional arguments:
  -h, --help            show this help message and exit
  -y YEAR, --year YEAR  limit the result to a specific year (default: all)
  -o OUT, --out OUT     specify an output file (default: domain.txt)

Example

python3 cc.py github.com -y 18 -o github_18.txt
cat github_18.txt | grep user

Dependencies

Python3

Name	Name	Last commit message	Last commit date
Latest commit dienuet Update cc.py Sep 12, 2018 1c0e93f · Sep 12, 2018 History 7 Commits
LICENSE	LICENSE	Initial commit	Jul 13, 2018
README.md	README.md	fixed typo	Jul 16, 2018
cc.py	cc.py	Update cc.py	Sep 12, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

cc.py

About

Releases

Packages

Languages

License

dienuet/cc.py

Folders and files

Latest commit

History

Repository files navigation

cc.py

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages