
swarnimbandekar/ArchiveDownloader


Web Archive Downloader

A simple Python CLI tool to download files from a list of URLs (e.g., from the Wayback Machine or web archives). Files are saved directly into a specified output folder without creating extra subdirectories.


📦 Features

  • Download files from URLs in a .txt file
  • Save all files in a single output directory
  • Automatically handles missing filenames
  • Basic error handling with clean output

🚀 Installation

  1. Clone this repository or download the script:
    git clone https://github.com/swarnimbandekar/ArchiveDownloader.git
    cd ArchiveDownloader
  2. Install the required dependencies:
    pip install -r requirements.txt

🧪 Usage

python3 archivedownloader.py -l urls.txt -o output/

Arguments:

  • -l, --list: Path to a text file containing URLs (one per line)
  • -o, --output: Directory where downloaded files will be saved
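For reference, the two arguments above could be declared with `argparse` roughly as follows. This is a sketch under assumptions: the helper name `build_parser`, the help strings, and marking both options as required are illustrative and may differ from the real script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented -l/--list and -o/--output options."""
    parser = argparse.ArgumentParser(
        prog="archivedownloader.py",
        description="Download files from a list of URLs.",
    )
    # Assumption: both options are mandatory; the README does not say.
    parser.add_argument("-l", "--list", required=True,
                        help="text file containing URLs, one per line")
    parser.add_argument("-o", "--output", required=True,
                        help="directory where downloaded files are saved")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-l", "urls.txt", "-o", "output/"])
print(args.list, args.output)
```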

Example: If urls.txt contains:

   https://web.archive.org/web/20220101000000/https://example.com/file1.pdf
   https://web.archive.org/web/20220101000000/https://example.com/image.jpg

Both files are saved directly in the output/ folder; no subdirectories are created.
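The overall flow (read the list, fetch each URL, write it into one folder, report failures) might look like the sketch below. It is not the script's implementation: it uses the standard library's `urllib.request` so that it runs without extra dependencies, whereas the real tool may rely on whatever `requirements.txt` installs. The function name `download_all`, the status messages, and the 30-second timeout are assumptions.

```python
import os
import urllib.error
import urllib.request
from urllib.parse import urlparse


def download_all(list_path: str, output_dir: str) -> tuple[int, int]:
    """Download each URL listed in `list_path` into `output_dir`.

    Returns a (succeeded, failed) count. Hypothetical helper for illustration.
    """
    os.makedirs(output_dir, exist_ok=True)
    with open(list_path, encoding="utf-8") as fh:
        # One URL per line; blank lines are skipped.
        urls = [line.strip() for line in fh if line.strip()]

    ok = failed = 0
    for index, url in enumerate(urls, start=1):
        # Keep only the last path segment so everything stays in one folder.
        name = os.path.basename(urlparse(url).path) or f"download_{index}"
        target = os.path.join(output_dir, name)
        try:
            with urllib.request.urlopen(url, timeout=30) as response, \
                    open(target, "wb") as out:
                out.write(response.read())
            ok += 1
            print(f"[ok] {url} -> {target}")
        except (urllib.error.URLError, OSError) as exc:
            # A failed URL is reported and the loop moves on to the next one.
            failed += 1
            print(f"[failed] {url}: {exc}")
    return ok, failed
```

Calling `download_all("urls.txt", "output/")` would then correspond to the command shown in the Usage section.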

