Download every README.md from any GitHub user's repositories — including nested READMEs inside monorepos — with a single command.
You want to read, analyze, or archive the documentation from all of a GitHub user's public repositories. Doing this manually means opening every repo, navigating into every subdirectory, and copying files one by one. For users with dozens of repos — or repos with monorepo structures containing multiple READMEs — this is impractical.
This tool automates the entire process: one command, one username, everything downloaded and organized locally.
# 1. Clone and install
git clone https://github.com/johnlester-0369/github-readme-scraper.git
cd github-readme-scraper
npm install
# 2. Run
node index.js johnlester-0369
That's it. README files appear in ./downloads/johnlester-0369/.
- Node.js v14 or later
- Internet access to reach api.github.com and raw.githubusercontent.com
- No GitHub account or token required for public repositories
Files are saved under ./downloads/ and preserve the exact directory structure from each repository:
downloads/
└── {username}/
    └── {repo-name}/
        ├── README.md              ← root-level README
        ├── packages/
        │   └── server/
        │       └── README.md      ← monorepo package README
        └── docs/
            └── README.md          ← nested docs README
The downloads/ directory is excluded from version control via .gitignore.
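The mapping from a path inside a repository to a path on disk can be sketched as follows. This is a minimal illustration, not the tool's actual code: the helper name localReadmePath and its baseDir parameter are hypothetical, and the sketch assumes repository paths use forward slashes, as GitHub tree paths do.

```javascript
const path = require('path');

// Hypothetical helper (not taken from the tool's source): maps a README's
// path inside a repository to its destination under ./downloads/, keeping
// the original directory layout.
function localReadmePath(username, repo, repoPath, baseDir = 'downloads') {
  // Split on "/" so path.join can apply the current platform's separator.
  return path.join(baseDir, username, repo, ...repoPath.split('/'));
}

// Example with placeholder names: a package README inside a monorepo.
console.log(localReadmePath('octocat', 'hello-world', 'packages/server/README.md'));
```

Before writing a file to such a path, the parent directories have to exist; fs.mkdirSync with the recursive option is one way to create them.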
The scraper runs five operations in sequence for each repository:
┌─────────────────────────────────────┐
│      node index.js <username>       │
└──────────────────┬──────────────────┘
                   │
                   ▼
           ┌───────────────┐
           │    main()     │
           └───────┬───────┘
                   │
                   ▼
   ┌───────────────────────────────┐
   │       fetchUserRepos()        │──► GitHub API
   └───────────────┬───────────────┘
                   │ repos[]
                   ▼
   ┌───────────────────────────────┐
   │        fetchRepoTree()        │──► GitHub API
   └───────────────┬───────────────┘
                   │ readme paths[]
                   ▼
   ┌───────────────────────────────┐
   │       downloadReadme()        │──► raw.githubusercontent.com
   └───────────────┬───────────────┘
                   │
                   ▼
   ┌───────────────────────────────┐
   │     ensureDownloadPath()      │──► ./downloads/{username}/
   └───────────────────────────────┘
- Fetch repository list: calls /users/{username}/repos with per_page=100, sorted by last updated
- Get complete file tree: calls /repos/{username}/{repo}/git/trees/{branch}?recursive=1 to retrieve every file path in the repo in a single request
- Filter for READMEs: selects all blob entries whose path ends with readme.md (case-insensitive), catching README.md, Readme.md, README.MD, and any other casing variation
- Download raw content: constructs a raw.githubusercontent.com URL for each README and writes the file to ./downloads/{username}/{repo}/{original-path}
- Branch fallback: if the main branch returns a 404, the tool automatically retries with master, both for the tree fetch and the raw file download
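The tree request, README filter, and raw URL construction described above can be sketched as three small pure functions. This is an illustrative sketch rather than the tool's actual source: the names BRANCHES, treeUrl, findReadmePaths, and rawUrl are hypothetical, and the sample tree is made-up data shaped like the tree array in a GitHub tree response.

```javascript
// Branches to try in order, mirroring the main-then-master fallback.
const BRANCHES = ['main', 'master'];

// URL of the recursive tree endpoint for one branch of a repository.
function treeUrl(username, repo, branch) {
  return `https://api.github.com/repos/${username}/${repo}/git/trees/${branch}?recursive=1`;
}

// Keep only blobs (files, not directories) whose path ends with "readme.md"
// in any casing. A plain suffix match also accepts names such as
// "old-readme.md"; whether the tool does the same is an assumption here.
function findReadmePaths(tree) {
  return tree
    .filter((entry) => entry.type === 'blob' && entry.path.toLowerCase().endsWith('readme.md'))
    .map((entry) => entry.path);
}

// Raw-content URL for one file; each path segment is percent-encoded.
function rawUrl(username, repo, branch, filePath) {
  const encoded = filePath.split('/').map(encodeURIComponent).join('/');
  return `https://raw.githubusercontent.com/${username}/${repo}/${branch}/${encoded}`;
}

// Made-up entries in the shape of the "tree" array of a tree response.
const sampleTree = [
  { path: 'README.md', type: 'blob' },
  { path: 'docs', type: 'tree' },
  { path: 'docs/Readme.md', type: 'blob' },
  { path: 'packages/server/README.MD', type: 'blob' },
  { path: 'src/index.js', type: 'blob' },
];

for (const readme of findReadmePaths(sampleTree)) {
  console.log(rawUrl('octocat', 'hello-world', BRANCHES[0], readme));
}
```

A caller would request treeUrl for each entry of BRANCHES in turn and stop at the first response that is not a 404, then reuse that branch name when building the raw URLs.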
| Topic | Detail |
|---|---|
| Rate limiting | Unauthenticated GitHub API requests are capped at 60 per hour. Each repository requires at least one API call for the file tree, on top of the one call for the repository list, so accounts with about 60 or more repositories will exhaust the limit in a single run. |
| Repositories per fetch | The tool fetches up to 100 repositories per user in a single request. Users with more than 100 repos will have only their 100 most recently updated repos scraped. |
| Tree size | GitHub's API limits recursive tree responses to approximately 100,000 entries per repository. Extremely large monorepos may return a truncated tree. |
| Private repositories | No authentication is configured. Only public repositories are accessible. |
| Branch support | Only main and master branches are attempted. Repositories using other default branch names will be skipped with an error logged to the console. |
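The rate-limit and tree-size rows above lend themselves to a quick back-of-the-envelope check. The sketch below is an assumption-laden illustration, not part of the tool: the function names are hypothetical, and it assumes one API call for the repository list, one tree call per repository, and one extra call for each repository that falls back from main to master.

```javascript
// Requests per hour for unauthenticated clients, as stated in the table.
const UNAUTHENTICATED_LIMIT = 60;

// Estimated API calls for one run: 1 repository-list call, 1 tree call per
// repository, plus 1 retry for each repository that falls back to master.
// Assumes failed (404) requests also count against the limit.
function estimateApiCalls(repoCount, masterFallbacks = 0) {
  return 1 + repoCount + masterFallbacks;
}

// True when the estimated run stays within the hourly budget.
function fitsInBudget(repoCount, masterFallbacks = 0, limit = UNAUTHENTICATED_LIMIT) {
  return estimateApiCalls(repoCount, masterFallbacks) <= limit;
}

// Tree responses carry a boolean "truncated" field that is set when the
// repository has more entries than the API returns in one response.
function isTreeTruncated(treeResponse) {
  return treeResponse.truncated === true;
}

console.log(estimateApiCalls(59));      // 60
console.log(fitsInBudget(59));          // true
console.log(fitsInBudget(60));          // false
console.log(fitsInBudget(50, 12));      // false
console.log(isTreeTruncated({ tree: [], truncated: true })); // true
```

Under these assumptions, 59 repositories on the main branch is the most a single unauthenticated run can cover within one hour.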
Open an issue or pull request on GitHub.
Licensed under the ISC License.