🔍 GitHub Repository Analyzer

Analyze any GitHub repository and generate a comprehensive AI-ready report in seconds.

One command. Full codebase. Ready for AI.

Stop manually copying files to share with AI. This tool scans any GitHub repository and creates a single, well-structured text file containing the entire codebase — perfectly formatted for ChatGPT, Claude, Gemini, and other AI assistants.

بطّل تنسخ الملفات واحد واحد. الأداة دي بتحلل أي مستودع على GitHub وبتطلعلك ملف واحد فيه كل الأكواد — جاهز تبعته للذكاء الاصطناعي مباشرة.

🇬🇧 English Documentation

⚡ Quick Start

One command — works in Google Colab, Kaggle, or any terminal:

!git clone https://github.com/YousefAutomates/repo-analyzer.git && cd repo-analyzer && pip install -q -r requirements.txt && python run.py

That's it. The tool will guide you through the rest interactively.

🎯 What Does This Tool Do?

This tool scans any GitHub repository and generates a single comprehensive TXT file containing:

Section	What's Included
📋 Repository Info	Name, description, stars, forks, license, topics, visibility
🌳 Project Structure	Complete visual directory tree
📊 Statistics	Total lines of code, file sizes, language breakdown
📦 Dependencies	Parsed from requirements.txt, package.json, go.mod, etc.
📄 ALL Source Code	Full content of every text file with syntax highlighting
👥 Contributors	Who contributed and how many commits
📝 Commit History	Latest changes with authors and dates
🤖 AI Token Estimate	Shows compatibility with GPT-4, Claude, Gemini context windows
📌 AI Prompt Template	Ready-to-use instructions for your AI request

💡 Why Use This?

The Problem

When you want AI to understand, modify, or build upon an existing project, you need to share the complete codebase. Manually copying dozens of files is:

⏰ Time-consuming — especially for large projects
😵 Error-prone — you might miss important files
🔀 Unstructured — AI loses context without proper organization
📏 Uncertain — you don't know if it fits the AI's context window

The Solution

This tool does everything automatically in seconds:

1. 🔍 Run the analyzer on any GitHub repo
2. 📄 Get a single, perfectly structured TXT file
3. 📋 Copy and paste into your AI assistant
4. 💬 Add your request (modify, explain, fix, improve)
5. ✨ Get AI-powered results!

🌟 Key Features

🔐 Security

Feature	Description
🔑 Secure Token Input	Token is hidden when you type (uses `getpass`)
🌍 Environment Variables	Supports `GITHUB_TOKEN` env variable
🧹 Memory Cleanup	Token is cleared from memory after use
✅ Permission Check	Validates token scopes before scanning

🛡️ Reliability

Feature	Description
🔄 Auto Retry	Automatically retries failed API requests (3 attempts)
⏳ Rate Limit Handling	Detects rate limits and waits automatically
🌐 Connection Recovery	Handles network timeouts gracefully
⚠️ Error Messages	Clear, helpful error messages for every situation

📊 Smart Analysis

Feature	Description
🧮 Token Estimation	Estimates token count for AI context windows
📦 Dependency Parsing	Analyzes requirements.txt, package.json, go.mod, etc.
🚫 Smart Filtering	Automatically skips node_modules, pycache, .git, etc.
📏 Size Limits	Configurable max file size (default 500KB)
🏷️ File Categorization	Groups files by language/type

🖥️ User Experience

Feature	Description
⚡ Smart Defaults	Optimal settings out of the box — just press Enter
📂 3 Modes	Public repo, private repo, or browse your repos
🌿 Branch Selection	Analyze any branch, not just main
🔍 Repo Browser	Search, filter, and paginate through your repos
⬇️ Auto Download	File downloads automatically in Colab/Kaggle

🤖 AI Compatibility

AI Model	Context Window	Status
GPT-3.5 Turbo	16K tokens	✅ Small projects
GPT-4 / GPT-4o	128K tokens	✅ Most projects
Claude 3.5 Sonnet	200K tokens	✅ Large projects
Gemini 1.5/2.0	1M tokens	✅ Very large projects

📖 Usage Guide

Mode 1: Analyze a Public Repository (No Token)

Choice: 1
Repository: facebook/react

That's it! No authentication needed for public repos.

Mode 2: Analyze a Private Repository

Choice: 2
Token: ghp_xxxxxxxxxxxx (hidden input)
Repository: mycompany/private-api

Mode 3: Browse & Select Your Repos

Choice: 3
Token: ghp_xxxxxxxxxxxx (hidden input)
→ Shows your repos with search, filter, and pagination
→ Select by number

🔑 Creating a GitHub Token

Go to github.com/settings/tokens
Click "Generate new token (classic)"
Give it a name (e.g., "repo-analyzer")
Select scopes:
- ✅ repo — Full control (required for private repos)
- ✅ read:user — Read user info (optional)
Click "Generate token"
Copy the token immediately (you won't see it again!)

💡 Tip: For public repos only, you don't need a token at all!

💡 Tip: You can set the GITHUB_TOKEN environment variable to skip the input prompt.

🤖 Using the Report with AI

Step 1: Generate the Report

Run the tool and get your TXT file.

Step 2: Send to AI

Open the TXT file, copy ALL content, and paste it into your AI assistant.

Step 3: Add Your Request

After pasting, add your specific request:

I've shared my complete project above. Please:

1. Explain how the authentication flow works
2. Find any security vulnerabilities
3. Add rate limiting to the API endpoints
4. Write unit tests for the user module
5. Suggest performance improvements

Please provide the modified code for each change.

💡 Creative Use Cases

Use Case	Description
🔍 Code Review	Ask AI to review the entire codebase for bugs and improvements
📝 Documentation	Generate comprehensive docs from the code
🔒 Security Audit	Find vulnerabilities and security issues
🧪 Test Generation	Auto-generate unit tests for all modules
🔄 Language Conversion	Convert a Python project to JavaScript (or vice versa)
📚 Learning	Understand how an open-source project works
👋 Onboarding	Help new developers understand the codebase quickly
🏗️ Architecture Review	Get AI feedback on project structure and design patterns
📊 Comparison	Generate reports for 2 projects and ask AI to compare them
♻️ Refactoring	Get suggestions for cleaner, more maintainable code

📁 Project Structure

repo-analyzer/
├── run.py                      # Entry point with dependency management
├── requirements.txt            # Python dependencies
├── LICENSE                     # MIT License
├── README.md                   # This file
├── .gitignore                  # Git ignore rules
└── analyzer/                   # Main package
    ├── __init__.py             # Package init (version info)
    ├── github_client.py        # GitHub API client with retry & rate limiting
    ├── repo_scanner.py         # Repository scanning & file analysis
    ├── report_generator.py     # AI-ready report generation
    ├── file_exporter.py        # TXT export with auto-download
    └── main.py                 # Interactive CLI application

⚙️ Configuration

Default Settings (Optimal)

Setting	Default	Description
Output Format	TXT	Best for AI consumption
Max File Size	500 KB	Skip files larger than this
Max Files	500	Limit total files to scan
Include Contents	Yes	Include full source code
Include Stats	Yes	Include statistics section
Include Commits	Yes	Include commit history
Include Contributors	Yes	Include contributors list
Auto Download	Yes	Download file in Colab/Kaggle

Skipped Directories (Automatic)

These directories are automatically excluded from scanning:

node_modules/    __pycache__/    .git/           vendor/
venv/            dist/           build/          .idea/
.vscode/         coverage/       .tox/           .terraform/

Environment Variables

Variable	Description
`GITHUB_TOKEN`	GitHub personal access token (skips input prompt)

🛠️ Technical Details

Architecture

┌─────────────────────────────────────────────────┐
│                  RepoAnalyzerApp                │
│                   (main.py)                     │
│            Interactive CLI Interface            │
├─────────┬──────────┬─────────────┬──────────────┤
│ GitHub  │   Repo   │   Report    │    File      │
│ Client  │ Scanner  │ Generator   │  Exporter    │
│         │          │             │              │
│ • Auth  │ • Tree   │ • Header    │ • TXT Export │
│ • Retry │ • Filter │ • Stats     │ • Auto-DL    │
│ • Rate  │ • Scan   │ • Code      │ • Colab      │
│ • API   │ • Stats  │ • Tokens    │ • Kaggle     │
└─────────┴──────────┴─────────────┴──────────────┘

API Rate Limits

Authentication	Rate Limit
No token	60 requests/hour
With token	5,000 requests/hour

The tool automatically handles rate limiting. If you hit the limit, it will wait and retry.

Dependencies

Package	Purpose
`requests`	HTTP requests to GitHub API
`rich`	Enhanced terminal output (optional)

❓ FAQ

Can I analyze private repositories?

Yes! Use Mode 2 or Mode 3 with a GitHub token that has repo scope.

What if the repo is too large for AI?

The tool shows token estimates and compatibility with each AI model. For very large repos, consider:

Reducing max file size
The tool automatically skips binary files and common junk directories
Use a model with a larger context window (Gemini 1.5 Pro supports 1M tokens)

Is my token safe?

Yes. The token is:

Hidden during input (not displayed on screen)
Cleared from memory after the scan completes
Never stored to disk or logged
You can also use the GITHUB_TOKEN environment variable

What file formats are supported?

The tool generates TXT files, which are the most compatible format for AI assistants. The report includes markdown-style code blocks with syntax highlighting hints.

Can I use this on my local machine?

Yes! Clone the repo and run python run.py in your terminal. It works on Windows, macOS, and Linux.

Does it work with GitLab or Bitbucket?

Currently, only GitHub repositories are supported. GitLab and Bitbucket support is planned for future versions.

🗺️ Roadmap

🤝 Contributing

Contributions are welcome! Feel free to:

🍴 Fork this repository
🌿 Create a feature branch (git checkout -b feature/amazing-feature)
💾 Commit your changes (git commit -m 'Add amazing feature')
📤 Push to the branch (git push origin feature/amazing-feature)
🔄 Open a Pull Request

📄 License

This project is licensed under the MIT License — see the LICENSE file for details.

🇸🇦 التوثيق بالعربي

⚡ البداية السريعة

أمر واحد — يشتغل في Google Colab أو Kaggle أو أي Terminal:

!git clone https://github.com/YousefAutomates/repo-analyzer.git && cd repo-analyzer && pip install -q -r requirements.txt && python run.py

وبس كده! الأداة هتوجهك خطوة بخطوة.

🎯 إيه اللي بتعمله الأداة دي؟

الأداة بتعمل تحليل شامل لأي مستودع على GitHub وبتطلع ملف TXT واحد فيه كل حاجة:

القسم	المحتوى
📋 معلومات المستودع	الاسم والوصف والنجوم والفوركات والرخصة والتوبيكات
🌳 هيكل المشروع	شجرة المجلدات والملفات كاملة
📊 إحصائيات	عدد أسطر الكود وأحجام الملفات واللغات المستخدمة
📦 المكتبات	تحليل requirements.txt و package.json وغيرهم
📄 كل الأكواد	محتوى كل ملف كود بالكامل مع تنسيق syntax
👥 المساهمين	مين ساهم وبكام commit
📝 سجل التعديلات	آخر التعديلات بالمؤلف والتاريخ
🤖 تقدير التوكنات	هل الملف هيدخل في GPT-4 ولا Claude ولا Gemini
📌 قالب للـ AI	تعليمات جاهزة تكتب فيها طلبك

💡 ليه تستخدم الأداة دي؟

المشكلة

لما تحب تطلب من الذكاء الاصطناعي يفهم أو يعدّل مشروع، محتاج تبعتله كل الأكواد. نسخ الملفات واحد واحد:

⏰ بياخد وقت كتير — خصوصاً المشاريع الكبيرة
😵 ممكن تنسى ملفات — وتضيع السياق
🔀 مش منظم — الـ AI مش هيفهم بدون تنظيم
📏 مش متأكد — هل الكود هيدخل في context الـ AI ولا لأ

الحل

الأداة دي بتعمل كل ده أوتوماتيك في ثواني:

1. 🔍 شغّل الأداة على أي ريبو
2. 📄 هتاخد ملف TXT واحد منظم
3. 📋 انسخ المحتوى في ChatGPT / Claude / Gemini
4. 💬 اكتب طلبك (عدّل، اشرح، صلّح، حسّن)
5. ✨ خد النتيجة!

🌟 المميزات الرئيسية

🔐 الأمان

🔑 التوكن مخفي — مش بيظهر وأنت بتكتبه
🌍 Environment Variable — ممكن تحط التوكن في متغير بيئة
🧹 تنظيف الذاكرة — التوكن بيتمسح بعد الاستخدام

🛡️ الموثوقية

🔄 إعادة المحاولة — لو الاتصال فشل بيحاول 3 مرات
⏳ Rate Limit — بيستنى تلقائياً لو وصل للحد الأقصى
🌐 استقرار — بيتعامل مع مشاكل الشبكة بذكاء

📊 تحليل ذكي

🧮 تقدير التوكنات — بيحسبلك هل الملف هيدخل في الـ AI ولا لأ
📦 تحليل المكتبات — بيقرأ requirements.txt و package.json
🚫 فلترة ذكية — بيتجاهل node_modules و pycache تلقائياً

🖥️ سهولة الاستخدام

⚡ إعدادات مثالية — بس اضغط Enter وهو يشتغل
📂 3 أوضاع — ريبو عام أو خاص أو تصفح ريبوزاتك
⬇️ تحميل تلقائي — الملف بينزل أوتوماتيك في Colab

🔑 إنشاء توكن GitHub

روح على github.com/settings/tokens
اضغط "Generate new token (classic)"
سمّيه (مثلاً "repo-analyzer")
اختار الصلاحيات:
- ✅ repo — تحكم كامل (لازم للريبوز الخاصة)
اضغط "Generate token"
انسخ التوكن فوراً (مش هيظهرلك تاني!)

💡 للريبوز العامة مش محتاج توكن خالص!

🤖 الاستخدام مع الذكاء الاصطناعي

الخطوة 1: ولّد التقرير

شغّل الأداة وخد ملف الـ TXT.

الخطوة 2: ابعت للـ AI

افتح الملف وانسخ كل المحتوى والصقه في ChatGPT أو Claude أو Gemini.

الخطوة 3: اكتب طلبك

بعد ما تلصق المحتوى، اكتب اللي عاوزه:

ده مشروعي الكامل بكل الأكواد. عاوز:

1. اشرحلي إزاي الـ authentication بيشتغل
2. دوّر على أي مشاكل أمنية
3. أضف rate limiting للـ API
4. اكتبلي unit tests للـ user module
5. اقترح تحسينات للأداء

اديني الكود المعدّل لكل تغيير.

💡 استخدامات إبداعية

الاستخدام	الوصف
🔍 مراجعة الكود	الـ AI يراجعلك الكود كله ويلاقي الأخطاء
📝 توثيق تلقائي	يولّدلك documentation كاملة من الكود
🔒 فحص أمني	يدوّر على الثغرات الأمنية
🧪 كتابة Tests	يكتبلك unit tests لكل الموديولات
🔄 تحويل اللغة	يحوّلك المشروع من Python لـ JavaScript أو العكس
📚 تعلّم	يشرحلك إزاي مشروع open source بيشتغل
👋 Onboarding	يساعد المطورين الجدد يفهموا الكود بسرعة
🏗️ مراجعة معمارية	يديك رأيه في بنية المشروع
♻️ Refactoring	يقترح كود أنظف وأسهل في الصيانة

❓ أسئلة شائعة

هل يقدر يحلل ريبوز خاصة؟

أيوا! استخدم الوضع 2 أو 3 مع توكن GitHub عنده صلاحية repo.

لو الريبو كبير أوي؟

الأداة بتعرض تقدير التوكنات وتوافق كل AI. للريبوز الكبيرة جداً:

الأداة بتتجاهل الملفات الثنائية والمجلدات الغير مهمة تلقائياً
ممكن تقلل حجم الملف الأقصى
استخدم Gemini 1.5 Pro (بيدعم مليون توكن)

هل التوكن بتاعي آمن؟

أيوا! التوكن:

مش بيظهر على الشاشة وأنت بتكتبه
بيتمسح من الذاكرة بعد الاستخدام
مش بيتحفظ على الهارد ولا بيتسجل في أي log

بيشتغل على GitLab أو Bitbucket؟

حالياً بيدعم GitHub بس. دعم GitLab و Bitbucket مخطط في النسخ القادمة.

🗺️ خطة المستقبل

📄 الرخصة

المشروع مرخّص تحت رخصة MIT — شوف ملف LICENSE للتفاصيل.

👤 About the Author

Yousef Elsherbiny

Automation & AI Solutions

Built with ❤️ by YousefAutomates

⭐ If this tool saved you time, please star the repo! ⭐

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
analyzer		analyzer
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
run.py		run.py

Folders and files

Latest commit

History

Repository files navigation

🔍 GitHub Repository Analyzer

🇬🇧 English Documentation

⚡ Quick Start

🎯 What Does This Tool Do?

💡 Why Use This?

The Problem

The Solution

🌟 Key Features

🔐 Security

🛡️ Reliability

📊 Smart Analysis

🖥️ User Experience

🤖 AI Compatibility

📖 Usage Guide

Mode 1: Analyze a Public Repository (No Token)

Mode 2: Analyze a Private Repository

Mode 3: Browse & Select Your Repos

🔑 Creating a GitHub Token

🤖 Using the Report with AI

Step 1: Generate the Report

Step 2: Send to AI

Step 3: Add Your Request

💡 Creative Use Cases

📁 Project Structure

⚙️ Configuration

Default Settings (Optimal)

Skipped Directories (Automatic)

Environment Variables

🛠️ Technical Details

Architecture

API Rate Limits

Dependencies

❓ FAQ

🗺️ Roadmap

🤝 Contributing

📄 License

🇸🇦 التوثيق بالعربي

⚡ البداية السريعة

🎯 إيه اللي بتعمله الأداة دي؟

💡 ليه تستخدم الأداة دي؟

المشكلة

الحل

🌟 المميزات الرئيسية

🔐 الأمان

🛡️ الموثوقية

📊 تحليل ذكي

🖥️ سهولة الاستخدام

🔑 إنشاء توكن GitHub

🤖 الاستخدام مع الذكاء الاصطناعي

الخطوة 1: ولّد التقرير

الخطوة 2: ابعت للـ AI

الخطوة 3: اكتب طلبك

💡 استخدامات إبداعية

❓ أسئلة شائعة

🗺️ خطة المستقبل

📄 الرخصة

👤 About the Author

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages