Organizational Network Analysis tool for GitHub repositories - analyze collaboration patterns through PR and review data

evalops/github-ona

GitHub Organizational Network Analysis (ONA) Tool

A Python tool for analyzing collaboration patterns and engineering productivity in GitHub repositories. It goes beyond basic metrics to provide actionable insights about bottlenecks, review times, team health, and more.

Features

🎯 Actionable Insights

  • Bottleneck Detection: Identifies stuck PRs and suggests alternative reviewers
  • Review Time Analytics: P50/P95/mean review times overall and per-reviewer
  • Bus Factor Analysis: Finds single points of failure in your review process
  • Collaboration Patterns: Identifies active reviewers, reciprocal relationships, and isolated contributors
  • Expertise Distribution: Maps who knows what based on review activity
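
As a rough illustration of the review time analytics listed above, the sketch below computes P50, P95, and mean from a list of time-to-first-review values in hours. The function name `review_time_summary`, the sample numbers, and the choice of the "inclusive" percentile method are assumptions made for this example; the tool's own implementation in insights.py may differ.

```python
from statistics import mean, quantiles


def review_time_summary(hours):
    """Summarize time-to-first-review values (in hours) as P50, P95 and mean.

    Hypothetical helper for illustration; not the tool's actual API.
    """
    if not hours:
        return None
    if len(hours) == 1:
        only = float(hours[0])
        return {"p50": only, "p95": only, "mean": only}
    # quantiles(n=100) returns 99 cut points: index 49 is P50, index 94 is P95.
    cuts = quantiles(hours, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "mean": mean(hours)}


# Made-up sample: hours until the first review for five PRs.
sample_hours = [1.0, 2.0, 3.0, 4.0, 10.0]
summary = review_time_summary(sample_hours)
print(summary)
```

A single slow review moves P95 and the mean far more than P50, which is why reporting all three is more informative than the mean alone. The same function could be applied per reviewer by grouping the values by reviewer first.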

📊 Network Metrics

  • Out-degree and in-degree centrality
  • Betweenness centrality (who bridges teams)
  • Eigenvector centrality (who reviews important reviewers)
  • Graph density and reciprocity
  • Network visualizations
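
To make the simpler network metrics concrete, here is a minimal standard-library sketch that computes in-degree, out-degree, density, and reciprocity for a directed review graph. It assumes an edge points from reviewer to PR author and that repeated reviews collapse into one edge; both conventions, the function name `network_summary`, and the sample logins are assumptions for this example. Betweenness and eigenvector centrality are omitted because they are usually delegated to a graph library.

```python
from collections import Counter


def network_summary(review_edges):
    """Basic directed-graph metrics for (reviewer, author) pairs.

    Hypothetical helper for illustration; not the tool's actual API.
    """
    # Collapse duplicate reviews and ignore self-reviews.
    edges = {(r, a) for r, a in review_edges if r != a}
    nodes = {name for edge in edges for name in edge}
    out_degree = Counter(r for r, _ in edges)  # distinct authors each person reviews
    in_degree = Counter(a for _, a in edges)   # distinct reviewers each author gets
    n = len(nodes)
    # Density: existing edges divided by the n * (n - 1) possible directed edges.
    density = len(edges) / (n * (n - 1)) if n > 1 else 0.0
    # Reciprocity: share of edges whose reverse edge also exists.
    mutual = sum(1 for r, a in edges if (a, r) in edges)
    reciprocity = mutual / len(edges) if edges else 0.0
    return {
        "out_degree": dict(out_degree),
        "in_degree": dict(in_degree),
        "density": density,
        "reciprocity": reciprocity,
    }


# Made-up sample: alice and bob review each other; carol reviews both.
sample_edges = [("alice", "bob"), ("bob", "alice"), ("carol", "alice"), ("carol", "bob")]
net = network_summary(sample_edges)
print(net)
```

In this sample carol never receives a review, so she has an out-degree of 2 and no in-degree entry, which is the kind of asymmetry the collaboration report surfaces.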

🔌 Integrations

  • GitHub Actions: CI/CD metrics, build times, DORA metrics (deployment frequency, lead time, change failure rate)
  • Linear: Issue tracking, cycle times, PR-to-issue correlation
  • Slack: Real-time bottleneck alerts and daily engineering digests

See INTEGRATIONS.md for setup instructions.

🤖 AI Analysis

  • AI Code Detection: Identifies bot reviewers and AI-generated code patterns (experimental)
  • Pattern Analysis: Detects verbose comments, comprehensive docstrings, consistent formatting
  • Bot Metrics: Track AI agent contributions vs human contributions
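
One plausible way to separate bot reviews from human reviews is sketched below. It relies on two GitHub conventions: the REST API reports a user `type` of `"Bot"` for app accounts, and app logins end in `[bot]`. The helper names and the sample review records are hypothetical, and the heuristics the tool actually uses in ai_detection.py are not shown in this README.

```python
def is_bot_account(user):
    """Return True if a GitHub user object looks like a bot account.

    Hypothetical helper for illustration; not the tool's actual API.
    """
    login = user.get("login", "")
    return user.get("type") == "Bot" or login.endswith("[bot]")


def bot_vs_human_counts(reviews):
    """Count reviews submitted by bot accounts versus human accounts."""
    counts = {"bot": 0, "human": 0}
    for review in reviews:
        # GitHub can return a null user (e.g. deleted accounts); treat as human.
        user = review.get("user") or {}
        counts["bot" if is_bot_account(user) else "human"] += 1
    return counts


# Made-up sample records shaped like GitHub review objects.
sample_reviews = [
    {"user": {"login": "dependabot[bot]", "type": "Bot"}},
    {"user": {"login": "alice", "type": "User"}},
    {"user": {"login": "example-reviewer[bot]", "type": "User"}},
    {"user": None},
]
counts = bot_vs_human_counts(sample_reviews)
print(counts)
```

Account-level checks like these only identify bot reviewers. Detecting AI-generated code written under a human account needs content heuristics, which is why that part is marked experimental.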

📁 File & Directory Analysis (Monorepo Insights)

  • File-Level Metrics: Most changed files, highest churn, slowest review times
  • Directory-Level Metrics: Service/module activity, cross-team collaboration
  • Hotspot Detection: Frequently changed files that may be bottlenecks
  • Ownership Patterns: Single owner vs shared vs distributed ownership
  • Cross-Service Changes: PRs that touch multiple directories/services
  • Churn Analysis: Total code churn by file and directory
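
The file-level ideas above can be sketched in a few lines. The example below treats churn as additions plus deletions, calls a file "single owner" when only one author has touched it, and flags a PR as cross-service when its files span more than one top-level directory. These definitions, the function name `file_metrics`, and the input shape are assumptions for illustration and may not match file_analysis.py.

```python
from collections import defaultdict


def file_metrics(prs):
    """Compute per-file churn, single-owner files and cross-directory PRs.

    Hypothetical helper for illustration; not the tool's actual API.
    """
    churn = defaultdict(int)
    authors = defaultdict(set)
    cross_dir_prs = []
    for pr in prs:
        top_dirs = set()
        for changed in pr["files"]:
            path = changed["path"]
            churn[path] += changed["additions"] + changed["deletions"]
            authors[path].add(pr["author"])
            # Files at the repository root are grouped under ".".
            top_dirs.add(path.split("/", 1)[0] if "/" in path else ".")
        if len(top_dirs) > 1:
            cross_dir_prs.append(pr["number"])
    single_owner = sorted(p for p, who in authors.items() if len(who) == 1)
    return dict(churn), single_owner, cross_dir_prs


# Made-up sample: PR 1 touches two services, PR 2 touches one.
sample_prs = [
    {"number": 1, "author": "alice", "files": [
        {"path": "api/server.py", "additions": 10, "deletions": 2},
        {"path": "web/app.ts", "additions": 5, "deletions": 0},
    ]},
    {"number": 2, "author": "bob", "files": [
        {"path": "api/server.py", "additions": 3, "deletions": 1},
    ]},
]
churn, single_owner, cross_dir_prs = file_metrics(sample_prs)
print(churn, single_owner, cross_dir_prs)
```

Sorting the churn mapping by value gives a simple hotspot ranking, and summing it by top-level directory gives the directory-level view.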

📤 Export & Reporting

  • JSON Export: Full metrics export for programmatic analysis
  • CSV Export: Multiple CSV files for Excel/BI tool integration
  • Markdown Export: Human-readable reports for documentation
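
For readers who want to post-process the exports, the sketch below shows one way metrics could be serialized to JSON and CSV with the standard library. The column names (`reviewer`, `reviews`, `median_hours`) and both function names are hypothetical; the actual schema written by export.py is not documented in this README.

```python
import csv
import io
import json


def to_json(metrics):
    """Serialize a metrics dict to a stable, pretty-printed JSON string."""
    return json.dumps(metrics, indent=2, sort_keys=True)


def reviewers_to_csv(rows):
    """Render per-reviewer rows as CSV text (hypothetical column names)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["reviewer", "reviews", "median_hours"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data.
sample_rows = [{"reviewer": "alice", "reviews": 12, "median_hours": 3.5}]
csv_text = reviewers_to_csv(sample_rows)
json_text = to_json({"reviewers": sample_rows})
print(csv_text)
print(json_text)
```

Writing one CSV per table (reviewers, files, directories) keeps each file flat enough for spreadsheet and BI tools, which is consistent with the "multiple CSV files" behavior described above.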

Setup

  1. Create and activate a virtual environment (skip if you already have one):
uv venv
source .venv/bin/activate
  2. Install dependencies:
uv pip install -r requirements.txt
  3. Authenticate with GitHub (skip if gh is already authenticated):
gh auth login

Usage

Basic Usage:

# Analyze a repository (automatically uses gh token)
./run.sh --owner anthropics --repo anthropic-sdk-python --max-prs 30

# Use sample data
./run.sh --sample-file ona_github/sample_data.json

With Integrations:

# Include CI/CD metrics
./run.sh --owner evalops --repo platform --max-prs 50 --with-ci

# Include Linear issue tracking
export LINEAR_API_KEY=lin_api_xxxxx
./run.sh --owner evalops --repo platform --with-linear

# Send Slack alerts
export SLACK_BOT_TOKEN=xoxb-xxxxx
./run.sh --owner evalops --repo platform --slack-channel "#eng-team"

# All together with AI detection, file analysis, and export
./run.sh --owner evalops --repo platform --max-prs 50 \
  --with-ci --with-linear \
  --detect-ai --file-analysis \
  --export-json metrics.json \
  --export-csv ./metrics \
  --slack-channel "#eng-metrics" --send-digest

Monorepo Analysis:

# Analyze file and directory patterns (perfect for monorepos)
./run.sh --owner evalops --repo platform --file-analysis

# Identify hotspots and cross-service changes
./run.sh --owner evalops --repo platform --max-prs 100 --file-analysis

See INTEGRATIONS.md for detailed setup instructions.

Export Formats:

# Export to JSON
./run.sh --owner evalops --repo platform --export-json report.json

# Export to CSV (creates multiple files)
./run.sh --owner evalops --repo platform --export-csv ./csv_output

# Export to Markdown
./run.sh --owner evalops --repo platform --export-markdown report.md

Manual usage:

# Activate venv
source .venv/bin/activate

# Set GitHub token
export GITHUB_TOKEN=$(gh auth token)

# Run analysis
python -m ona_github.main --owner OWNER --repo REPO --max-prs 30

Example Output

The tool provides a comprehensive report with:

📊 REVIEW TIME ANALYTICS
  - Median/P95/mean time to first review
  - Top reviewers by speed

🚨 BOTTLENECK DETECTION
  - Stuck PRs with alternative reviewer suggestions
  - Hours open and who's blocking

🤝 COLLABORATION PATTERNS
  - Most active reviewers
  - Most reviewed authors
  - Strong reciprocal relationships
  - Isolated contributors

🚌 BUS FACTOR ANALYSIS
  - Single points of failure
  - Critical people dependencies

👥 EXPERTISE DISTRIBUTION
  - Top contributors by activity percentage

📈 NETWORK METRICS
  - Density, reciprocity
  - Centrality scores per contributor

📁 FILE & DIRECTORY ANALYSIS (with --file-analysis)
  - Most changed files and directories
  - Code churn hotspots
  - Cross-service/cross-directory PRs
  - Ownership patterns (single vs shared)
  - Slowest-reviewed files/directories

🕸️ NETWORK VISUALIZATION
  - Visual graph of collaboration patterns

Project Structure

ona_github/
├── github_client.py    # GitHub API wrapper
├── graph_analysis.py   # Network analysis and metrics
├── insights.py         # Advanced insights (bottlenecks, review times, etc.)
├── file_analysis.py    # File/directory analysis for monorepos
├── ai_detection.py     # AI code pattern detection
├── export.py          # Multi-format export (JSON, CSV, Markdown)
├── main.py            # Entry point with enhanced reporting
├── sources/           # Integration modules (Linear, GitHub Actions, Slack)
└── sample_data.json   # Sample data for testing

What Makes This Different?

Unlike basic GitHub analytics tools, this provides:

  1. Actionable Intelligence: Not just "who reviews a lot" but "who's blocking PRs and who can help"
  2. Team Health Metrics: Bus factor, isolated contributors, collaboration patterns
  3. Predictive Insights: Review time patterns to anticipate bottlenecks
  4. Developer Experience Focus: Built to help developers get unblocked, not surveil them

Roadmap

See TODO.md for the comprehensive development roadmap.

Phase 1 - Completed:

  • GitHub Actions integration with DORA metrics
  • Linear integration for issue tracking
  • Slack integration for alerts and digests
  • AI code detection (bot reviewer identification)
  • Multi-format export (JSON, CSV, Markdown)
  • File-level and directory-level analysis for monorepos

🚧 Phase 2 - In Progress:

  • Multi-repo org-wide analysis
  • Automated daily runs via GitHub Actions
  • Time-series tracking and trend analysis

📋 Phase 3 - Planned:

  • AI impact metrics (ROI of AI agents)
  • Web dashboard (Streamlit)
  • Improved CLI UX
  • Report templates

🚀 Phase 4 - Future:

  • Anomaly detection with ML
  • GitHub App for continuous monitoring
  • Predictive analytics
  • Additional integrations (Sentry, PagerDuty, Jira)

See our issues for planned features. Contributions are welcome!
