Fix fork divergence: add clean upstream PR workflow, fix rebase merge-base bug, generalize sync backup #84

Merged: carstenartur merged 2 commits into master from copilot/add-create-upstream-pr-workflow on Feb 28, 2026
Copilot AI commented Feb 28, 2026

Fork-specific commits were leaking into upstream PRs because the rebase workflow counted commits via the GitHub API (which includes all fork-deviation commits), and the sync workflow only preserved .github/workflows/ files, missing README.md, dependabot.yml, etc.

Changes

  • New: .github/workflows/create-upstream-pr.yml
    workflow_dispatch workflow to create clean upstream PR branches. Accepts issue_number, source_branch, commit_shas (comma-separated), and optional pr_title. Cherry-picks only the specified SHAs onto a fresh upstream/master-based branch (upstream-pr/issue-{N}), runs verification (git log + git diff --stat), force-pushes, and outputs the direct PR comparison URL:

    https://github.com/eclipse-jdt/eclipse.jdt.ui/compare/master...carstenartur:eclipse.jdt.ui:upstream-pr/issue-{N}
    
  • Fix: .github/workflows/rebase-upstream.yml — rebase step
    Replaced HEAD~${PR_COMMIT_COUNT} (broken: count from GitHub API includes fork-specific commits) with git merge-base to find the true fork point:

    MERGE_BASE=$(git merge-base HEAD upstream/master)
    ACTUAL_COMMITS=$(git rev-list --count $MERGE_BASE..HEAD)
    git rebase --onto upstream/master $MERGE_BASE

    Also removed the now-unused commit_count output from the "Get PR branch info" step.

  • Fix: .github/workflows/sync-upstream.yml — backup/restore steps (both scheduled and manual jobs)
    Replaced the dynamic upstream-diff scan (only caught .github/workflows/ files) with a manifest-driven approach: reads .github/fork-specific-files.txt line by line, backs up each listed file before reset, restores with cp -r + git add -A to handle any path. Uses POSIX-compatible case syntax instead of [[.

  • New: .github/fork-specific-files.txt
    Explicit manifest of fork-specific files to preserve across upstream syncs:

    .github/dependabot.yml
    .github/workflows/codacy.yml
    .github/workflows/codeql.yml
    .github/workflows/maven.yml
    .github/workflows/rebase.yml
    .github/workflows/rebase-upstream.yml
    .github/workflows/sync-upstream.yml
    .github/workflows/create-upstream-pr.yml
    .github/fork-specific-files.txt
    README.md
    

    The manifest includes itself so it survives sync resets.
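The manifest-driven backup and restore described for sync-upstream.yml can be sketched as below. This is a minimal illustration, not the workflow's actual steps: the temporary directories, the shortened manifest, the skipping of comment lines, and the `rm -rf` that stands in for the upstream reset are all assumptions made for the demo. The real workflow would also run `git add -A` after restoring.

```shell
#!/bin/sh
# Sketch: back up files listed in a manifest, simulate a reset, restore them.
set -eu

work=$(mktemp -d)     # stands in for the repository checkout (assumption)
backup=$(mktemp -d)   # stands in for the workflow's backup directory (assumption)
cd "$work"

# Hypothetical fork-specific files and a shortened manifest.
mkdir -p .github/workflows
echo "fork readme" > README.md
echo "fork ci" > .github/workflows/maven.yml
cat > .github/fork-specific-files.txt <<'EOF'
# blank lines and comment lines are skipped (an assumption of this sketch)

README.md
.github/workflows/maven.yml
.github/fork-specific-files.txt
EOF

# The reset removes the manifest too, so keep a copy to drive the restore.
cp .github/fork-specific-files.txt "$backup/manifest.txt"

# Backup: read the manifest line by line, using POSIX case instead of [[.
while IFS= read -r path || [ -n "$path" ]; do
  case "$path" in
    ''|'#'*) continue ;;
  esac
  if [ -e "$path" ]; then
    mkdir -p "$backup/files/$(dirname "$path")"
    cp -r "$path" "$backup/files/$path"
  fi
done < "$backup/manifest.txt"

# Simulated reset to upstream: fork files vanish, README.md is overwritten.
rm -rf README.md .github
echo "upstream readme" > README.md

# Restore: copy every backed-up path back into place.
while IFS= read -r path || [ -n "$path" ]; do
  case "$path" in
    ''|'#'*) continue ;;
  esac
  if [ -e "$backup/files/$path" ]; then
    mkdir -p "$(dirname "$path")"
    cp -r "$backup/files/$path" "$path"
  fi
done < "$backup/manifest.txt"

cat README.md
```

Because the manifest lists itself, it is restored along with the other files, so the next sync run can read it again.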

Original prompt

Problem

The fork carstenartur/eclipse.jdt.ui has a persistent divergence from eclipse-jdt/eclipse.jdt.ui. When PRs are opened from fork branches against upstream, fork-specific commits leak into the PR. This happened with PR #2779, which was supposed to contain only a fix for issue eclipse-jdt#1865 (overload resolution detection in Move Instance Method) but instead included 38 commits and 16 changed files, 10 of which were fork-specific (workflows, badges, pom changes, etc.).

Three Things to Fix

1. Add a create-upstream-pr.yml workflow

Create .github/workflows/create-upstream-pr.yml — a workflow_dispatch workflow that automates creating clean upstream PR branches. It should:

  • Accept inputs: issue_number, source_branch, commit_shas (comma-separated SHAs to cherry-pick), and pr_title
  • Fetch upstream/master from https://github.com/eclipse-jdt/eclipse.jdt.ui.git
  • Create a branch upstream-pr/issue-{issue_number} based on upstream/master
  • Cherry-pick only the specified commit SHAs onto this clean branch
  • Force-push the branch to the fork
  • Output a direct link to create the PR on upstream: https://github.com/eclipse-jdt/eclipse.jdt.ui/compare/master...carstenartur:eclipse.jdt.ui:upstream-pr/issue-{issue_number}
  • Add a verification step that runs git log upstream/master..HEAD --oneline and git diff upstream/master..HEAD --stat so the user can see exactly what will be in the PR
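The input handling for such a workflow could look roughly like the following sketch. It only shows how the comma-separated `commit_shas` input might be split and validated and how the branch name and comparison URL are derived; the SHA values are hypothetical placeholders, and the cherry-pick commands are printed rather than executed so the snippet runs without a repository.

```shell
#!/bin/sh
# Sketch: parse workflow inputs and derive the branch name and compare URL.
set -eu

ISSUE_NUMBER=1865
# Hypothetical SHAs, for illustration only.
COMMIT_SHAS="a1b2c3d, 0123456789abcdef0123456789abcdef01234567"

BRANCH="upstream-pr/issue-${ISSUE_NUMBER}"

# Split on commas, trim whitespace, and reject anything that is not hex.
SHA_LIST=""
old_ifs=$IFS
IFS=','
for raw in $COMMIT_SHAS; do
  sha=$(printf '%s' "$raw" | tr -d '[:space:]')
  case "$sha" in
    '') continue ;;
    *[!0-9a-fA-F]*) echo "Invalid SHA: $sha" >&2; exit 1 ;;
  esac
  SHA_LIST="${SHA_LIST:+$SHA_LIST }$sha"
done
IFS=$old_ifs

# In the real workflow each SHA would be cherry-picked onto a branch
# created from upstream/master; here the commands are only printed.
for sha in $SHA_LIST; do
  echo "git cherry-pick $sha"
done

COMPARE_URL="https://github.com/eclipse-jdt/eclipse.jdt.ui/compare/master...carstenartur:eclipse.jdt.ui:${BRANCH}"
echo "$COMPARE_URL"
```

Validating the SHAs before any git command runs keeps a malformed input from leaving a half-built branch behind.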

2. Fix the existing rebase-upstream.yml workflow

The current rebase-upstream.yml has a bug: it uses HEAD~${PR_COMMIT_COUNT} to determine which commits to rebase, but PR_COMMIT_COUNT comes from the GitHub API which counts ALL commits that differ between the PR branch and upstream — including fork-deviation commits. This means it replays fork-specific commits onto upstream.

Fix the rebase step in .github/workflows/rebase-upstream.yml to use git merge-base instead:

Replace the current rebase logic (lines 142-165):

git remote add upstream https://github.com/eclipse-jdt/eclipse.jdt.ui.git
git fetch upstream master

PR_COMMIT_COUNT=${{ steps.pr.outputs.commit_count }}
...
git rebase --onto upstream/master HEAD~${PR_COMMIT_COUNT}

With:

git remote add upstream https://github.com/eclipse-jdt/eclipse.jdt.ui.git
git fetch upstream master

# Find the actual fork point with upstream
MERGE_BASE=$(git merge-base HEAD upstream/master)
echo "Merge base with upstream: $MERGE_BASE"

# Count actual commits to rebase (between merge-base and HEAD)
ACTUAL_COMMITS=$(git rev-list --count $MERGE_BASE..HEAD)
echo "Actual commits to rebase: $ACTUAL_COMMITS"

if [ "$ACTUAL_COMMITS" -eq 0 ]; then
  echo "Error: No commits to rebase"
  echo "success=false" >> $GITHUB_OUTPUT
  exit 0
fi

# Rebase using merge-base (correct: only replays commits after the fork point)
if git rebase --onto upstream/master $MERGE_BASE; then
  echo "success=true" >> $GITHUB_OUTPUT
else
  echo "success=false" >> $GITHUB_OUTPUT
  git rebase --abort
fi

Also remove the commit_count output from the "Get PR branch info" step since it's no longer needed by the rebase logic (but keep the rest of that step intact).
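To see what the merge-base approach does, the following sketch builds a throwaway repository and replays only the commits made after the fork point. Everything in it is an assumption for illustration: the local branch `upstream-master` stands in for the remote-tracking `upstream/master`, and the file names and commit messages are made up.

```shell
#!/bin/sh
# Sketch: find the fork point with merge-base and rebase only what follows it.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Local stand-in for upstream/master (assumption for this demo).
git symbolic-ref HEAD refs/heads/upstream-master

# Helper: commit a new file so that no commit is empty.
commit() { echo "$1" > "$2"; git add "$2"; git commit -q -m "$1"; }

commit "upstream: base" base.txt
git branch feature                      # the feature branch forks here
commit "upstream: later work" later.txt

git checkout -q feature
commit "feature: fix 1" fix1.txt
commit "feature: fix 2" fix2.txt

# Fork point and the number of commits that will be replayed.
MERGE_BASE=$(git merge-base HEAD upstream-master)
ACTUAL_COMMITS=$(git rev-list --count "$MERGE_BASE..HEAD")
echo "Actual commits to rebase: $ACTUAL_COMMITS"

# Replay only the commits after the fork point onto the upstream tip.
git rebase -q --onto upstream-master "$MERGE_BASE"
AFTER=$(git rev-list --count upstream-master..HEAD)
echo "Commits ahead of upstream after rebase: $AFTER"
```

The count comes from the local commit graph rather than from an externally supplied number, so it cannot drift from what `git rebase --onto` actually replays.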

3. Improve sync-upstream.yml to also backup/restore non-workflow fork-specific files

The current sync-upstream.yml only backs up fork-specific files under .github/workflows/. It should also handle:

  • .github/dependabot.yml
  • README.md (if it differs from upstream, i.e., contains fork badges)

Update the backup step to scan for ALL files that differ between the fork and upstream (not just .github/workflows/), using a configurable list or by diffing. A simple approach: maintain a file called .github/fork-specific-files.txt that lists the paths of fork-specific files (one per line). The sync workflow reads this file to know what to back up and restore.

Create .github/fork-specific-files.txt with:

.github/dependabot.yml
.github/workflows/codacy.yml
.github/workflows/codeql.yml
.github/workflows/maven.yml
.github/workflows/rebase.yml
.github/workflows/rebase-upstream.yml
.github/workflows/sync-upstream.yml
.github/workflows/create-upstream-pr.yml
.github/fork-specific-files.txt
README.md

Update sync-upstream.yml to use this file list instead of only scanning .github/workflows/.

Summary of Files to Create/Modify

  1. CREATE .github/workflows/create-upstream-pr.yml — new workflow
  2. MODIFY .github/workflows/rebase-upstream.yml — fix the rebase logic to use merge-base
  3. MODIFY .github/workflows/sync-upstream.yml — use fork-specific-files.txt for backup/restore
  4. CREATE .github/fork-specific-files.txt — list of fork-only files

This pull request was created from Copilot chat.

…improve sync-upstream

Co-authored-by: carstenartur <3164220+carstenartur@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add create upstream PR workflow for clean branches" to "Fix fork divergence: add clean upstream PR workflow, fix rebase merge-base bug, generalize sync backup" on Feb 28, 2026
@carstenartur carstenartur marked this pull request as ready for review February 28, 2026 10:58
@carstenartur carstenartur merged commit e0238e6 into master Feb 28, 2026