Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pr_changed_files script to detect changed files in a PR #148

Merged
merged 28 commits into from
Feb 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
76698ea
Add `pr_changed_files` script to detect if files changed in a PR
iangmaia Feb 4, 2025
ea14583
Fix shellsheck warning
iangmaia Feb 4, 2025
1be6a12
Update `CHANGELOG.md`
iangmaia Feb 4, 2025
b528c2e
DRY error parameter validation message
iangmaia Feb 5, 2025
61741ea
Apply suggestions from code review
iangmaia Feb 5, 2025
0790ee2
DRY arguments parsing code
iangmaia Feb 5, 2025
80894ec
Fix mode parsing
iangmaia Feb 5, 2025
d7e9173
Add tests for `pr_changed_files`
iangmaia Feb 5, 2025
a6296b0
Improve handling of filenames with spaces / multiple files in `pr_cha…
iangmaia Feb 5, 2025
cedbacc
Improve shellcheck comment
iangmaia Feb 5, 2025
98fd705
Fix for running tests on CI
iangmaia Feb 5, 2025
f33a25b
Update documented examples
iangmaia Feb 5, 2025
1989402
Update to get list of changed files as an array
iangmaia Feb 5, 2025
a8294c7
Fix code to unescape file names
iangmaia Feb 6, 2025
8ba279f
Rename arguments `--includes` to `--any-match` and `--only-in` to `--…
iangmaia Feb 6, 2025
ff689a2
Add `git fetch origin` to make sure the branch is fetched in the agent
iangmaia Feb 6, 2025
15e7a3c
Extract file pattern match to a function
iangmaia Feb 6, 2025
45fefcf
Add more examples and details in the command documentation
iangmaia Feb 6, 2025
bd983b8
Improve file pattern arguments handling & tests
iangmaia Feb 6, 2025
5617e47
Update documentation (suggestions from code review)
iangmaia Feb 7, 2025
0e208ef
Update bin/pr_changed_files
iangmaia Feb 7, 2025
2277ed9
Apply suggestions from code review
iangmaia Feb 10, 2025
13d4dfb
Consistently use the order 1. "all-match" 2. "any-match"
iangmaia Feb 10, 2025
262a0d2
Print result for succeeded tests and improve empty states tests
iangmaia Feb 10, 2025
d794f96
Remove post-process of the quotes and escaped characters in the files…
iangmaia Feb 11, 2025
4cc76b2
Fix escaping in all_match_patterns tests
iangmaia Feb 11, 2025
40cac25
Improve tests with special characters
AliSoftware Feb 11, 2025
ee11f58
Use single quotes on other tests too
AliSoftware Feb 11, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 26 additions & 0 deletions .buildkite/pipeline.yml
Original file line number Diff line number Diff line change
Expand Up @@ -92,3 +92,29 @@ steps:
notify:
- github_commit_status:
context: "install_swiftpm_dependencies Tests: Xcode Workspace and Project - Explicit"

Copy link
Contributor Author

@iangmaia iangmaia Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mokagio and @AliSoftware !

Gathering everything in a comment to make sure I'm not missing anything (and I'll continue tomorrow).

I've implemented a few updates, and the most relevant change was to add some tests.

These are my next steps in the PR:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AliSoftware @mokagio
I've updated all the items above (I've added the commit in which each change was implemented).

Now I've been trying to use a8-ci-toolkit-buildkite-plugin by pointing it directly to this branch in another repo, to make sure the general solution works as expected. See woocommerce/woocommerce-ios#15086.

I'm facing an issue (that is a bit obvious, in retrospect) where the pr_changed_files command we're trying to run on .buildkite/shared-pipeline-vars isn't installed yet, as the version is defined in the .buildkite/shared-pipeline-vars itself (see the reported error).

One simple alternative I can think of: call buildkite-agent plugin install $CI_TOOLKIT directly in shared-pipeline-vars, but that would of course slow down the upload a bit 🤔
What do you think? Any other ideas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm facing an issue (that is a bit obvious, in retrospect) where the pr_changed_files command we're trying to run on .buildkite/shared-pipeline-vars isn't installed yet, as the version is defined in the .buildkite/shared-pipeline-vars itself

Oh shoot, didn't think about that… right…

One simple alternative I can think of: call buildkite-agent plugin install $CI_TOOLKIT directly in shared-pipeline-vars, but that would of course slow down the upload a bit 🤔

Yeah. Tricky.

1️⃣ buildkite-agent plugin install

I guess it depends how much time it will add.

It will only add that time for pipelines that do want to use pr_changed_files and thus need to install the plugin in the uploader agent—without affecting other pipelines that don't use pr_changed_files in their shared-pipeline-vars and won't call buildkite-agent plugin install. And given the goal of this trick is to be able to skip e.g. UI Test jobs on builds where they're not needed, if the installation of the plugin takes only 1–2 seconds, then the time lost will be balanced tenfold with the benefit to avoid the build time of many skippable jobs…

2️⃣ Another option would be to download just that binary ourselves (curl https://raw.githubusercontent.com/Automattic/a8c-ci-toolkit-buildkite-plugin/refs/heads/trunk/bin/pr_changed_files + some chmod +x…) but at that point that would kinda be reinventing the wheel compared to what buildkite-agent plugin install does (sure it'd reduce the download to only that specific binary instead of installing the whole plugin, but not sure there'd be much of a difference, those bin/ scripts in our plugin are not that big)

3️⃣ Bootstrap script on the uploader agent

The last option I could think of would be to install the script directly into the agents at the time we provision and deploy them.

At that point that last option 3️⃣ feels a bit too convoluted just to avoid the simpler call to buildkite-agent plugin install… but it's still a possibility to be aware of…

Copy link
Contributor Author

@iangmaia iangmaia Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out something like buildkite-agent plugin install was how I remembered we did to install it in the agents, but in fact we just download them during the agent provisioning 😓
I don't think there's an easy command to dynamically install plugins, only use them in pipeline jobs.

Which makes me think of two other alternatives:
4️⃣

  - label: "Check Changes"
    key: check-changes
    command: |
      SKIP_UI_TESTS=$(pr_changed_files --all-match "*.md")
      buildkite-agent meta-data set "SKIP_UI_TESTS" "$SKIP_UI_TESTS"
    plugins: [$CI_TOOLKIT]

And make the more expensive jobs depend on it 🤔 (assuming I can check then SKIP_UI_TESTS in the pipeline, but I haven't tested it).

And 5️⃣
Just use it on a case by case basis either in the script itself (e.g. .buildkite/commands/run-ui-tests.sh or equivalent) or in the command in yml attribute.


3️⃣ even if it is a bit more complex to build, it's easier to use in the end. Alternatively, 4️⃣ or 5️⃣ could be a simple starting point to try the command.

Copy link
Contributor

@AliSoftware AliSoftware Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4️⃣ wouldn't work, at least not as pitched of using SKIP_UI_TESTS metadata in the if: attribute of UI Tests step.

image

But alternatives of that include:

  • 4️⃣ 🅰️ either use buildkite-agent pipeline upload to upload the UITests steps conditionally (after having extracted their steps in a dedicated *.yml file)
  • 4️⃣ 🅱️ or keep buildkite-agent meta-data set and let the UITests steps start a job in all cases (instead of the creation of the job itself being skipped by if: attribute), and check the metadata value at the start of the job's script. Which would mean we'd still have the UITests job request an agent and start the build, but it'd exit early.

🅰️ Would have the benefit of preventing to create the UITests job to begin with, reducing the need to assign a CI agent for nothing. But that also means it wouldn't generate the GitHub CI check notification, and so that the "Run UI Tests" check would never be marked as done in the PR. If that check is marked as required for the PR to be merged, that is going to be an issue.

🅱️ Has the drawback of requesting a CI agent even if it will exit early and have nothing to build, but has the benefit of ensuring the CI check notification will be sent to the PR if that check is required.


But if we go with 4️⃣ 🅱️ at that point we might as well just run the pr_changed_files directly in the UI Tests command (instead of running it in a separate step, set it as metadata, and read the metadata in the UI Tests command…), aka your option 5️⃣

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test with 4️⃣ 🅱️ / 5️⃣: woocommerce/woocommerce-ios#15086 / https://buildkite.com/automattic/woocommerce-ios/builds/28092#0194e1ba-9cf5-4ccc-8cfd-9bd73e0b1494/286-288.
It's pretty straightforward, but another downside as you mention is the fact that it still uses some CI resources.

In any case, I believe we can move forward with this PR, as the final solution on how this is going to be used should be independent of the implementation.

- group: ":github: pr_changed_files Tests"
Copy link
Contributor

@AliSoftware AliSoftware Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 What's a bit funny and meta is that once this new pr_changed_files helper lands in trunk, we could consider use it on itself to only run its unit tests if the script itself has changed, i.e.

Suggested change
- group: ":github: pr_changed_files Tests"
- group: ":github: pr_changed_files Tests"
if: build.env('RUN_PR_CHANGED_FILES_TESTS')

After adding RUN_PR_CHANGED_FILES_TESTS=$(pr_changed_files --includes "bin/pr_changed_files" "tests/pr_changed_files/*") to the shared-pipeline-vars of this repo 😛

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YES! I was thinking exactly the same 😄

steps:
- label: ":github: pr_changed_files Tests - Basic Changes"
command: tests/pr_changed_files/test_basic_changes.sh
notify:
- github_commit_status:
context: "pr_changed_files Tests: Basic Changes"

- label: ":github: pr_changed_files Tests - Any Match"
command: tests/pr_changed_files/test_any_match_patterns.sh
notify:
- github_commit_status:
context: "pr_changed_files Tests: Any Match"

- label: ":github: pr_changed_files Tests - All Match"
command: tests/pr_changed_files/test_all_match_patterns.sh
notify:
- github_commit_status:
context: "pr_changed_files Tests: All Match"

- label: ":github: pr_changed_files Tests - Edge Cases"
command: tests/pr_changed_files/test_edge_cases.sh
notify:
- github_commit_status:
context: "pr_changed_files Tests: Edge Cases"
2 changes: 1 addition & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ _None._

### New Features

_None._
- Add `pr_changed_files` command to detect changes made in a Pull Request [#148]

### Bug Fixes

Expand Down
141 changes: 141 additions & 0 deletions bin/pr_changed_files
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
#!/bin/bash -eu

# This script checks if files are changed in a PR.
#
# Usage:
# pr_changed_files # Check if any files changed
# pr_changed_files --all-match <file-patterns...> # Check if changes are limited to given patterns
# pr_changed_files --any-match <file-patterns...> # Check if changes include files matching patterns
#
# Behavior:
# With no arguments:
# Returns "true" if the PR contains any changes, "false" otherwise
#
# With --all-match:
# Returns "true" if ALL changed files match AT LEAST ONE of the patterns
# Returns "false" if ANY changed file doesn't match ANY pattern
# Note: Will return "true" even if not all patterns are matched by the changed files.
# This mode is especially useful to check if the PR _only_ touches a particular subset of files/folders (but nothing else)
#
# With --any-match:
# Returns "true" if ANY changed file matches ANY of the patterns
# Returns "false" if NONE of the changed files match ANY pattern
# Note: Will return "true" even if the PR includes other files not matching the patterns.
# This mode is especially useful to check if the PR _includes_ (aka _contains at least_) particular files/folders
#
# Examples with expected outputs:
# # Check if any files changed, returning "true" if PR has changes, "false" otherwise
# $ pr_changed_files
#
# # Check if only documentation files changed (to skip UI tests for example)
# $ pr_changed_files --all-match "*.md" "docs/*"
# → "true" if PR only changes `docs/guide.md` and `README.md`
# → "true" if PR only changes `docs/image.png` (not all patterns need to match, ok if no *.md)
# → "false" if PR changes `docs/guide.md` and `src/main.swift` (ALL files need to match at least one pattern)
#
# # Check if any Swift files changed (to decide if we should run SwiftLint)
# $ pr_changed_files --any-match "*.swift" ".swiftlint.yml"
# → "true" if PR changes `src/main.swift` and `README.md` (AT LEAST one file matches one of the patterns)
# → "true" if PR changes `.swiftlint.yml`
# → "false" if PR only changes `README.md` (none of files match any of the patterns)
#
# Returns:
# Prints "true" if the condition is met, "false" otherwise
# Exits with code 0 regardless of if the condition was met or not, to allow easy variable assignment.
# Only exits with non-zero if the command invocation itself was incorrect (called outside of a PR context, incorrect arguments…)

if [[ ! "${BUILDKITE_PULL_REQUEST:-invalid}" =~ ^[0-9]+$ ]]; then
echo "Error: this tool can only be called from a Buildkite PR job" >&2
exit 1
fi

# Ensure we have the base branch locally
git fetch origin "$BUILDKITE_PULL_REQUEST_BASE_BRANCH" >/dev/null 2>&1 || {
echo "Error: failed to fetch base branch '$BUILDKITE_PULL_REQUEST_BASE_BRANCH'" >&2
exit 1
}

mode=""
patterns=()

# Define error message for mutually exclusive options
EXCLUSIVE_OPTIONS_ERROR="Error: either specify --all-match or --any-match; cannot specify both"

while [[ "$#" -gt 0 ]]; do
case $1 in
--all-match | --any-match)
if [[ -n "$mode" ]]; then
echo "$EXCLUSIVE_OPTIONS_ERROR" >&2
exit 1
fi
mode="${1#--}"
shift
# Check if there are any patterns after the flag
while [[ "$#" -gt 0 && "$1" != "--"* ]]; do
patterns+=("$1")
shift
done
if [[ "${#patterns[@]}" -eq 0 ]]; then
echo "Error: must specify at least one file pattern" >&2
exit 1
fi
;;
--*)
echo "Error: unknown option $1" >&2
exit 1
;;
*)
echo "Error: unexpected argument $1" >&2
exit 1
;;
esac
done

# Get list of changed files as an array
changed_files=()
while IFS= read -r -d '' file; do
changed_files+=("$file")
done < <(git --no-pager diff --name-only -z --merge-base "$BUILDKITE_PULL_REQUEST_BASE_BRANCH" HEAD | sort)

if [[ -z "$mode" ]]; then
# No arguments = any change
if [[ ${#changed_files[@]} -gt 0 ]]; then
echo "true"
else
echo "false"
fi
exit 0
fi

# Returns 0 if the file matches any of the patterns, 1 otherwise
file_matches_any_pattern() {
local file="$1"
shift
for pattern in "$@"; do
# shellcheck disable=SC2053 # We don't quote the rhs in the condition on the next line because we want to interpret pattern as a glob pattern
if [[ "$file" == ${pattern} ]]; then
return 0
fi
done
return 1
}

if [[ "$mode" == "all-match" ]]; then
# Check if all changed files match at least one pattern
for file in "${changed_files[@]}"; do
if ! file_matches_any_pattern "$file" "${patterns[@]}"; then
echo "false"
exit 0
fi
done
echo "true"
elif [[ "$mode" == "any-match" ]]; then
# Check if any changed file matches any pattern
for file in "${changed_files[@]}"; do
if file_matches_any_pattern "$file" "${patterns[@]}"; then
echo "true"
exit 0
fi
done
echo "false"
fi
59 changes: 59 additions & 0 deletions tests/pr_changed_files/test_all_match_patterns.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
#!/bin/bash -eu

set -o pipefail

source "$(dirname "${BASH_SOURCE[0]}")/test_helpers.sh"

echo "--- :git: Testing all-match pattern matching"

# Create test repository
repo_path=$(create_tmp_repo_dir)
trap 'cleanup_git_repo "$repo_path"' EXIT

# Set up environment variables
export BUILDKITE_PULL_REQUEST="123"
export BUILDKITE_PULL_REQUEST_BASE_BRANCH="base"

# Initialize the repository
init_test_repo "$repo_path"

# Create test files (using single quotes to avoid special chars being interpreted by the shell)
mkdir -p docs src/swift
echo "doc1" > 'docs/read me.md'
echo "doc2" > 'docs/guide with spaces.md'
echo "doc3" > 'docs/special\!@*#$chars.md'
git add .
git commit -m "Add doc files"

# [Test] All changes in docs
result=$(pr_changed_files --all-match "docs/*")
assert_output "true" "$result" "Should return true when all changes match patterns"

# [Test] All changes in docs with explicit patterns including spaces and special chars
# Note: we need to escape the '\` and `*` special chars in the pattern to match them literally instead of as special characters
result=$(pr_changed_files --all-match 'docs/read me.md' 'docs/guide with spaces.md' 'docs/special\\!@\*#$chars.md')
assert_output "true" "$result" "Should return true when all changes match patterns with spaces and special chars"

# [Test] All changes in docs with globbing patterns including spaces and special chars
result=$(pr_changed_files --all-match 'docs/read me.md' 'docs/guide with spaces.md' 'docs/special\\!*.md')
assert_output "true" "$result" "Should return true when all changes match patterns with spaces and special chars, even when using globbing"

# [Test] Changes outside pattern
echo "swift" > 'src/swift/main with spaces.swift'
echo "swift" > 'src/swift/special!\@#*$chars.swift'
git add .
git commit -m "Add swift file"

result=$(pr_changed_files --all-match "docs/*")
assert_output "false" "$result" "Should return false when changes exist outside patterns"

# [Test] Multiple patterns, all matching
# Note: we need to escape the '\` and `*` special chars in the pattern to match them literally instead of as special characters
result=$(pr_changed_files --all-match 'docs/*' 'src/swift/main with spaces.swift' 'src/swift/special\!\\@#\*$chars.swift')
assert_output "true" "$result" "Should return true when all changes match multiple patterns"

# [Test] Multiple patterns, all matching, including some using globbing
result=$(pr_changed_files --all-match 'docs/*' 'src/swift/main with spaces.swift' 'src/swift/special*chars.swift')
assert_output "true" "$result" "Should return true when all changes match multiple patterns, including some using globbing"

echo "✅ All-match pattern tests passed"
61 changes: 61 additions & 0 deletions tests/pr_changed_files/test_any_match_patterns.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
#!/bin/bash -eu

set -o pipefail

source "$(dirname "${BASH_SOURCE[0]}")/test_helpers.sh"

echo "--- :git: Testing any-match pattern matching"

# Create test repository
repo_path=$(create_tmp_repo_dir)
trap 'cleanup_git_repo "$repo_path"' EXIT

# Set up environment variables
export BUILDKITE_PULL_REQUEST="123"
export BUILDKITE_PULL_REQUEST_BASE_BRANCH="base"

# Initialize the repository
init_test_repo "$repo_path"

# Create test files (using single quotes to avoid special chars being interpreted by the shell)
mkdir -p docs src/swift src/ruby
echo "doc" > 'docs/read me.md'
echo "doc" > 'docs/special!@*#$chars.md'
echo "swift" > 'src/swift/main.swift'
echo "ruby" > 'src/ruby/main.rb'
git add .
git commit -m "Add test files"

# [Test] Match specific extension
result=$(pr_changed_files --any-match '*.swift')
assert_output "true" "$result" "Should match .swift files"

# [Test] Match multiple patterns
result=$(pr_changed_files --any-match 'docs/*.md' '*.rb')
assert_output "true" "$result" "Should match multiple patterns"

# [Test] Match files with spaces and special characters
result=$(pr_changed_files --any-match 'docs/read me.md' 'docs/special!@\*#$chars.md')
assert_output "true" "$result" "Should match files with spaces and special characters"

# [Test] Match files with spaces and special characters, even when using globbing
result=$(pr_changed_files --any-match 'docs/read me.md' 'docs/special*chars.md')
assert_output "true" "$result" "Should match files with spaces and special characters, even when using globbing"

# [Test] No matches
result=$(pr_changed_files --any-match '*.js')
assert_output "false" "$result" "Should not match non-existent patterns"

# [Test] Directory pattern
result=$(pr_changed_files --any-match 'docs/*')
assert_output "true" "$result" "Should match directory patterns"

# [Test] Exact pattern matching
echo "swiftfile" > swiftfile.txt
git add swiftfile.txt
git commit -m "Add file with swift in name"

result=$(pr_changed_files --any-match '*.swift')
assert_output "true" "$result" "Should only match exact patterns"

echo "✅ Any-match pattern tests passed"
32 changes: 32 additions & 0 deletions tests/pr_changed_files/test_basic_changes.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
#!/bin/bash -eu

set -o pipefail

source "$(dirname "${BASH_SOURCE[0]}")/test_helpers.sh"

echo "--- :git: Testing basic changes detection"

# Create test repository
repo_path=$(create_tmp_repo_dir)
trap 'cleanup_git_repo "$repo_path"' EXIT

# Set up environment variables
export BUILDKITE_PULL_REQUEST="123"
export BUILDKITE_PULL_REQUEST_BASE_BRANCH="base"

# Initialize the repository
init_test_repo "$repo_path"

# [Test] No changes
result=$(pr_changed_files)
assert_output "false" "$result" "Should return false when no files changed"

# [Test] Single file change
echo "change" > new.txt
git add new.txt
git commit -m "Add new file"

result=$(pr_changed_files)
assert_output "true" "$result" "Should return true when files changed"

echo "✅ Basic changes tests passed"
Loading