Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: logformatter: link to correct PR base #23081

Merged

Conversation

edsantiago
Copy link
Member

@edsantiago edsantiago commented Jun 24, 2024

Two enormous misunderstandings:

  1. $CIRRUS_BASE_SHA is worthless. I thought it was, you know, the BASE SHA of the current commit, but (as best I can tell) it seems to be the SHA of the most recent commit on the destination branch. Cirrus docs are unhelpful. Anyhow, it's clearly not anything useful. Stop using it.

  2. $EPOCH_TEST_COMMIT is closer to what we want. It is defined in Makefile as the git merge-base. But for unknown reasons it was being clobbered in CI scripts, and it doesn't seem to work in all contexts, so, eliminate it from CI setup scripts. Leave it only in Makefile.

This leaves us with no option other than defining our own merge-base variable, PR_BASE_SHA. Do so and pass it along to rootless jobs.

Signed-off-by: Ed Santiago [email protected]

None

@openshift-ci openshift-ci bot added release-note-none approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 24, 2024
@edsantiago
Copy link
Member Author

If this works, logformatted logs will link to 7b4f6ec as base, but CIRRUS_BASE_SHA will be fe18a87.

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I assume this is in reply to #23068 (comment)
I also couldn't find anything documenting the purpose of CIRRUS_BASE_SHA.

I do wonder now should this not remove all CIRRUS_BASE_SHA references from lib.sh, per comment it was only set there to use it for EPOCH_TEST_COMMIT. Now that you removed that part I would assume we can remove the entire block there.

@edsantiago
Copy link
Member Author

Yes, thank you for reporting that.

I left CIRRUS_BASE_SHA because of the comment mentioning hack/get_ci_vm.sh. I don't know how that script works, and looking into it would require poking down a cascade of other repos. @cevich do you know what that comment is in reference to? Can you check if it's safe to completely eliminate CIRRUS_BASE_SHA?

@cevich
Copy link
Member

cevich commented Jun 24, 2024

I also couldn't find anything documenting the purpose of CIRRUS_BASE_SHA.

This value is only relevant for PRs. It's simply the sha of the branch the PR derived from. Since it's coming from the CI system, I considered it more authoritative than a value derived from git merge-base. As Paul mentions, it was used to set EPOCH_TEST_COMMIT because the (ancient) vbatts/gitvalidate tool we depended on was sensitive to that envar. I think there were a few places in the Makefile that also relied on an accurate EPOCH_TEST_COMMIT in a CI context.

I do wonder now should this not remove all CIRRUS_BASE_SHA references from lib.sh, per
comment it was only set there to use it for EPOCH_TEST_COMMIT.

As-is today, it wouldn't surprise me if many/most of these original use-cases have been moved/removed.

comment mentioning hack/get_ci_vm.sh

This isn't too hard to get your head around 😉

  1. c/automation hosts cirrus-ci_env, a python mini-implementation of a .cirrus.yml parser. It's only job is to extract all envars given a task name (including from a matrix).
  2. c/automation_images hosts get_ci_vm, this bundles cirrus-ci_env with an entry-point script.
  3. When a user runs hack/get_ci_vm.sh there are NO magic variables set. The hack script simply runs what is defined as the repo's "setup" steps. You can see/check what it does by examining the --setup condition.
  4. The only really complex part of c/automation_images get_ci_vm is the VM provisioning/orchestration.
  5. Each repo's "setup" steps are then required to define any missing/needed CIRRUS_* variables. And ensure the values persist (normally by adding them into /etc/ci_environment on the VM after it's provisioned.

So, the hack script only indirectly depends on any CIRRUS_* values insofar as the rest of the repo. scripts require them. Even better, because the hack script uses the current (local) repo state, you can easily test "setup" changes by just using the tool to run a few tasks 😄

Edit: Opened containers/automation_images#362

@edsantiago
Copy link
Member Author

[CIRRUS_BASE_SHA is] simply the sha of the branch the PR derived from.

It actually isn't: that's the purpose of this PR.

I interpret your comments as meaning "we don't care about CIRRUS_BASE_SHA" so I've removed it and re-pushed

@cevich
Copy link
Member

cevich commented Jun 24, 2024

it seems to be the SHA of the most recent commit on the
destination branch.

I'm misunderstanding something. Are you saying if a branch has commits 1, 2, and 3. CI running on a PR based on commit 2, will have CIRRUS_BASE_SHA set to commit 3?!?!?!?!

If so, that is indeed astonishing. To the point I would call it a Cirrus-CI bug.

meaning "we don't care about CIRRUS_BASE_SHA"

We only care about it to the extent that all of the other CI scripts and Makefile and whatnot care about it. The hack/get_ci_vm.sh script should be "mostly" decoupled from the specifics of CI in each repo. The only places where tight coupling can occur is in the --setup conditional block of the script.

That said, the CI simulation isn't perfect and interactions w/ lib.sh can be complex/unexpected. So it would be good to manually test a few tasks w/ the hack script to make sure CIRRUS_BASE_SHA really doesn't matter. I would suggest a 'build' and 'validate' tasks test run.

@edsantiago
Copy link
Member Author

As I'm understanding it, CIRRUS_BASE_SHA means "whatever is the most recent commit on main at the time you submit your PR, no matter what your PR is branched from":

A---B---C---D---E    <--- HEAD on main
         \
          PR    <--- CIRRUS_BASE_SHA = E

Then if time passes, three more PRs merge into main (F, G, H), and you repush your PR without rebasing, CIRRUS_BASE_SHA = H

@cevich
Copy link
Member

cevich commented Jun 24, 2024

CIRRUS_BASE_SHA = H

Ewww, yes, that is indeed broken. I would/have expected it to remain "C" in your example, regardless of what happens on main.

I'll report this to Cirrus as a bug and if they can't/won't fix it, recommend they update the docs instead. "Base SHA if current build was triggered by a PR" isn't specific enough regardless.

Edit: Filed cirruslabs/cirrus-ci-docs#1279

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@edsantiago edsantiago marked this pull request as draft June 24, 2024 16:17
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 24, 2024
@edsantiago
Copy link
Member Author

Thanks, but it's not working! It's skipping all CI tests. EPOCH_TEST_COMMIT must be wonky but I can't find an envariable dump in which to see what it is. Marking draft to avoid merging this yet.

@edsantiago edsantiago force-pushed the logformatter-base-sha branch 7 times, most recently from 2e8ae3d to 39ebb5e Compare June 25, 2024 11:29
Two enormous misunderstandings:

  1) $CIRRUS_BASE_SHA is worthless. I thought it was, you know,
     the BASE SHA of the current commit, but (as best I can tell)
     it seems to be the SHA of the most recent commit on the
     destination branch. Cirrus docs are unhelpful. Anyhow,
     it's clearly not anything useful. Stop using it.

  2) $EPOCH_TEST_COMMIT is closer to what we want. It is
     defined in Makefile as the git merge-base. But for unknown
     reasons it was being clobbered in CI scripts, and it
     doesn't seem to work in all contexts, so, eliminate it
     from CI setup scripts. Leave it only in Makefile.

This leaves us with no option other than defining our own
merge-base variable, PR_BASE_SHA. Do so and pass it along
to rootless jobs.

Signed-off-by: Ed Santiago <[email protected]>
@edsantiago edsantiago marked this pull request as ready for review June 25, 2024 14:25
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 25, 2024
Copy link
Member Author

@edsantiago edsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After much hairpulling, I believe this is working and ready for review. Sample log, scroll to top and note the Base commit link.

@@ -276,7 +276,6 @@ help: ## (Default) Print listing of key targets with their descriptions

.PHONY: lint
lint: golangci-lint
@echo "Linting vs commit '$(call err_if_empty,EPOCH_TEST_COMMIT)'"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As best I can tell, PRE_COMMIT does not use EPOCH_TEST_COMMIT. This echo was added in #5066, back in February 2020, but I think it was a mistake.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure pre-cmmit doesn't. I know vbatts/gitvalidate does use it though (in case that matters here).

base=$(git merge-base $DEST_BRANCH $head)
# shellcheck disable=SC2154
base=$PR_BASE_SHA
echo "_bail_if_test_can_be_skipped: head=$head base=$base"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debugging printf. Does anyone mind if I leave it in for a little while?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine IMO

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

openshift-ci bot commented Jun 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cevich
Copy link
Member

cevich commented Jun 25, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 25, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 4220ee0 into containers:main Jun 25, 2024
90 checks passed
@edsantiago edsantiago deleted the logformatter-base-sha branch June 25, 2024 18:29
edsantiago added a commit to edsantiago/libpod that referenced this pull request Jun 28, 2024
Followup to containers#23081, which broke this test on nightly cron.

Signed-off-by: Ed Santiago <[email protected]>
@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2024
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Sep 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants