feat(ci): restructure build-and-test workflow to reduce CI latency#303
feat(ci): restructure build-and-test workflow to reduce CI latency#303rstata wants to merge 5 commits intoairbnb:mainfrom
Conversation
|
Just pushed a fix: |
6939b80 to
b9c2785
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6939b80d02
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ] | ||
| has_wide = "true" if wide_combinations else "false" | ||
| with open(github_output, "a") as f: | ||
| f.write(f"coverage_os={coverage_os}\n") |
There was a problem hiding this comment.
Escape workflow outputs before writing user-provided strings
write_outputs emits coverage_os/coverage_java directly into $GITHUB_OUTPUT using raw key=value lines, but those values come from workflow inputs and are only type-validated as strings/ints. A JSON string containing a newline (for example via OS_INPUT) will be parsed successfully and then injected as extra output records, which can overwrite has_wide/wide_matrix and cause downstream jobs to run the wrong matrix or skip checks. Please reject control characters or use multiline-safe output encoding before writing these values.
Useful? React with 👍 / 👎.
The previous workflow had a build matrix followed by a test matrix, with tests unable to start until every build coordinate finished (GitHub Actions `needs:` is all-or-nothing across a matrix). Linting and coverage verification were also sequenced behind the full test matrix. The new structure shortens the critical path by: - validate-inputs: validates workflow dispatch inputs, runs the Python CI script unit tests, and computes the deep coordinate and wide matrix for downstream jobs. CI script tests run here because validate-inputs executes those scripts directly — a script regression would produce untrustworthy outputs, so testing them first halts everything before any downstream job acts on bad data. - build-deep: a single coordinate (first OS × first Java from the input lists, defaulting to ubuntu-latest × 17). Runs assembly and uploads a build artifact for downstream jobs to consume. - test-deep: same single coordinate, depends on build-deep. Runs tests, coverage report generation, and coverage threshold verification via the new `testWithCoverage` Gradle task. - docs-verify: same single coordinate, depends on build-deep, runs in parallel with test-deep. Generates Dokka API docs, builds the mkdocs site, and checks for broken links with djlint. - test-wide: the remaining OS × Java coordinates. Each wide coordinate runs `./gradlew test` and depends only on validate-inputs, so all wide jobs start immediately — they don't wait for build-deep. - detekt / ktlint: depend only on validate-inputs (source checkout is sufficient), so they also run in parallel with build-deep. - coverage-verification is eliminated as a separate job; coverage report generation and threshold checking are folded into test-deep via the new `testWithCoverage` Gradle task (test + testCodeCoverageReport + testCodeCoverageVerification). validate_inputs.py computes and exposes the deep coordinate (coverage_os, coverage_java) and the wide matrix (wide_matrix JSON array, has_wide flag) as GITHUB_OUTPUT values, so the workflow never hardcodes OS or Java version assumptions. `./gradlew clean` has been removed from build-deep. On a fresh CI runner the build directory is already empty, so clean is a no-op. More importantly, Gradle's build cache is content-addressed: cache keys are derived from all task inputs (source files, classpaths, JVM args), so a cached result is only reused when the inputs are identical to a prior passing run. We trust that mechanism to ensure correctness rather than forcing unconditional recompilation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
b9c2785 to
3cbacbd
Compare
GRADLE_OPTS_EXTRA is not a variable that gradlew recognises; it was silently ignored, meaning the intended JVM options (-Dorg.gradle.parallel=false, -Dorg.gradle.caching=true, -Dorg.gradle.daemon=false) were never actually applied to any Gradle invocation. GRADLE_OPTS is the variable the Gradle wrapper reads. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3cbacbd to
849a264
Compare
Two changes: 1. Lift GRADLE_OPTS, BUILDSCAN_PUBLISH, and BUILDSCAN_AUTOACCEPTTERMS to a workflow-level env block. These three variables were copied verbatim into every Gradle job; the workflow-level block makes them implicit for all jobs without repetition. 2. Replace the detekt and ktlint matrix strategies with direct references to the coverage_os and coverage_java outputs from validate-inputs. The matrix expressions (fromJSON(format(...))) were computing exactly the same value — the first element of each input list — that validate-inputs already exposes. Removing the matrix also simplifies the job names and artifact names. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Setting no_cache=true on a manual workflow_dispatch disables the Gradle build cache for the entire run, forcing every task to execute from scratch rather than restoring results from a prior run. Implemented by making the -Dorg.gradle.caching flag in the workflow-level GRADLE_OPTS conditional: !inputs.no_cache evaluates to true normally (caching on) and to false when no_cache is checked (caching off). All ./gradlew invocations inherit this without any per-step changes. On push and pull_request triggers inputs.no_cache is unset, which GitHub Actions treats as falsy, so caching remains on by default. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
03b81cf to
06d1d8d
Compare
…orkflow Renames ci-manual-trigger.yml to release-full-test.yml, a purpose-built workflow for release validation that always runs with the Gradle build cache disabled (use_gradle_cache=false), forcing every task to execute from scratch across all three sub-workflows (build-and-test, standalone-demoapp-tests, bcv_api_check). Updates RELEASE-RUNBOOK.md to reference the new workflow name. All three sub-workflows gain a use_gradle_cache boolean input (default true, preserving normal CI behavior). release-full-test passes false to each of them. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
06d1d8d to
4413589
Compare
The previous workflow had a build matrix followed by a test matrix, with
tests unable to start until every build coordinate finished (GitHub Actions
needs:is all-or-nothing across a matrix). Linting and coverageverification were also sequenced behind the full test matrix.
The new structure shortens the critical path by:
validate-inputs: validates workflow dispatch inputs, runs the Python CI
script unit tests, and computes the deep coordinate and wide matrix for
downstream jobs. CI script tests run here because validate-inputs
executes those scripts directly — a script regression would produce
untrustworthy outputs, so testing them first halts everything before any
downstream job acts on bad data.
build-deep: a single coordinate (first OS × first Java from the input
lists, defaulting to ubuntu-latest × 17). Runs assembly and uploads a
build artifact for downstream jobs to consume.
test-deep: same single coordinate, depends on build-deep. Runs tests,
coverage report generation, and coverage threshold verification via the
new
testWithCoverageGradle task.docs-verify: same single coordinate, depends on build-deep, runs in
parallel with test-deep. Generates Dokka API docs, builds the mkdocs
site, and checks for broken links with djlint.
test-wide: the remaining OS × Java coordinates. Each wide coordinate
runs
./gradlew testand depends only on validate-inputs, so all widejobs start immediately — they don't wait for build-deep.
detekt / ktlint: depend only on validate-inputs (source checkout is
sufficient), so they also run in parallel with build-deep.
coverage-verification is eliminated as a separate job; coverage report
generation and threshold checking are folded into test-deep via the new
testWithCoverageGradle task (test + testCodeCoverageReport +testCodeCoverageVerification).
validate_inputs.py computes and exposes the deep coordinate (coverage_os,
coverage_java) and the wide matrix (wide_matrix JSON array, has_wide flag)
as GITHUB_OUTPUT values, so the workflow never hardcodes OS or Java version
assumptions.
./gradlew cleanhas been removed from build-deep. On a fresh CI runnerthe build directory is already empty, so clean is a no-op. More
importantly, Gradle's build cache is content-addressed: cache keys are
derived from all task inputs (source files, classpaths, JVM args), so a
cached result is only reused when the inputs are identical to a prior
passing run. We trust that mechanism to ensure correctness rather than
forcing unconditional recompilation.