Skip to content

feat(ci): restructure build-and-test workflow to reduce CI latency#303

Closed
rstata wants to merge 5 commits intoairbnb:mainfrom
rstata:consolidate-tests
Closed

feat(ci): restructure build-and-test workflow to reduce CI latency#303
rstata wants to merge 5 commits intoairbnb:mainfrom
rstata:consolidate-tests

Conversation

@rstata
Copy link
Copy Markdown
Collaborator

@rstata rstata commented Mar 11, 2026

The previous workflow had a build matrix followed by a test matrix, with
tests unable to start until every build coordinate finished (GitHub Actions
needs: is all-or-nothing across a matrix). Linting and coverage
verification were also sequenced behind the full test matrix.

The new structure shortens the critical path by:

  • validate-inputs: validates workflow dispatch inputs, runs the Python CI
    script unit tests, and computes the deep coordinate and wide matrix for
    downstream jobs. CI script tests run here because validate-inputs
    executes those scripts directly — a script regression would produce
    untrustworthy outputs, so testing them first halts everything before any
    downstream job acts on bad data.

  • build-deep: a single coordinate (first OS × first Java from the input
    lists, defaulting to ubuntu-latest × 17). Runs assembly and uploads a
    build artifact for downstream jobs to consume.

  • test-deep: same single coordinate, depends on build-deep. Runs tests,
    coverage report generation, and coverage threshold verification via the
    new testWithCoverage Gradle task.

  • docs-verify: same single coordinate, depends on build-deep, runs in
    parallel with test-deep. Generates Dokka API docs, builds the mkdocs
    site, and checks for broken links with djlint.

  • test-wide: the remaining OS × Java coordinates. Each wide coordinate
    runs ./gradlew test and depends only on validate-inputs, so all wide
    jobs start immediately — they don't wait for build-deep.

  • detekt / ktlint: depend only on validate-inputs (source checkout is
    sufficient), so they also run in parallel with build-deep.

  • coverage-verification is eliminated as a separate job; coverage report
    generation and threshold checking are folded into test-deep via the new
    testWithCoverage Gradle task (test + testCodeCoverageReport +
    testCodeCoverageVerification).

validate_inputs.py computes and exposes the deep coordinate (coverage_os,
coverage_java) and the wide matrix (wide_matrix JSON array, has_wide flag)
as GITHUB_OUTPUT values, so the workflow never hardcodes OS or Java version
assumptions.

./gradlew clean has been removed from build-deep. On a fresh CI runner
the build directory is already empty, so clean is a no-op. More
importantly, Gradle's build cache is content-addressed: cache keys are
derived from all task inputs (source files, classpaths, JVM args), so a
cached result is only reused when the inputs are identical to a prior
passing run. We trust that mechanism to ensure correctness rather than
forcing unconditional recompilation.

@rstata
Copy link
Copy Markdown
Collaborator Author

rstata commented Mar 11, 2026

Just pushed a fix: GRADLE_OPTS_EXTRA was silently ignored by gradlew; renamed to GRADLE_OPTS throughout.

@rstata rstata force-pushed the consolidate-tests branch from 6939b80 to b9c2785 Compare March 11, 2026 23:26
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6939b80d02

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

]
has_wide = "true" if wide_combinations else "false"
with open(github_output, "a") as f:
f.write(f"coverage_os={coverage_os}\n")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Escape workflow outputs before writing user-provided strings

write_outputs emits coverage_os/coverage_java directly into $GITHUB_OUTPUT using raw key=value lines, but those values come from workflow inputs and are only type-validated as strings/ints. A JSON string containing a newline (for example via OS_INPUT) will be parsed successfully and then injected as extra output records, which can overwrite has_wide/wide_matrix and cause downstream jobs to run the wrong matrix or skip checks. Please reject control characters or use multiline-safe output encoding before writing these values.

Useful? React with 👍 / 👎.

The previous workflow had a build matrix followed by a test matrix, with
tests unable to start until every build coordinate finished (GitHub Actions
`needs:` is all-or-nothing across a matrix).  Linting and coverage
verification were also sequenced behind the full test matrix.

The new structure shortens the critical path by:

- validate-inputs: validates workflow dispatch inputs, runs the Python CI
  script unit tests, and computes the deep coordinate and wide matrix for
  downstream jobs.  CI script tests run here because validate-inputs
  executes those scripts directly — a script regression would produce
  untrustworthy outputs, so testing them first halts everything before any
  downstream job acts on bad data.

- build-deep: a single coordinate (first OS × first Java from the input
  lists, defaulting to ubuntu-latest × 17).  Runs assembly and uploads a
  build artifact for downstream jobs to consume.

- test-deep: same single coordinate, depends on build-deep.  Runs tests,
  coverage report generation, and coverage threshold verification via the
  new `testWithCoverage` Gradle task.

- docs-verify: same single coordinate, depends on build-deep, runs in
  parallel with test-deep.  Generates Dokka API docs, builds the mkdocs
  site, and checks for broken links with djlint.

- test-wide: the remaining OS × Java coordinates.  Each wide coordinate
  runs `./gradlew test` and depends only on validate-inputs, so all wide
  jobs start immediately — they don't wait for build-deep.

- detekt / ktlint: depend only on validate-inputs (source checkout is
  sufficient), so they also run in parallel with build-deep.

- coverage-verification is eliminated as a separate job; coverage report
  generation and threshold checking are folded into test-deep via the new
  `testWithCoverage` Gradle task (test + testCodeCoverageReport +
  testCodeCoverageVerification).

validate_inputs.py computes and exposes the deep coordinate (coverage_os,
coverage_java) and the wide matrix (wide_matrix JSON array, has_wide flag)
as GITHUB_OUTPUT values, so the workflow never hardcodes OS or Java version
assumptions.

`./gradlew clean` has been removed from build-deep.  On a fresh CI runner
the build directory is already empty, so clean is a no-op.  More
importantly, Gradle's build cache is content-addressed: cache keys are
derived from all task inputs (source files, classpaths, JVM args), so a
cached result is only reused when the inputs are identical to a prior
passing run.  We trust that mechanism to ensure correctness rather than
forcing unconditional recompilation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rstata rstata force-pushed the consolidate-tests branch from b9c2785 to 3cbacbd Compare March 11, 2026 23:30
GRADLE_OPTS_EXTRA is not a variable that gradlew recognises; it was
silently ignored, meaning the intended JVM options
(-Dorg.gradle.parallel=false, -Dorg.gradle.caching=true,
-Dorg.gradle.daemon=false) were never actually applied to any Gradle
invocation.  GRADLE_OPTS is the variable the Gradle wrapper reads.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rstata rstata force-pushed the consolidate-tests branch from 3cbacbd to 849a264 Compare March 11, 2026 23:32
Raymie Stata and others added 2 commits March 11, 2026 23:35
Two changes:

1. Lift GRADLE_OPTS, BUILDSCAN_PUBLISH, and BUILDSCAN_AUTOACCEPTTERMS to
   a workflow-level env block.  These three variables were copied verbatim
   into every Gradle job; the workflow-level block makes them implicit for
   all jobs without repetition.

2. Replace the detekt and ktlint matrix strategies with direct references
   to the coverage_os and coverage_java outputs from validate-inputs.  The
   matrix expressions (fromJSON(format(...))) were computing exactly the
   same value — the first element of each input list — that validate-inputs
   already exposes.  Removing the matrix also simplifies the job names and
   artifact names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Setting no_cache=true on a manual workflow_dispatch disables the Gradle
build cache for the entire run, forcing every task to execute from scratch
rather than restoring results from a prior run.

Implemented by making the -Dorg.gradle.caching flag in the workflow-level
GRADLE_OPTS conditional: !inputs.no_cache evaluates to true normally
(caching on) and to false when no_cache is checked (caching off).  All
./gradlew invocations inherit this without any per-step changes.  On
push and pull_request triggers inputs.no_cache is unset, which GitHub
Actions treats as falsy, so caching remains on by default.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rstata rstata force-pushed the consolidate-tests branch from 03b81cf to 06d1d8d Compare March 11, 2026 23:53
…orkflow

Renames ci-manual-trigger.yml to release-full-test.yml, a purpose-built
workflow for release validation that always runs with the Gradle build
cache disabled (use_gradle_cache=false), forcing every task to execute
from scratch across all three sub-workflows (build-and-test,
standalone-demoapp-tests, bcv_api_check). Updates RELEASE-RUNBOOK.md
to reference the new workflow name.

All three sub-workflows gain a use_gradle_cache boolean input (default
true, preserving normal CI behavior). release-full-test passes false
to each of them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rstata rstata force-pushed the consolidate-tests branch from 06d1d8d to 4413589 Compare March 11, 2026 23:56
@rstata rstata closed this Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant