[CI/CD Assessment] CI/CD Pipelines and Integration Tests Gap Assessment #1819
Closed
Replies: 2 comments
This discussion was automatically closed because it expired on 2026-04-15T23:11:56.184Z.
🔮 The ancient spirits stir, and the smoke-test agent has walked this threshold.
📊 Current CI/CD Pipeline Status
The repository has a mature, multi-layered CI/CD pipeline with strong coverage across build verification, testing, security scanning, and AI-powered agentic workflows. All compiled workflows are in healthy status.
Pipeline Summary
- `build.yml`, `lint.yml`
- `test-integration.yml` (TypeScript type check)
- `test-coverage.yml`
- `test-integration-suite.yml` (5 jobs), `test-chroot.yml` (4 jobs)
- `test-examples.yml`
- `codeql.yml`, `dependency-audit.yml`, `security-guard.md` (AI)
- `link-check.yml`, `deploy-docs.yml`, `docs-preview.yml`
- `performance-monitor.yml`
- `pr-title.yml`
- `smoke-claude.md`, `smoke-copilot.md`, `smoke-codex.md`, `smoke-chroot.md`, `smoke-services.md`
- `build-test.md`
Recent run health: Build Verification, Integration Tests, and Test Coverage all show `success` conclusions across the last 10 observed runs (April 8, 2026).
✅ Existing Quality Gates
On Every PR
- `tsc --noEmit` strict type checking
- `/proc` filesystem, edge cases
- `security-extended`, `security-and-quality` queries
- `npm audit --audit-level=high` for main + docs-site packages, with SARIF upload to the Security tab
Scheduled / Async Quality
🔍 Identified Gaps
🔴 High Priority
1. Low Unit Test Coverage with Low Thresholds
The critical business logic files have very low unit test coverage:
- `docker-manager.ts`: 18% statements, 4% functions (250 statements, 25 functions mostly uncovered)
- `cli.ts`: 0% coverage (entry point entirely untested at unit level)
These are the most complex files, orchestrating the entire container lifecycle, yet they are almost entirely untested at the unit level. Integration tests exercise some of this code but do not count toward these metrics.
Risk: A regression in `docker-manager.ts` (e.g., wrong Docker Compose config generation, incorrect network topology) can pass all unit tests undetected.
2. No Container Security Scanning (Image Vulnerabilities)
There is no workflow that scans the built Docker images (`squid`, `agent`, `api-proxy`) for OS-level CVEs or package vulnerabilities (e.g., via Trivy, Grype, or Docker Scout). The `dependency-audit.yml` workflow only covers npm packages.
Risk: Base images (`ubuntu/squid:latest`, `ubuntu:22.04`, Node.js) may contain critical OS CVEs that go undetected before the images are pushed to GHCR.
3. Performance Benchmarks Not Run on PRs
The `performance-monitor.yml` workflow is weekly-only and never runs on pull requests. Container startup latency regressions introduced by a PR won't be caught until the next weekly run.
Risk: A change to Docker Compose generation, healthcheck configuration, or network setup could silently degrade startup performance by 2-3x before being detected.
4. Integration Test Suite Missing CI Workflow Coverage
Per `docs/INTEGRATION-TESTS.md`, the following integration test categories have no CI workflow running them on PRs:
- …: `test-chroot.yml` runs these in some cases
- …: `test-integration-suite.yml` covers some
A mapping of the documented categories to workflows suggests that some tests may fall through gaps between the `--testPathPatterns` filters in `test-integration-suite.yml`.
🟡 Medium Priority
5. No Required Status Checks Policy Documented
The workflows exist, but there is no documented or enforced branch protection configuration requiring specific checks to pass before merge. If branch protection is not configured with required checks, a PR with a failing `build.yml` run could still be merged.
6. Coverage Thresholds Are Below Meaningful Levels
Current thresholds (38%/30%/35%/38%) are set to match the current baseline rather than aspirational targets. The `test-coverage-improver` agentic workflow runs weekly to improve this, but there is no roadmap or enforced timeline for reaching a minimum acceptable coverage (typically 60-80% for production code).
7. No Mutation Testing
The test suite doesn't include mutation testing (e.g., Stryker Mutator for TypeScript). This means tests may achieve coverage percentages while not actually asserting on important logic variations.
8. Smoke Tests Are Not Required for PR Merge (Reaction-Triggered)
`smoke-claude.md`, `smoke-copilot.md`, and `smoke-codex.md` run on PR events, but are also reaction-gated (reaction: heart/eyes/hooray). If reactions are the primary trigger, these expensive end-to-end tests may not run on every PR, leaving a gap for regressions in agent-specific behavior.
9. No Docker Compose Schema Validation
`src/docker-manager.ts` generates Docker Compose YAML dynamically. There is no step that validates the generated `docker-compose.yml` against the Docker Compose schema (via `docker compose config` or JSON Schema). A malformed config would only be caught at runtime.
10. No Formatting Check (Prettier)
`build.yml` and `lint.yml` run ESLint, but there is no Prettier formatting check. Code style inconsistencies accumulate over time and create noisy diffs.
🟢 Low Priority
11. No SBOM Generation
No Software Bill of Materials (SBOM) is generated or attached to releases. For a security-critical tool like AWF, an SBOM would aid supply chain transparency.
12. Workflow Action Pinning Inconsistency
`performance-monitor.yml` uses `actions/checkout@v4` (floating tag), while all other workflows use pinned SHAs. This creates a minor supply chain risk for that specific workflow.
13. No Changelog Validation on PRs
While `pr-title.yml` enforces conventional commit format, there is no check that `CHANGELOG.md` or release notes are updated for feature/fix PRs.
14. Link Check Only Triggered on MD Changes
`link-check.yml` only runs when `**/*.md` files change. A refactoring that renames source files could silently break documentation links without triggering a link check.
15. No License Header / REUSE Compliance Check
There's no automated check that new source files include proper license headers, which can be important for an open-source project.
📋 Actionable Recommendations
HIGH: Unit Test Coverage for Core Logic
Issue: `docker-manager.ts` (18% statements) and `cli.ts` (0%) have very low unit test coverage.
Solution: Add unit tests using Jest mocks for `execa` (already a dependency) to test `generateDockerCompose()`, `startContainers()`, `stopContainers()`, and signal handling without needing Docker. The `test-coverage-improver` agentic workflow can be pointed specifically at these files.
Complexity: Medium (mocking `execa` and filesystem operations requires careful setup)
Impact: High (would catch config generation regressions, wrong network addresses, missing env vars)
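The mocking approach described here can be sketched without Docker. This is a minimal sketch under stated assumptions: `startContainers` and `stopContainers` below are hypothetical stand-ins that take an injected command runner (the real signatures in `docker-manager.ts` are not shown in this assessment), the compose arguments are illustrative, and a hand-rolled fake takes the place of a Jest mock of `execa` so the example runs standalone. It is synchronous for brevity, whereas `execa` is promise-based.

```typescript
// Hypothetical runner signature; in the real code this role is played by execa.
type RunCommand = (cmd: string, args: string[]) => { exitCode: number };

// Hypothetical stand-ins for the functions under test in docker-manager.ts.
function startContainers(run: RunCommand, composeFile: string): void {
  const result = run("docker", ["compose", "-f", composeFile, "up", "-d"]);
  if (result.exitCode !== 0) {
    throw new Error("docker compose up failed with exit code " + result.exitCode);
  }
}

function stopContainers(run: RunCommand, composeFile: string): void {
  // Teardown is best-effort in this sketch: a non-zero exit code is ignored.
  run("docker", ["compose", "-f", composeFile, "down", "-v"]);
}

// Fake runner that records every invocation instead of calling Docker.
function makeFakeRunner(exitCode: number) {
  const calls: string[] = [];
  const run: RunCommand = (cmd, args) => {
    calls.push([cmd].concat(args).join(" "));
    return { exitCode: exitCode };
  };
  return { run: run, calls: calls };
}

// Returns true when startContainers throws for the given exit code.
function startFailsWith(exitCode: number): boolean {
  try {
    startContainers(makeFakeRunner(exitCode).run, "docker-compose.yml");
    return false;
  } catch (err) {
    return true;
  }
}
```

In a Jest suite the same checks would be written as `expect(...)` assertions against the recorded calls, with the fake replaced by a mocked `execa` module.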
HIGH: Container Image CVE Scanning
Issue: No vulnerability scanning of built Docker images.
Solution: Add a `container-security-scan.yml` workflow that runs `trivy image` or `docker scout cves` on the images built from `containers/squid/`, `containers/agent/`, and `containers/api-proxy/`, and uploads SARIF to the Security tab. It can run on PRs and on a weekly schedule.
Complexity: Low
Impact: High — catches OS-level CVEs before GHCR publish
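If the workflow needs to post-process scan results (for example, to summarize blocking findings in a PR comment), the gating logic could look roughly like the sketch below. It assumes a report produced with `trivy image --format json` and reads only a small subset of that report's fields; the function name and the default severity list are hypothetical. Trivy can also gate directly through its own `--severity` and `--exit-code` flags, which may be simpler.

```typescript
// Minimal subset of the Trivy JSON report shape used by this sketch.
interface TrivyVulnerability {
  VulnerabilityID: string;
  PkgName: string;
  Severity: string;
}
interface TrivyResult {
  Target: string;
  Vulnerabilities?: TrivyVulnerability[];
}
interface TrivyReport {
  Results?: TrivyResult[];
}

// Returns one line per vulnerability whose severity is in `blocking`.
// An empty array means the image passes the gate.
function findBlockingVulnerabilities(
  reportJson: string,
  blocking: string[] = ["HIGH", "CRITICAL"]
): string[] {
  const report = JSON.parse(reportJson) as TrivyReport;
  const findings: string[] = [];
  for (const result of report.Results || []) {
    for (const vuln of result.Vulnerabilities || []) {
      if (blocking.indexOf(vuln.Severity) !== -1) {
        findings.push(
          result.Target + ": " + vuln.VulnerabilityID +
          " (" + vuln.PkgName + ", " + vuln.Severity + ")"
        );
      }
    }
  }
  return findings;
}
```

A CI step could print the returned lines and exit non-zero when the array is non-empty.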
HIGH: Add PRs to Performance Monitor Trigger
Issue: Performance regressions only caught weekly.
Solution: Add a `pull_request` trigger to `performance-monitor.yml` with a reduced iteration count (e.g., 2 vs 5), posting the benchmark delta as a PR comment. Use `continue-on-error: true` to avoid blocking merges on transient performance variability.
Complexity: Low
Impact: High — immediate feedback on latency regressions
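The benchmark delta for the PR comment could be computed along these lines. This is a sketch, not the existing benchmark code: it assumes startup latency samples in milliseconds (five from the weekly baseline, two from the PR run, matching the iteration counts suggested above), compares medians to dampen outliers, and uses a hypothetical tolerance of 1.25x before flagging a regression.

```typescript
// Median of a non-empty list of samples.
function median(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("median requires at least one sample");
  }
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

interface BenchmarkDelta {
  baselineMs: number;
  candidateMs: number;
  ratio: number;
  regressed: boolean;
}

// Compares median startup latency of the PR (candidate) against the baseline.
// maxRatio is a hypothetical tolerance; 1.25 means "up to 25% slower is accepted".
function compareStartup(
  baselineSamplesMs: number[],
  candidateSamplesMs: number[],
  maxRatio: number = 1.25
): BenchmarkDelta {
  const baselineMs = median(baselineSamplesMs);
  const candidateMs = median(candidateSamplesMs);
  if (baselineMs <= 0) {
    throw new Error("baseline median must be positive");
  }
  const ratio = candidateMs / baselineMs;
  return { baselineMs: baselineMs, candidateMs: candidateMs, ratio: ratio, regressed: ratio > maxRatio };
}

// One-line summary suitable for a PR comment.
function formatDelta(delta: BenchmarkDelta): string {
  const percent = ((delta.ratio - 1) * 100).toFixed(1);
  const verdict = delta.regressed ? "possible regression" : "within tolerance";
  return "Startup median: " + delta.baselineMs + " ms -> " + delta.candidateMs +
    " ms (" + percent + "% change, " + verdict + ")";
}
```

With only two PR iterations the median is just the mean of the two samples, so the result is noisy; that is one reason to keep the check non-blocking via `continue-on-error: true`.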
MEDIUM: Raise Coverage Thresholds Incrementally
Issue: 38%/30% thresholds are too low to catch real regressions.
Solution: Increase thresholds by 5% per quarter via the `test-coverage-improver` workflow. Set a roadmap target of 60% statements / 50% branches as a 6-month goal, enforced in `jest.config.js`.
Complexity: Low (config change + test writing)
Impact: Medium — raises confidence that new tests actually cover production paths
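A quick way to sanity-check the ratchet is to compute the schedule. The sketch below assumes that "5%" means five percentage points per quarter and that the 38% and 30% figures are the statement and branch thresholds respectively (the assessment lists the four values without labels); both helper names are hypothetical. Under those assumptions, two quarters of ratcheting reach 48% statements / 40% branches, so the 60% / 50% target would need either larger steps or a horizon of about five quarters.

```typescript
interface Thresholds {
  statements: number;
  branches: number;
}

// Threshold values at the end of each quarter, capped at the target.
function ratchetSchedule(
  current: Thresholds,
  target: Thresholds,
  stepPoints: number,
  quarters: number
): Thresholds[] {
  const schedule: Thresholds[] = [];
  let statements = current.statements;
  let branches = current.branches;
  for (let q = 0; q < quarters; q++) {
    statements = Math.min(statements + stepPoints, target.statements);
    branches = Math.min(branches + stepPoints, target.branches);
    schedule.push({ statements: statements, branches: branches });
  }
  return schedule;
}

// Number of quarterly steps needed to move one metric from current to target.
function quartersToReach(current: number, target: number, stepPoints: number): number {
  if (stepPoints <= 0) {
    throw new Error("stepPoints must be positive");
  }
  return Math.max(0, Math.ceil((target - current) / stepPoints));
}
```

Each entry of the schedule could be copied into the `coverageThreshold` setting of `jest.config.js` at the start of the corresponding quarter.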
MEDIUM: Docker Compose Config Validation
Issue: The generated `docker-compose.yml` is never validated before container start.
Solution: Add a test to the unit test suite that calls `generateDockerCompose()` and pipes the output through `docker compose config --quiet` to validate it against the schema. This can run in the existing `test-coverage.yml` job.
Complexity: Low
Impact: Medium — catches malformed YAML and invalid service configurations early
MEDIUM: Pin Actions in performance-monitor.yml
Issue: The floating tag `actions/checkout@v4` creates supply chain risk.
Solution: Pin to SHA: `actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd` (same as all other workflows).
Complexity: Very Low
Impact: Low-Medium — consistency in security posture
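To keep pinning consistent after this one-off fix, a small check could flag any `uses:` reference that is not pinned to a full commit SHA. The following is a line-based sketch rather than a YAML parser, and the function name is hypothetical; it treats a 40-character lowercase hex ref as pinned and skips local (`./`) and `docker://` references, which have no git ref to pin.

```typescript
interface UnpinnedUse {
  line: number;      // 1-based line number in the workflow file
  reference: string; // e.g. "actions/checkout@v4"
}

// Scans workflow text for `uses:` entries whose ref is not a 40-char commit SHA.
function findUnpinnedActions(workflowYaml: string): UnpinnedUse[] {
  const unpinned: UnpinnedUse[] = [];
  const lines = workflowYaml.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const match = /^\s*(?:-\s*)?uses:\s*["']?([^\s"'#]+)/.exec(lines[i]);
    if (!match) {
      continue;
    }
    const reference = match[1];
    // Local actions and container images are out of scope for SHA pinning here.
    if (/^\.\.?\//.test(reference) || /^docker:\/\//.test(reference)) {
      continue;
    }
    const at = reference.lastIndexOf("@");
    const ref = at === -1 ? "" : reference.slice(at + 1);
    if (!/^[0-9a-f]{40}$/.test(ref)) {
      unpinned.push({ line: i + 1, reference: reference });
    }
  }
  return unpinned;
}
```

Run against each file in `.github/workflows/`, an empty result would mean every external action is SHA-pinned; for `performance-monitor.yml` as described above it would report the `actions/checkout@v4` line.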
LOW: Add SBOM to Release Workflow
Issue: No SBOM attached to releases.
Solution: Add `anchore/sbom-action` to `release.yml` to generate and attach an SBOM artifact to each GitHub Release.
Complexity: Low
Impact: Low (supply chain transparency)
LOW: Broader Link Check Trigger
Issue: Link check skips non-MD PRs that may rename referenced files.
Solution: Change `link-check.yml` to also trigger on `push` to `main` (post-merge check) so broken links are always caught.
Complexity: Very Low
Impact: Low
📈 Metrics Summary
- … (… `.yml` + 27 `.md` agentic)
- … (`security-guard`, `build-test`)
- … (… `cli.ts`)
- … (… `docker-manager.ts` at 18%)
Assessment generated by AI agent on 2026-04-08. Based on analysis of `.github/workflows/` configuration files, `COVERAGE_SUMMARY.md`, `docs/INTEGRATION-TESTS.md`, and recent GitHub Actions workflow run history.