Skip to content
Open
41 changes: 41 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -314,3 +314,44 @@ You can inject CodeQL into the build tool as such as custom command; for compile
It will be necessary to keep track of the CodeQL category assigned to each build target, to avoid clashes.

See the GitHub documentation on [Using code scanning with your existing CI system](https://docs.github.com/en/enterprise-cloud@latest/code-security/code-scanning/integrating-with-code-scanning/using-code-scanning-with-your-existing-ci-system)

### Alternative SARIF Republishing: `republish-filtered-sarif` Action

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action name in the heading is inconsistent with the directory name. The directory is 'republish-filtered-sarifs' (plural) but the heading uses 'republish-filtered-sarif' (singular). This should match the actual directory name for consistency.

Copilot uses AI. Check for mistakes.

A new GitHub Action, `republish-filtered-sarif`, is now available to streamline Code Scanning results presentation on Pull Requests within a monorepo.

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action name should be 'republish-filtered-sarifs' (plural) to match the directory name.

Copilot uses AI. Check for mistakes.

**Problem Addressed:**
In monorepo PR workflows, Code Scanning checks for unscanned projects may appear incomplete. While the existing `republish-sarif` action addresses this, it often requires maintaining a `projects.json` file to define all projects for republishing. This can be an overhead for users.

**Solution (`republish-filtered-sarif` Action):**

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action name in parentheses should be 'republish-filtered-sarifs' (plural) to match the directory name.

Copilot uses AI. Check for mistakes.
This composite action provides a quick, easy, and **`projects.json`-agnostic** way to ensure a complete Code Scanning overview on PRs. It works by:

1. **Dynamically Discovering Analyses:** It queries the GitHub Code Scanning API to find all recent CodeQL analyses published to the `main` branch.
2. **Intelligent Exclusion:** It takes an `excluded-category` input, which should be the exact category string used in the CodeQL `analyze` step for the project currently being scanned in the PR. This allows the action to *exclude* that specific project's SARIF from being downloaded and re-uploaded.
3. **Latest SARIF Selection:** For all *other* projects/categories found on the `main` branch, it selects and downloads only the *most recent* SARIF.
Comment on lines +328 to +330

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation states analyses are published to the 'main' branch, but the code actually queries the repository's default branch dynamically (which may not be 'main'). Consider updating the documentation to say 'default branch' instead of specifically 'main' to be more accurate.

Suggested change
1. **Dynamically Discovering Analyses:** It queries the GitHub Code Scanning API to find all recent CodeQL analyses published to the `main` branch.
2. **Intelligent Exclusion:** It takes an `excluded-category` input, which should be the exact category string used in the CodeQL `analyze` step for the project currently being scanned in the PR. This allows the action to *exclude* that specific project's SARIF from being downloaded and re-uploaded.
3. **Latest SARIF Selection:** For all *other* projects/categories found on the `main` branch, it selects and downloads only the *most recent* SARIF.
1. **Dynamically Discovering Analyses:** It queries the GitHub Code Scanning API to find all recent CodeQL analyses published to the repository's default branch.
2. **Intelligent Exclusion:** It takes an `excluded-category` input, which should be the exact category string used in the CodeQL `analyze` step for the project currently being scanned in the PR. This allows the action to *exclude* that specific project's SARIF from being downloaded and re-uploaded.
3. **Latest SARIF Selection:** For all *other* projects/categories found on the repository's default branch, it selects and downloads only the *most recent* SARIF.

Copilot uses AI. Check for mistakes.
Comment on lines +328 to +330

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to line 328, this should say 'default branch' instead of 'main branch' to accurately reflect the implementation.

Suggested change
1. **Dynamically Discovering Analyses:** It queries the GitHub Code Scanning API to find all recent CodeQL analyses published to the `main` branch.
2. **Intelligent Exclusion:** It takes an `excluded-category` input, which should be the exact category string used in the CodeQL `analyze` step for the project currently being scanned in the PR. This allows the action to *exclude* that specific project's SARIF from being downloaded and re-uploaded.
3. **Latest SARIF Selection:** For all *other* projects/categories found on the `main` branch, it selects and downloads only the *most recent* SARIF.
1. **Dynamically Discovering Analyses:** It queries the GitHub Code Scanning API to find all recent CodeQL analyses published to the `default` branch.
2. **Intelligent Exclusion:** It takes an `excluded-category` input, which should be the exact category string used in the CodeQL `analyze` step for the project currently being scanned in the PR. This allows the action to *exclude* that specific project's SARIF from being downloaded and re-uploaded.
3. **Latest SARIF Selection:** For all *other* projects/categories found on the `default` branch, it selects and downloads only the *most recent* SARIF.

Copilot uses AI. Check for mistakes.
4. **Direct API Republishing:** These filtered SARIFs are then directly uploaded to the current Pull Request's commit SHA via the GitHub Code Scanning API.

**Key Benefit:**
This action **simplifies the republishing process by removing the need for a `projects.json` file** for this step. Users provide the CodeQL `category` value of the currently scanned project, and the action automatically handles the rest, offering a streamlined, category-based approach for comprehensive PR security insights.

**Example Usage:**

```yaml
# In your monorepo workflow (e.g., for a 'backend' project)

jobs:
analyze:
steps:
# ... (checkout, setup, CodeQL init, build steps) ...

- name: Perform CodeQL Analysis (Backend)
uses: github/codeql-action/analyze@v3
with:
category: 'backend' # <--- This unique category string is key!

# ... (other steps, like dependency submission) ...

- name: Republish other SARIFs for PR
if: github.event_name == 'pull_request'
uses: your-org/your-repo/.github/actions/republish-filtered-sarif@main # Update this path

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action path should be 'republish-filtered-sarifs' (plural) to match the directory name.

Suggested change
uses: your-org/your-repo/.github/actions/republish-filtered-sarif@main # Update this path
uses: your-org/your-repo/.github/actions/republish-filtered-sarifs@main # Update this path

Copilot uses AI. Check for mistakes.
with:
excluded-category: 'backend' # Pass the category of the project just scanned
111 changes: 111 additions & 0 deletions republish-filtered-sarifs/action.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
# .github/actions/republish-sarifs/action.yml

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment path '.github/actions/republish-sarifs/action.yml' does not match the actual directory name 'republish-filtered-sarifs' used in the PR. This should be updated to reflect the correct path.

Suggested change
# .github/actions/republish-sarifs/action.yml
# .github/actions/republish-filtered-sarifs/action.yml

Copilot uses AI. Check for mistakes.

name: Download & Republish Filtered SARIFs
description: 'Downloads Code Scanning SARIF files, excludes a specified category, uploads them as an artifact, and then republishes the remaining to the PR/target branch.'
inputs:
excluded-category:
description: 'The single CodeQL category string to exclude from the download and republish (e.g., /language:java;project:backend-service).'
required: true
type: string

runs:
using: 'composite'
steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Run SARIF Download Script
uses: actions/github-script@v7
env:
EXCLUDED_CATEGORY_INPUT: ${{ inputs.excluded-category }}
with:
script: |
const scriptPath = `${process.env.GITHUB_ACTION_PATH}/download_filtered_sarifs.js`;
const script = require(scriptPath);
script(github, context, core);
github-token: ${{ github.token }}

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action uses 'github.token' without explicitly declaring it as an input. While this works because it uses the automatic GitHub token from the workflow context, it would be more explicit and allow users to pass a custom token if needed by declaring it as an input with a default value of '${{ github.token }}'.

Copilot uses AI. Check for mistakes.

- name: Upload Downloaded SARIFs as Artifact
uses: actions/upload-artifact@v4
with:
name: filtered-sarif-downloads
path: sarif_downloads/
if-no-files-found: ignore

# Scenario 1: PR mode - upload SARIF files via GitHub API to the PR itself
- name: Upload Filtered SARIF files to PR Code Scanning
if: github.event_name == 'pull_request' && github.event.pull_request.merged != true && hashFiles('sarif_downloads/*.sarif') != ''
shell: bash
env:
GH_TOKEN: ${{ github.token }}
run: |
set -x # Keep debugging enabled for now, can remove after successful run

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'set -x' debug mode is left enabled with a comment indicating it can be removed later. Consider removing this before merging to production to avoid cluttering logs with excessive debug output, or make it conditional on a debug input parameter.

Copilot uses AI. Check for mistakes.

echo "Uploading filtered SARIF files to PR via GitHub Code Scanning API"

SARIF_COUNT=$(find sarif_downloads -name "*.sarif" | wc -l)
echo "Found $SARIF_COUNT SARIF files to upload"

REPO_OWNER="${GITHUB_REPOSITORY%/*}"
REPO_NAME="${GITHUB_REPOSITORY#*/}"

for SARIF_FILE in sarif_downloads/*.sarif; do

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop uses a glob pattern 'sarif_downloads/*.sarif' without proper quoting around the variable expansion. If the SARIF_FILE variable contains spaces or special characters in its path, the script could fail or behave unexpectedly. While the filename generation in the JavaScript sanitizes the names, it's a best practice to quote variable expansions in shell scripts to prevent word splitting and globbing issues.

Copilot uses AI. Check for mistakes.
echo "Processing $SARIF_FILE"

# 1. Gzip and base64 encode the SARIF file content into a temporary file.
# This is the safest way to handle potentially very large data.
TEMP_BASE64_FILE=$(mktemp)
gzip -c "$SARIF_FILE" | base64 -w0 > "$TEMP_BASE64_FILE"

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base64 encoding uses the '-w0' flag which is GNU coreutils specific and may not work on macOS or other Unix systems that use BSD coreutils. If this action is intended to run on self-hosted runners that might use different operating systems, consider using a more portable approach or documenting that this action requires GNU coreutils.

Suggested change
gzip -c "$SARIF_FILE" | base64 -w0 > "$TEMP_BASE64_FILE"
gzip -c "$SARIF_FILE" | base64 | tr -d '\n' > "$TEMP_BASE64_FILE"

Copilot uses AI. Check for mistakes.

# 2. Reconstruct the category name from the filename (for logging only, not payload)
CATEGORY_NAME=$(basename "$SARIF_FILE" .sarif | sed 's/_/\//g' | sed 's/\.yml:/yml:/g' | sed 's/^analyze\//\/analyze\//g')

echo "Uploading $(basename "$SARIF_FILE") with category '$CATEGORY_NAME' to PR"
Comment on lines +60 to +63

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The category name reconstruction logic from the filename is fragile and likely incorrect. The script replaces underscores with slashes and attempts to reconstruct the original category, but this is the reverse of the sanitization done in the JavaScript file (line 115) which replaces all non-alphanumeric characters with underscores. This reconstruction will not work correctly for categories that originally contained underscores, and the sed patterns for '.yml:' and 'analyze/' appear to be specific to certain category formats that may not be universal. Since this is only used for logging, consider simplifying or documenting that this is a best-effort reconstruction.

Suggested change
# 2. Reconstruct the category name from the filename (for logging only, not payload)
CATEGORY_NAME=$(basename "$SARIF_FILE" .sarif | sed 's/_/\//g' | sed 's/\.yml:/yml:/g' | sed 's/^analyze\//\/analyze\//g')
echo "Uploading $(basename "$SARIF_FILE") with category '$CATEGORY_NAME' to PR"
# 2. Derive a best-effort category name from the filename for logging only.
# The original category string cannot be reliably reconstructed because the
# sanitization step replaces all non-alphanumeric characters with underscores.
# Therefore we just use the sanitized basename here for informational logs.
CATEGORY_NAME=$(basename "$SARIF_FILE" .sarif)
echo "Uploading $(basename "$SARIF_FILE") with sanitized category name '$CATEGORY_NAME' to PR"

Copilot uses AI. Check for mistakes.

if [ ! -s "$TEMP_BASE64_FILE" ]; then # -s checks if file exists and is not empty
echo "Error: SARIF content for $SARIF_FILE is empty after base64 encoding or file creation failed."
rm -f "$TEMP_BASE64_FILE"
exit 1
fi

# 3. Construct the full JSON payload using jq
# We use `--rawfile sarif_data "$TEMP_BASE64_FILE"` to read the file content
# directly into the 'sarif_data' variable as a raw string.
JSON_PAYLOAD=$(jq -n \
--arg ref_val "refs/pull/${{ github.event.pull_request.number }}/merge" \
--arg commit_val "${{ github.sha }}" \
--rawfile sarif_data "$TEMP_BASE64_FILE" \
'{sarif: $sarif_data, ref: $ref_val, commit_sha: $commit_val}')

# Clean up the temporary file immediately after jq has used it
rm -f "$TEMP_BASE64_FILE"

echo "DEBUG: JSON_PAYLOAD (sarif content redacted):"
# Pretty-print the JSON_PAYLOAD but replace the 'sarif' field's value with "REDACTED"
echo "$JSON_PAYLOAD" | jq 'if .sarif then .sarif = "REDACTED" else . end'

if [ -z "$JSON_PAYLOAD" ]; then
echo "Error: JSON_PAYLOAD is empty. jq command failed to produce output."
exit 1
fi

# 4. Pipe the JSON payload to gh api --input -
printf "%s" "$JSON_PAYLOAD" | gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
/repos/$REPO_OWNER/$REPO_NAME/code-scanning/sarifs \
--input - \
--jq '.id'

if [ $? -eq 0 ]; then
echo "✓ Successfully uploaded $(basename "$SARIF_FILE")"
else
echo "✗ Failed to upload $(basename "$SARIF_FILE")"
exit 1
fi

sleep 1

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a sleep of 1 second between uploads could significantly slow down the action when there are many SARIF files to upload. Consider documenting why this sleep is necessary (e.g., API rate limiting) or make it configurable. If it's to avoid rate limits, consider implementing exponential backoff only on rate limit errors instead of a blanket delay.

Copilot uses AI. Check for mistakes.
done
set +x

156 changes: 156 additions & 0 deletions republish-filtered-sarifs/download_filtered_sarifs.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,156 @@
// download_filtered_sarifs.js

const fs = require('node:fs');
const path = require('node:path');

async function run(github, context, core) {
const repo = context.repo;

// --- Input: Get the single excluded category from an environment variable ---
// This value will be passed from your GitHub Actions workflow step (see workflow example below).
const EXCLUDED_CATEGORY = process.env.EXCLUDED_CATEGORY_INPUT;

if (!EXCLUDED_CATEGORY || EXCLUDED_CATEGORY.trim() === '') {
core.setFailed("Error: 'EXCLUDED_CATEGORY_INPUT' environment variable is required and cannot be empty.");
return;
}

// --- Dynamically determine the default branch for downloading analyses ---
let targetRef;
try {
const { data: repoData } = await github.rest.repos.get({
owner: repo.owner,
repo: repo.repo,
});
targetRef = `refs/heads/${repoData.default_branch}`;
core.info(`Dynamically determined default branch: '${repoData.default_branch}'. Will download analyses from '${targetRef}'.`);
} catch (error) {
core.setFailed(`Failed to get default branch for ${repo.owner}/${repo.repo}: ${error.message}`);
return;
}

core.info(`EXCLUDING SARIFs with category: '${EXCLUDED_CATEGORY}'`);


const DOWNLOAD_DIR = 'sarif_downloads';
if (!fs.existsSync(DOWNLOAD_DIR)) {
fs.mkdirSync(DOWNLOAD_DIR, { recursive: true }); // Ensure parent directories are created
core.info(`Created directory: ${DOWNLOAD_DIR}`);
} else {
core.info(`Directory already exists: ${DOWNLOAD_DIR}`);
}

let allAnalyses = [];
let page = 1;
let hasNextPage = true;

// --- 1. Fetch all recent Code Scanning analyses for the targetRef ---
try {
while (hasNextPage) {
const response = await github.rest.codeScanning.listRecentAnalyses({
owner: repo.owner,
repo: repo.repo,
ref: targetRef,
per_page: 100, // Fetch 100 analyses per page
page: page,
tool_name: 'CodeQL' // Optionally filter by tool if you only want CodeQL SARIFs
});

allAnalyses = allAnalyses.concat(response.data);

// Check if there are more pages to fetch
if (response.data.length < 100) {
hasNextPage = false;
} else {
page++;
}
}
core.info(`Found ${allAnalyses.length} total recent analyses for ref: '${targetRef}'`);
} catch (error) {
core.setFailed(`Failed to list recent analyses for '${targetRef}': ${error.message}`);
// Provide more detail for common errors like 404 (no analyses found)
if (error.status === 404) {
core.warning(`No CodeQL analyses found for ref: '${targetRef}'. Ensure analyses exist for this branch.`);
}
return;
}

// --- 2. Filter out analyses based on the single EXCLUDED_CATEGORY ---
const analysesToDownload = [];
const categoriesSeen = new Set(); // To ensure we only download the most recent analysis for each unique category

// Sort by created_at (most recent first) to ensure we get the latest if multiple analyses exist for a category
allAnalyses.sort((a, b) => new Date(b.created_at).getTime() - new Date(a.created_at).getTime());

for (const analysis of allAnalyses) {
// Check if the category matches our single excluded category
if (analysis.category === EXCLUDED_CATEGORY) {
core.info(`Skipping analysis for excluded category: '${analysis.category}' (ID: ${analysis.id})`);
continue; // Skip to the next analysis if it's the excluded one
}

// Ensure we only download the most recent analysis for each unique category found after exclusion
if (!categoriesSeen.has(analysis.category)) {
categoriesSeen.add(analysis.category);
analysesToDownload.push(analysis);
} else {
core.debug(`Skipping older analysis for category '${analysis.category}' (ID: ${analysis.id})`);
}
}

if (analysesToDownload.length === 0) {
core.info("No analyses found to download after filtering. Exiting.");
return;
}

core.info(`Attempting to download ${analysesToDownload.length} SARIF files after filtering.`);

// --- 3. Download the filtered SARIF files ---
// Use Promise.all to download files concurrently
await Promise.all(analysesToDownload.map(async (analysis) => {
try {
const sarifId = analysis.id;
const category = analysis.category;
// Generate a clean filename from the category
const fileName = category.replace(/[^a-z0-9_]/gi, '_').toLowerCase() + '.sarif';
Comment on lines +114 to +115

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filename sanitization using regex replacement could lead to collisions. Categories with different characters that get replaced with underscores could produce the same filename. For example, 'backend-api' and 'backend/api' would both become 'backend_api.sarif'. Consider using a more robust approach such as encoding the category name or including the analysis ID in the filename to ensure uniqueness.

Suggested change
// Generate a clean filename from the category
const fileName = category.replace(/[^a-z0-9_]/gi, '_').toLowerCase() + '.sarif';
// Generate a clean, mostly human-readable filename from the category and ensure uniqueness with the analysis ID
const sanitizedCategory = category.replace(/[^a-z0-9_]/gi, '_').toLowerCase();
const fileName = `${sanitizedCategory}_${sarifId}.sarif`;

Copilot uses AI. Check for mistakes.
const filePath = path.join(DOWNLOAD_DIR, fileName);

core.info(`Downloading SARIF for category '${category}' (ID: ${sarifId})`);

// Make a direct API call to get the SARIF content
const sarifResponse = await github.rest.codeScanning.getAnalysis({
owner: repo.owner,
repo: repo.repo,
analysis_id: sarifId,
headers: {
Accept: "application/sarif+json", // Crucial: Request SARIF JSON directly
},
});

const sarifContent = sarifResponse.data;

// Basic validation for received content
if (!sarifContent || (typeof sarifContent === 'object' && Object.keys(sarifContent).length === 0) || (typeof sarifContent === 'string' && sarifContent.trim().length === 0)) {
Comment on lines +132 to +133

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation check is overly complex. The condition checks for empty objects, empty strings, and falsy values, but if the API returns empty content, it's likely always going to be the same type. Consider simplifying this check or documenting what specific scenarios each part of the condition handles.

Suggested change
// Basic validation for received content
if (!sarifContent || (typeof sarifContent === 'object' && Object.keys(sarifContent).length === 0) || (typeof sarifContent === 'string' && sarifContent.trim().length === 0)) {
// Basic validation for received content.
// GitHub's code scanning SARIF API returns JSON objects when requesting "application/sarif+json".
// Treat null/undefined or an object with no own properties as empty.
if (!sarifContent || (typeof sarifContent === 'object' && Object.keys(sarifContent).length === 0)) {

Copilot uses AI. Check for mistakes.
core.warning(`SARIF content received is empty or malformed for analysis ID ${analysis.id} (category: '${category}'). Skipping write.`);
core.debug(`Received SARIF content (raw): ${sarifContent}`);
return;
}

// Write the SARIF content to a file
fs.writeFileSync(filePath, JSON.stringify(sarifContent, null, 2)); // Pretty print JSON for readability
core.info(`Successfully downloaded SARIF for '${category}' to '${filePath}'`);

} catch (error) {
core.error(`Failed to download SARIF for analysis ID ${analysis.id} (category: '${analysis.category}'): ${error.message}`);
// Log the error but don't fail the whole job so other downloads can proceed
// If you want the job to fail on any download error, you'd re-throw or setFailed here.
}
}));

core.info("All filtered SARIF files processed and downloaded.");
}

// This is the entry point for actions/github-script
module.exports = (github, context, core) => {
run(github, context, core).then(() => {});

Copilot AI Dec 19, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The promise returned by run() is not being handled for errors. If the run() function rejects or throws an error, it will result in an unhandled promise rejection. Consider adding a .catch() handler to properly handle and log any errors that might occur during execution.

Suggested change
run(github, context, core).then(() => {});
run(github, context, core).catch((error) => {
if (core && typeof core.setFailed === 'function') {
core.setFailed(`Unhandled error in download_filtered_sarifs run: ${error && error.message ? error.message : error}`);
} else {
// Fallback logging if core is not available
console.error('Unhandled error in download_filtered_sarifs run:', error);
}
});

Copilot uses AI. Check for mistakes.
};