
Conversation

@abadawi591
Contributor

Merge Checklist

All boxes should be checked before merging the PR (just tick any boxes which don't apply to this PR)

  • The toolchain has been rebuilt successfully (or no changes were made to it)
  • The toolchain/worker package manifests are up-to-date
  • Any updated packages successfully build (or no packages were changed)
  • Packages depending on static components modified in this PR (Golang, *-static subpackages, etc.) have had their Release tag incremented.
  • Package tests (%check section) have been verified with RUN_CHECK=y for existing SPEC files, or added to new SPEC files
  • All package sources are available
  • cgmanifest files are up-to-date and sorted (./cgmanifest.json, ./toolkit/scripts/toolchain/cgmanifest.json, .github/workflows/cgmanifest.json)
  • LICENSE-MAP files are up-to-date (./LICENSES-AND-NOTICES/SPECS/data/licenses.json, ./LICENSES-AND-NOTICES/SPECS/LICENSES-MAP.md, ./LICENSES-AND-NOTICES/SPECS/LICENSE-EXCEPTIONS.PHOTON)
  • All source files have up-to-date hashes in the *.signatures.json files
  • sudo make go-tidy-all and sudo make go-test-coverage pass
  • Documentation has been updated to match any changes to the build system
  • Ready to merge

Summary

What does the PR accomplish, why was it needed?

Change Log
  • Change
  • Change
  • Change
Does this affect the toolchain?

YES/NO

Associated issues
  • #xxxx
Links to CVEs
Test Methodology
  • Pipeline build id: xxxx

@abadawi591 requested a review from a team as a code owner on October 14, 2025 at 17:06
@microsoft-github-policy-service bot added the 3.0-dev (PRs Destined for AzureLinux 3.0) label on Oct 14, 2025
@anphel31
Member

can you please provide a more descriptive title/description?

Contributor

@dmcilvaney left a comment

Haven't had time to dig deep, but added a few comments

- Patches referenced in spec but not found in directory
- Example: Patch0: security.patch (but file doesn't exist)
2. Unused patch files (WARNING):
Contributor

Historically this has actually been the issue that caused us the most grief. If a patch is referenced in the .spec file but the file is missing, we get a build error, but the build silently continues when the .spec file is missing the patch macro. I would mark this as an error rather than a warning.

Maybe if there are multiple .spec files, it's a warning if at least one uses the patch?
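
A rough sketch of the severity rule being proposed, assuming the detector already knows which patches each .spec in the package references; the names here are illustrative, not the checker's actual API:

def classify_unused_patch(patch_file: str, patch_refs_by_spec: dict) -> str:
    # patch_refs_by_spec maps each .spec filename to the set of patch
    # files it references (after macro expansion).
    specs_using_patch = [spec for spec, refs in patch_refs_by_spec.items()
                         if patch_file in refs]
    if specs_using_patch:
        # At least one .spec in the package still applies this patch.
        return "WARNING"
    # No .spec applies the patch: it would be silently dropped at build time.
    return "ERROR"

# Example: a patch present in the directory but referenced by no .spec
print(classify_unused_patch("security.patch",
                            {"avahi.spec": {"avahi-libevent-pc-fix.patch"}}))  # ERROR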

result.spec_results.append(spec_result)

# Trigger post-init calculations
result.__post_init__()
Contributor

Could you do

return MultiSpecAnalysisResult(
    spec_results=spec_result_list  # gather this in the loop
)

I think this should auto-run the calculations with all the data in place?
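
A minimal sketch of the construct-once pattern being suggested, assuming MultiSpecAnalysisResult is a dataclass whose __post_init__ aggregates the per-spec results; SpecResult and the field names are hypothetical stand-ins for the real types:

from dataclasses import dataclass, field
from typing import List


@dataclass
class SpecResult:
    name: str
    error_count: int = 0


@dataclass
class MultiSpecAnalysisResult:
    spec_results: List[SpecResult] = field(default_factory=list)
    total_errors: int = 0

    def __post_init__(self):
        # Runs automatically when the dataclass is constructed,
        # so aggregate fields are computed with all data in place.
        self.total_errors = sum(r.error_count for r in self.spec_results)


# Gather per-spec results in the loop, then construct once at the end:
spec_result_list = [SpecResult("curl", 2), SpecResult("nginx", 0)]
result = MultiSpecAnalysisResult(spec_results=spec_result_list)
print(result.total_errors)  # 2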

"""

# Call OpenAI (existing logic)
response = openai_client.chat.completions.create(
Contributor

This feels like a nice spot for https://platform.openai.com/docs/api-reference/chat/create#chat-create-response_format. You could have the AI try to categorize the errors into buckets for you, or at least have it return something like

{
   "comments": "The spec file is missing cve-1234-567.patch, add it.",
   "has_issue": true
}
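
A hedged sketch of wiring response_format into the existing call; the model name and the 'comments'/'has_issue' fields are illustrative rather than the checker's actual schema:

import json
from openai import OpenAI

openai_client = OpenAI()

response = openai_client.chat.completions.create(
    model="gpt-4o",  # assumption: use whichever deployment the pipeline already targets
    messages=[
        {"role": "system",
         "content": "Review the spec analysis and reply in JSON with "
                    "'comments' (string) and 'has_issue' (boolean)."},
        {"role": "user",
         "content": "Patch0 references cve-1234-567.patch but the file is missing."},
    ],
    response_format={"type": "json_object"},
)

verdict = json.loads(response.choices[0].message.content)
if verdict.get("has_issue"):
    print(verdict["comments"])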

@dmcilvaney
Contributor

In a more general sense, does this need to be a single monolithic PR, or could it be broken up a bit?

- Fixed parameter order in detect_all() method
- Was calling: detect_patch_file_issues(file_path, file_content, file_list)
- Should be: detect_patch_file_issues(file_content, file_path, file_list)
- This bug caused all patch files to be incorrectly flagged as 'unused'
- Also caused CVE mismatch false positives

Impact: Eliminates false positives and enables correct detection of:
- Missing patch files
- Unused patch files
- CVE patch mismatches

Tested with test/bad-patch-ref branch containing CVE-20225-1111.patch typo.
Correctly detects missing patch file error.
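
A minimal illustration of the call-site fix described in this commit; the signature below is inferred from the commit message, and passing the arguments by keyword would make the ordering mistake impossible to reintroduce:

def detect_patch_file_issues(file_content: str, file_path: str, file_list: list) -> list:
    # Inferred signature; the real implementation lives in the detector module.
    return []

spec_text = "Patch0: security.patch\n"
issues = detect_patch_file_issues(
    file_content=spec_text,             # the spec file text
    file_path="SPECS/curl/curl.spec",   # previously passed first by mistake
    file_list=["curl.spec"],            # files present in the package directory
)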
- Changed post_pr_comment() to post_comment() (correct method name)
- Added format_multi_spec_comment() to convert analysis_result to string
- Added logging to track comment posting success/failure
- Fixed: Comments were not being posted due to method name mismatch

This completes the GitHub integration for PR comments.
- Replaces GitHubClient.py with proven working implementation from 3.0-dev branch
- Uses post_pr_comment(body) method signature instead of post_comment(pr_number, comment)
- GitHubClient now extracts PR number from environment in __init__
- Should fix 401 Unauthorized errors when posting comments
- Fix: Call analyzer.generate_multi_spec_report() instead of non-existent github_client.format_multi_spec_comment()
- Enhance logging in post_pr_comment() to show detailed error info
- Log missing parameters, URL, status codes, and response bodies for debugging
- Add generate_html_report() method with modern dark-themed design
- Expandable/collapsible sections for specs, anti-patterns, and actions
- Color-coded severity indicators (red for errors, yellow for warnings, green for success)
- Enhanced markdown formatting with emojis, bold text, and tables
- Use include_html parameter: True for GitHub comments, False for text files
- Improved visual hierarchy with <details> tags for better readability
Features:
- 🔗 GitHub Gist integration: HTML report uploaded as secret Gist with styled badge button
- 📊 Sections expanded by default for better visibility
- 🏷️ Severity indicators on anti-pattern types (❌ ERROR, ⚠️ WARNING, etc.)
- 🧪 Add test patch to avahi.spec for multi-spec testing (Patch10: CVE-2027-99999.patch)

Technical changes:
- Add GitHubClient.create_gist() method for programmatic Gist creation
- Update generate_multi_spec_report() to accept github_client parameter
- Generate self-contained HTML pages for Gist hosting
- Change <details> to <details open> for expanded sections by default
- Show severity emoji and level on each anti-pattern type heading
- Styled badge button using shields.io for 'View Interactive Report' link
All requested features implemented:

1. ✅ Delimiters between specs
   - Added subtle --- between packages (lighter than section separators)
   - Not shown after last package

2. ✅ Fixed Gist button
   - Changed from badge markdown to <a><img> HTML tags
   - Added fallback bold link below button
   - Better error logging for Gist creation
   - Falls back to embedded HTML if Gist fails

3. ✅ Per-spec recommended actions
   - Each spec now has its own 'Recommended Actions for [package]' section
   - Collapsible (open by default) with checkboxes
   - Overall 'All Recommended Actions' kept at bottom
   - Checkboxes are independent due to GitHub limitations

4. ✅ Spec-level collapsible sections
   - Each package wrapped in <details open>
   - Summary shows: emoji + package name + severity
   - Prominent and collapsible but expanded by default
   - All nested sections also expanded

Visual improvements:
- Cleaner hierarchy with collapsible specs
- Better visual separation between packages
- Dual checkbox lists (per-spec + overall)
- Professional button/link for HTML report
Phase 1 - Finer Delimiters:
- Added *** delimiters between subsections within each spec
- Separates: spec metadata, anti-patterns, AI analysis, recommended actions
- Lighter than main section delimiters for better visual hierarchy

Phase 2 - RPM Macro Expansion:
- Added _extract_spec_macros() to parse Name, Version, Release, %global, %define
- Added _expand_macros() for recursive macro expansion (handles %{name} and %name formats)
- Updated detect_patch_file_issues() to expand macros before file existence checks
- Fixes false positives like '%{name}-libevent-pc-fix.patch' not matching 'avahi-libevent-pc-fix.patch'
- Error messages show both original and expanded filenames when they differ

This solves the avahi false positive issue where macro expansion was needed.
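
A simplified sketch of the macro handling described in Phase 2, assuming the real _extract_spec_macros()/_expand_macros() behave roughly like this (the regexes here are illustrative, not the production ones):

import re

def extract_spec_macros(spec_text: str) -> dict:
    """Collect Name/Version/Release tags plus %global/%define definitions."""
    macros = {}
    for tag in ("Name", "Version", "Release"):
        m = re.search(rf"^{tag}:\s*(\S+)", spec_text, re.MULTILINE)
        if m:
            macros[tag.lower()] = m.group(1)
    for m in re.finditer(r"^%(?:global|define)\s+(\w+)\s+(\S+)", spec_text, re.MULTILINE):
        macros[m.group(1)] = m.group(2)
    return macros

def expand_macros(value: str, macros: dict, depth: int = 10) -> str:
    """Recursively expand %{name} and %name references."""
    for _ in range(depth):
        expanded = re.sub(
            r"%\{?(\w+)\}?",
            lambda m: macros.get(m.group(1), m.group(0)),
            value,
        )
        if expanded == value:
            break
        value = expanded
    return value

spec = "Name: avahi\nPatch100: %{name}-libevent-pc-fix.patch\n"
macros = extract_spec_macros(spec)
print(expand_macros("%{name}-libevent-pc-fix.patch", macros))
# -> avahi-libevent-pc-fix.patch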
- Replaced badge/image with clear markdown link
- Added dedicated section header: '## 📊 Interactive HTML Report'
- Used large heading with emoji for better visibility
- Added descriptive text about what the link does
- Surrounded with horizontal rules for prominence
- Simple, clean, and guaranteed to be clickable
…cation

Implementation:
- Add BlobStorageClient.py for uploading HTML reports to Azure Blob Storage
- Integrate blob storage in CveSpecFilePRCheck.py with automatic UMI auth
- Update ResultAnalyzer.py with dual upload strategy (blob first, Gist fallback)
- Use DefaultAzureCredential for automatic UMI detection in ADO pipeline
- Add comprehensive error handling and graceful degradation
- Update requirements.txt with azure-storage-blob and azure-identity

Features:
- Automatic UMI authentication (no pipeline YAML changes needed)
- Blob storage preferred, Gist as fallback (maintains existing functionality)
- Public blob URLs for HTML reports (no auth required)
- Hierarchical organization: PR-{number}/report-{timestamp}.html
- Zero breaking changes (pipeline works with or without admin permissions)

Admin Prerequisites (REQUIRED before blob storage works):
1. Grant UMI (Principal ID: 4cb669bf-1ae6-463a-801a-2d491da37b9d) Storage Blob Data Contributor role
2. Configure blob-level public access on radarcontainer

See PRODUCTION_DEPLOYMENT_GUIDE.md for complete deployment instructions.
See IMPLEMENTATION_COMPLETE.md for admin action checklist.
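
A minimal sketch of the upload path this commit describes; the storage account URL and container name are placeholders, and the real BlobStorageClient wraps this with logging, verification, and Gist fallback:

from datetime import datetime, timezone
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, ContentSettings

ACCOUNT_URL = "https://<storage-account>.blob.core.windows.net"  # placeholder
CONTAINER = "radarcontainer"

def upload_html_report(pr_number: int, html: str) -> str:
    service = BlobServiceClient(account_url=ACCOUNT_URL,
                                credential=DefaultAzureCredential())
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    blob_path = f"PR-{pr_number}/report-{timestamp}.html"
    blob = service.get_blob_client(container=CONTAINER, blob=blob_path)
    blob.upload_blob(
        html,
        overwrite=True,
        content_settings=ContentSettings(content_type="text/html"),
    )
    return blob.url  # publicly reachable once blob-level access is enabled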
Fixes:
- Remove stale update_check_status() call that doesn't exist in GitHubClient
- Eliminates 'object has no attribute' error in pipeline logs

Enhanced Logging in BlobStorageClient:
- Log initialization steps (credential creation, service client, connection test)
- Log detailed upload progress (blob path, size, content-type, ETag)
- Log blob verification after upload (size, last modified)
- Log Azure error codes and detailed error messages on failure
- Use emojis (📤 ⬆️ ✅ ❌) for easy log scanning

New Debug Methods:
- list_blobs_in_container() - List all blobs with sizes
- verify_blob_exists() - Check specific blob and log properties
- Enhanced test_connection() - Show public access level

This will help diagnose why blobs aren't appearing in the container despite the upload reporting success. The next pipeline run will show exactly where blob upload succeeds or fails.
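
A hedged sketch of what the new debug methods might look like; the method names mirror the commit message, but the real implementations carry much more logging:

from azure.core.exceptions import ResourceNotFoundError
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
)

def verify_blob_exists(container: str, blob_path: str) -> bool:
    blob = service.get_blob_client(container=container, blob=blob_path)
    try:
        props = blob.get_blob_properties()
        print(f"✅ {blob_path}: {props.size} bytes, modified {props.last_modified}")
        return True
    except ResourceNotFoundError:
        print(f"❌ {blob_path} not found in container '{container}'")
        return False

def list_blobs_in_container(container: str, prefix: str = "") -> list:
    client = service.get_container_client(container)
    return [(b.name, b.size) for b in client.list_blobs(name_starts_with=prefix)]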
Problem: Blob upload succeeds but ResourceNotFound when accessing public URL
Root Cause: Container likely doesn't exist or has no public access configured

Diagnostic Features Added:
- List all containers in storage account on initialization
- Check if target container exists and log its public access level
- Verify uploaded blob appears in container listing after upload
- Comprehensive logging of container configuration

Auto-Create Container Feature:
- Automatically create container if it doesn't exist
- Set public access to 'blob' level on creation
- If container exists but has no public access, attempt to set it
- Logs all container operations and configuration changes

Post-Upload Verification:
- After blob upload, list blobs with PR prefix
- Confirm uploaded blob appears in container listing
- Log warning if blob not found in listing despite successful upload

Expected Results:
- Will identify if container is missing (and create it)
- Will identify if public access is not configured (and set it)
- Will show all containers and their public access levels
- Will verify blob exists in both API and container listing

This should resolve ResourceNotFound errors by ensuring:
1. Container exists
2. Container has blob-level public access
3. Blobs are actually being created in the correct container
- Remove _ensure_container_exists_with_public_access() method
- Keep diagnostic logging to verify container status
- Container public access manually set to 'blob' level via Azure CLI
- Blobs will now be publicly accessible when uploaded
- Updated ResultAnalyzer.py AUTH_CALLBACK_URL to use radarfunc-eka5fmceg4b5fub0.canadacentral-01.azurewebsites.net
- Matches deployed Azure Function endpoint
- Enables GitHub OAuth authentication in HTML reports
- Follow GitHubClient pattern: use GITHUB_TOKEN instead of GITHUB_BOT_PAT
- Pipeline adds 'radar-issues-detected' label when posting PR check comments
- Azure Function adds 'radar-acknowledged' label when challenges submitted
- Add add_label() method to GitHubClient for consistent label management
- Bot posts comments using GITHUB_TOKEN, includes user attribution

Complete workflow:
1. Pipeline detects issues → posts comment + adds 'radar-issues-detected' label
2. User submits challenge → bot posts comment + adds 'radar-acknowledged' label
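
A minimal sketch of the add_label() call described above, using the standard GitHub issues labels endpoint (PRs share it); the repository name, PR number, and token sourcing are placeholders:

import os
import requests

def add_label(repo: str, pr_number: int, label: str) -> None:
    """Add a label to a PR (PRs use the issues label API)."""
    token = os.environ["GITHUB_TOKEN"]
    url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/labels"
    resp = requests.post(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        json={"labels": [label]},
    )
    resp.raise_for_status()

# e.g. add_label("microsoft/azurelinux", 1234, "radar-issues-detected")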
- Use html.escape() with quote=True for proper HTML attribute escaping
- Fixes JavaScript syntax error caused by unescaped characters in data attributes
- Change markdown link to HTML anchor tag with target=_blank
- Update user message to indicate automatic new tab opening
This commit adds antipatterns across 4 different specs to test all detection types:

1. curl.spec (already modified):
   - Future year CVEs (CVE-2035-11111)
   - Duplicate patches (CVE-2025-0665 appears twice)
   - Invalid CVE format (CVE-202X-INVALID)
   - Non-CVE security patches

2. python-tomli.spec:
   - Macro expansion in CVE patch names
   - Future year macros (%{future_year})
   - Complex macro combinations (%{cve_base}%{cve_year})

3. openssl.spec:
   - Lowercase CVE format (cve-2024-11111)
   - Leading zeros (CVE-2024-00999)
   - Old year (CVE-1999-00001)
   - Future year (CVE-2026-11111)
   - Combined CVE patches

4. nginx.spec:
   - Duplicate patches (CVE-2024-7347)
   - Invalid format (CVE-202X-INVALID)
   - Combined CVE names
   - Non-CVE security patches

This test will trigger all antipattern detectors and generate a comprehensive HTML report.
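
For reference, a hedged sketch of the kind of CVE patch-name checks these test specs are meant to trigger; the exact rules and severities in the real detectors may differ:

import re
from datetime import date

CVE_RE = re.compile(r"^CVE-(\d{4})-(\d{4,})$")

def classify_cve_patch(patch_name: str) -> str:
    stem = patch_name.removesuffix(".patch")
    if stem.lower().startswith("cve-") and not stem.startswith("CVE-"):
        return "WARNING: lowercase CVE prefix"
    m = CVE_RE.match(stem)
    if not m:
        return "ERROR: invalid CVE format"
    year, number = int(m.group(1)), m.group(2)
    if year > date.today().year + 1:
        return "ERROR: future-year CVE"
    if number.startswith("0"):
        return "WARNING: CVE number has leading zeros"
    return "OK"

for name in ("CVE-2035-11111.patch", "cve-2024-11111.patch",
             "CVE-202X-INVALID.patch", "CVE-2024-00999.patch"):
    print(name, "->", classify_cve_patch(name))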
The description was being escaped for the data-description attribute
but not for the actual display text in the <li> tag. This caused
syntax errors when descriptions contained special characters like
quotes or brackets, breaking the entire JavaScript on the page.

Now using escaped_desc for both the display text and the attribute.
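
A small sketch of the fix: escape once and reuse the result for both the data attribute and the visible text (the variable names are illustrative):

import html

description = 'Patch references "security.patch" <missing>'
escaped_desc = html.escape(description, quote=True)

li = (
    f'<li data-description="{escaped_desc}">'
    f"{escaped_desc}"
    f"</li>"
)
print(li)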
Changed \n to \\n in JavaScript alert strings to prevent literal
newlines from breaking JavaScript syntax. This was causing a syntax
error at line 571 that prevented all JavaScript from loading.
…ures

1. Token Expiration UX:
   - Detect 401 responses and 'Token expired' errors
   - Show clear message: 'Your session has expired! Please sign in again'
   - Automatically sign out user to clear expired token
   - Close modal and prompt re-authentication

2. GitHub API Diagnostics:
   - Add diagnostic info to Azure Function response
   - Include status codes and error messages for failed operations
   - Show specific errors in alert message (comment/label failures)
   - Log whether bot token or user token is being used

This helps users understand why operations fail and guides them
to fix issues (e.g., create missing labels, re-authenticate).
… behavior

1. GitHub PAT Authentication (401 Bad Credentials):
   - Added detailed logging in function_app.py to show token length, prefix, and source
   - Enhanced diagnostics in API response with bot_token_length and bot_token_prefix
   - Function restarted to ensure Key Vault reference resolution
   - Next HTML will show exact token status for debugging

2. PR Metadata Display:
   - Modified generate_html_report() to accept optional pr_metadata parameter
   - Added PR information card showing: PR number, title, author, branches, commit SHA
   - Card displays after main title with dark theme styling
   - Metadata passed from generate_multi_spec_report()

3. HTML Link Opening in New Tab:
   - Changed from <a href=...target='_blank'> (unsupported in GitHub Markdown)
   - Now uses standard Markdown link with user instructions
   - Added tip: 'Right-click and Open link in new tab' or Ctrl+Click

All changes deployed to Azure Function and ready for next pipeline run.
@abadawi591 force-pushed the abadawi/multi-spec-radar branch 4 times, most recently from debfe23 to d757176 on October 23, 2025 at 23:59
1. Add get_pr_metadata() method to GitHubClient to fetch actual PR author
2. Call get_pr_metadata() in CveSpecFilePRCheck.py before generating report
3. Fix YAML to use correct variable: $(cblmarghGithubPRPat) instead of $(githubPrPat)
4. Add datetime import to GitHubClient

This fixes:
- PR author showing 'Unknown' in HTML report metadata
- Pipeline using wrong GitHub PAT variable name
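
A hedged sketch of what get_pr_metadata() might return; the pulls endpoint and its fields are standard GitHub API, but the exact shape the report consumes is an assumption:

import os
import requests

def get_pr_metadata(repo: str, pr_number: int) -> dict:
    url = f"https://api.github.com/repos/{repo}/pulls/{pr_number}"
    resp = requests.get(
        url,
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
    )
    resp.raise_for_status()
    pr = resp.json()
    return {
        "number": pr["number"],
        "title": pr["title"],
        "author": pr["user"]["login"],
        "source_branch": pr["head"]["ref"],
        "target_branch": pr["base"]["ref"],
        "commit_sha": pr["head"]["sha"],
    }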
Change from markdown link to HTML anchor tag with target="_blank"
to ensure the interactive HTML report opens in a new browser tab
when clicked from GitHub PR comments.
Changes:
- Add get_github_token() function to fetch token from Key Vault using Managed Identity
- Replace all GITHUB_TOKEN env var references with Key Vault lookups
- Add azure-keyvault-secrets>=4.7.0 to requirements.txt
- Token is cached after first retrieval for performance

Benefits:
- No need to manually update Function App environment variables
- Token is always up-to-date from Key Vault
- Centralized token management in one place (Key Vault)
- Uses Managed Identity for secure access
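
A minimal sketch of the Key Vault lookup with caching described above; the vault URL and secret name are placeholders, and the real get_github_token() may cache differently:

from functools import lru_cache
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

VAULT_URL = "https://<key-vault-name>.vault.azure.net"  # placeholder
SECRET_NAME = "<github-pat-secret-name>"                # placeholder

@lru_cache(maxsize=1)
def get_github_token() -> str:
    """Fetch the GitHub PAT via Managed Identity; cached after the first call."""
    client = SecretClient(vault_url=VAULT_URL, credential=DefaultAzureCredential())
    return client.get_secret(SECRET_NAME).value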
Issue: Comments were posted by user account instead of CBL Mariner Bot

Root cause:
- YAML line 119 used $(githubPrPat) which doesn't exist
- Caused GITHUB_TOKEN to be empty
- GitHubClient fell back to SYSTEM_ACCESSTOKEN (user's token)

Fix:
- Add mariner-pipelines-kv-secrets variable group
- Change GITHUB_TOKEN from $(githubPrPat) to $(cblmarghGithubPRPat)
- Now uses correct bot PAT from Key Vault

Result: All comments (PR check + challenge) now post as CBL Mariner Bot