ci: Added Post deployment Script into pipeline and integrated Smoke testing automation#748
Conversation
…enhance report paths
…l identification in scripts
…y handling of service principals
There was a problem hiding this comment.
Pull request overview
This pull request enhances the CI/CD pipeline by integrating post-deployment scripts, smoke testing automation, and improving support for both user and service principal authentication scenarios. The changes focus on making deployments more reliable and flexible across different use cases (telecom and IT helpdesk).
Changes:
- Enhanced deployment workflows with ODBC Driver 18 installation and sample data processing with retry logic for both Linux and Windows runners
- Improved infrastructure scripts to support service principal authentication using SID-based SQL user creation, eliminating MS Graph permission requirements
- Modified E2E test automation to support targeted test execution based on environment variables for different use cases
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/deploy-windows.yml | Updated workflow name to indicate version 2 |
| .github/workflows/deploy-linux.yml | Updated workflow name to indicate version 2 |
| .github/workflows/job-deploy-windows.yml | Added ODBC installation, sample data processing with retry, and improved output extraction using azd commands |
| .github/workflows/job-deploy-linux.yml | Added ODBC installation, sample data processing with retry, and improved output extraction using azd commands |
| .github/workflows/job-docker-build.yml | Added DOCKER_BUILD_SUMMARY environment variable to suppress verbose output |
| .github/workflows/job-azure-deploy.yml | Changed default image tag from 'latest' to 'latest_waf' for main branch |
| .github/workflows/deploy-orchestrator.yml | Enabled E2E tests conditionally based on deployment success and input parameters |
| .github/workflows/job-test-automation.yml | Added use case-based test selection logic and expanded artifact uploads to include screenshots |
| infra/main.parameters.json | Added usecase parameter for environment-specific configuration |
| infra/main.waf.parameters.json | Added usecase parameter for environment-specific configuration |
| infra/scripts/add_user_scripts/assign_sql_roles.py | Added SID-based user creation for service principals/managed identities |
| infra/scripts/run_create_index_scripts.sh | Enhanced to detect and handle both user and service principal authentication |
| infra/scripts/copy_kb_files.sh | Enhanced to detect authentication type and added retry logic for role propagation and file uploads |
| infra/scripts/fabric_scripts/run_fabric_items_scripts.sh | Enhanced to detect and handle both user and service principal authentication |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if ! [[ "18.04 20.04 22.04 24.04 24.10" == *"$(grep VERSION_ID /etc/os-release | cut -d '"' -f 2)"* ]]; | ||
| then | ||
| echo "Ubuntu $(grep VERSION_ID /etc/os-release | cut -d '"' -f 2) is not currently supported."; | ||
| exit; |
There was a problem hiding this comment.
The ODBC Driver 18 installation step uses 'exit' instead of 'exit 1' when the Ubuntu version is not supported. This will exit with status code 0, making it appear as if the step succeeded when it should fail. This could cause subsequent steps that depend on the ODBC driver to fail in unexpected ways. Change 'exit' to 'exit 1' to properly signal the error.
| exit; | |
| exit 1; |
| if [ "$account_type" == "user" ]; then | ||
| # Running as a user - get signed-in user ID | ||
| signed_user_id=$(az ad signed-in-user show --query id -o tsv) | ||
| elif [ "$account_type" == "servicePrincipal" ]; then | ||
| # Running as a service principal - get SP object ID | ||
| client_id=$(az account show --query user.name --output tsv 2>/dev/null) | ||
| if [ -n "$client_id" ]; then | ||
| signed_user_id=$(az ad sp show --id "$client_id" --query id --output tsv 2>/dev/null) | ||
| fi | ||
| else | ||
| echo "Error: Unknown account type: $account_type" | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
The script lacks error handling for the service principal case. If az ad sp show fails or returns an error, the script will continue with an empty signed_user_id and fail later at line 28 with a generic error message. Additionally, the user case (lines 13-15) also lacks error validation. Consider adding error checking similar to the pattern used in other scripts (copy_kb_files.sh and run_create_index_scripts.sh) to provide clearer error messages when authentication information cannot be retrieved.
| # Download the ODBC Driver installer | ||
| $installerUrl = "https://go.microsoft.com/fwlink/?linkid=2249004" | ||
| $installerPath = "$env:TEMP\msodbcsql.msi" | ||
|
|
||
| Invoke-WebRequest -Uri $installerUrl -OutFile $installerPath | ||
|
|
||
| # Install silently with EULA acceptance | ||
| Start-Process msiexec.exe -ArgumentList "/i `"$installerPath`" /quiet IACCEPTMSODBCSQLLICENSETERMS=YES" -Wait -NoNewWindow | ||
|
|
||
| # Clean up installer | ||
| Remove-Item $installerPath -Force | ||
|
|
||
| Write-Host "✅ ODBC Driver 18 installation complete" |
There was a problem hiding this comment.
The ODBC Driver installation step lacks error handling. If the download fails, the script will attempt to install a non-existent file, which could result in unclear error messages. If the installation command fails (e.g., due to system incompatibilities or permission issues), the step will still succeed because there is no explicit check of the return code. Consider adding error checks after the download and installation commands to ensure the driver is properly installed before proceeding to sample data processing.
| # Download the ODBC Driver installer | |
| $installerUrl = "https://go.microsoft.com/fwlink/?linkid=2249004" | |
| $installerPath = "$env:TEMP\msodbcsql.msi" | |
| Invoke-WebRequest -Uri $installerUrl -OutFile $installerPath | |
| # Install silently with EULA acceptance | |
| Start-Process msiexec.exe -ArgumentList "/i `"$installerPath`" /quiet IACCEPTMSODBCSQLLICENSETERMS=YES" -Wait -NoNewWindow | |
| # Clean up installer | |
| Remove-Item $installerPath -Force | |
| Write-Host "✅ ODBC Driver 18 installation complete" | |
| # Download and install the ODBC Driver installer with error handling | |
| $installerUrl = "https://go.microsoft.com/fwlink/?linkid=2249004" | |
| $installerPath = "$env:TEMP\msodbcsql.msi" | |
| try { | |
| Write-Host "Downloading ODBC Driver installer from $installerUrl ..." | |
| Invoke-WebRequest -Uri $installerUrl -OutFile $installerPath -UseBasicParsing -ErrorAction Stop | |
| if (-not (Test-Path $installerPath)) { | |
| throw "ODBC Driver installer was not downloaded to expected path: $installerPath" | |
| } | |
| Write-Host "Running ODBC Driver installer..." | |
| $msiArgs = "/i `"$installerPath`" /quiet IACCEPTMSODBCSQLLICENSETERMS=YES" | |
| $process = Start-Process msiexec.exe -ArgumentList $msiArgs -Wait -NoNewWindow -PassThru | |
| if ($process.ExitCode -ne 0) { | |
| throw "ODBC Driver installer exited with code $($process.ExitCode)." | |
| } | |
| Write-Host "✅ ODBC Driver 18 installation complete" | |
| } | |
| catch { | |
| Write-Error "Failed to install ODBC Driver 18 for SQL Server: $($_.Exception.Message)" | |
| if (Test-Path $installerPath) { | |
| Remove-Item $installerPath -Force -ErrorAction SilentlyContinue | |
| } | |
| exit 1 | |
| } | |
| finally { | |
| if (Test-Path $installerPath) { | |
| Remove-Item $installerPath -Force -ErrorAction SilentlyContinue | |
| } | |
| } |
| "virtualMachineAdminPassword": { | ||
| "value": "${AZURE_ENV_VM_ADMIN_PASSWORD}" | ||
| }, | ||
| "usecase":{ |
There was a problem hiding this comment.
The parameter definition for "usecase" is inconsistently formatted compared to other parameters in this file. It uses "usecase" as the key name, while most parameters in the codebase appear to follow camelCase convention (e.g., "solutionName", "enableTelemetry"). Consider using "useCase" instead of "usecase" for consistency with the existing naming pattern, or ensure this matches the expected parameter name in the infrastructure templates.
| "usecase":{ | |
| "useCase":{ |
| "frontendContainerRegistryHostname": { | ||
| "value": "${AZURE_ENV_CONTAINER_REGISTRY_ENDPOINT}" | ||
| }, | ||
| "usecase":{ |
There was a problem hiding this comment.
The parameter definition for "usecase" is inconsistently formatted compared to other parameters in this file. It uses "usecase" as the key name, while most parameters in the codebase appear to follow camelCase convention (e.g., "solutionName", "resourceGroupName"). Consider using "useCase" instead of "usecase" for consistency with the existing naming pattern, or ensure this matches the expected parameter name in the infrastructure templates.
| "usecase":{ | |
| "useCase":{ |
| elif [ "$account_type" == "servicePrincipal" ]; then | ||
| # Running as a service principal - get SP object ID | ||
| client_id=$(az account show --query user.name --output tsv 2>/dev/null) | ||
| if [ -n "$client_id" ]; then | ||
| signed_user_id=$(az ad sp show --id "$client_id" --query id --output tsv 2>/dev/null) | ||
| fi | ||
| if [ -z "$signed_user_id" ]; then | ||
| echo "✗ Failed to get signed-in user ID after re-authentication" | ||
| echo "✗ Failed to get service principal object ID" | ||
| exit 1 | ||
| fi | ||
| echo "✓ Running as service principal: $signed_user_id" |
There was a problem hiding this comment.
The error handling for the service principal case is missing validation after retrieving the signed_user_id. Unlike the user case (lines 54-63), there is no check to verify that the retrieved signed_user_id is not an ERROR string or contains error messages. If az ad sp show returns an error string instead of the object ID, the script will continue with an invalid ID, potentially causing cryptic failures later. Consider adding similar error checking as done for the user case.
| elif [ "$account_type" == "servicePrincipal" ]; then | ||
| # Running as a service principal - get SP object ID and display name | ||
| client_id=$(az account show --query user.name --output tsv 2>/dev/null) | ||
| if [ -n "$client_id" ]; then | ||
| sp_info=$(az ad sp show --id "$client_id" --query "{id:id, displayName:displayName}" -o json 2>/dev/null) | ||
| signed_user_id=$(echo "$sp_info" | grep -o '"id": *"[^"]*"' | head -1 | sed 's/"id": *"\([^"]*\)"/\1/') | ||
| signed_user_display_name=$(echo "$sp_info" | grep -o '"displayName": *"[^"]*"' | sed 's/"displayName": *"\([^"]*\)"/\1/') | ||
| fi | ||
| if [ -z "$signed_user_id" ] || [ -z "$signed_user_display_name" ]; then | ||
| echo "✗ Failed to get service principal information" | ||
| exit 1 | ||
| fi | ||
| echo "✓ Running as service principal: $signed_user_display_name ($signed_user_id)" |
There was a problem hiding this comment.
The error handling for the service principal case is missing validation after retrieving the signed_user_id. Unlike the user case (lines 38-51), there is no check to verify that the retrieved signed_user_id is not an ERROR string or contains error messages. If az ad sp show returns an error string instead of the object ID, the script will continue with an invalid ID, potentially causing cryptic failures later. Consider adding similar error checking as done for the user case.
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py --headed --html=report/report.html --self-contained-html | ||
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | ||
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py tests/test_ithelpdesk_smoke_tc.py --headed --html=report/report.html --self-contained-html | ||
| fi |
There was a problem hiding this comment.
The test execution logic does not handle the case where neither 'telecom' nor 'IT_helpdesk' use cases are specified. If azure_env_use_case has a different value or is empty, no tests will be executed, but the step will silently succeed. This could lead to false positives where the test step appears to pass but no tests were actually run. Consider adding an else clause to handle unknown use cases or adding validation to ensure a valid use case is provided.
| fi | |
| fi | |
| else | |
| echo "ERROR: Unknown or unsupported azure_env_use_case: '${{ env.azure_env_use_case }}'. No tests were run." | |
| exit 1 |
| xvfb-run pytest tests/test_telecom_gp_tc.py tests/test_telecom_smoke_tc.py --headed --html=report/report.html --self-contained-html | ||
| fi | ||
| elif [ "${{ env.azure_env_use_case }}" == "IT_helpdesk" ]; then | ||
| if [ "${{ env.test_suite }}" == "GoldenPath-Testing" ]; then | ||
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py --headed --html=report/report.html --self-contained-html | ||
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | ||
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py tests/test_ithelpdesk_smoke_tc.py --headed --html=report/report.html --self-contained-html | ||
| fi |
There was a problem hiding this comment.
The test execution logic does not handle the case where neither 'telecom' nor 'IT_helpdesk' use cases are specified. If azure_env_use_case has a different value or is empty, no tests will be executed, but the step will silently succeed. This could lead to false positives where the test step appears to pass but no tests were actually run. Consider adding an else clause to handle unknown use cases or adding validation to ensure a valid use case is provided.
| xvfb-run pytest tests/test_telecom_gp_tc.py tests/test_telecom_smoke_tc.py --headed --html=report/report.html --self-contained-html | |
| fi | |
| elif [ "${{ env.azure_env_use_case }}" == "IT_helpdesk" ]; then | |
| if [ "${{ env.test_suite }}" == "GoldenPath-Testing" ]; then | |
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py --headed --html=report/report.html --self-contained-html | |
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | |
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py tests/test_ithelpdesk_smoke_tc.py --headed --html=report/report.html --self-contained-html | |
| fi | |
| xvfb-run pytest tests/test_telecom_gp_tc.py tests/test_telecom_smoke_tc.py --headed --html=report/report.html --self-contained-html | |
| else | |
| echo "Error: Unsupported test_suite '${{ env.test_suite }}' for azure_env_use_case 'telecom'." | |
| exit 1 | |
| fi | |
| elif [ "${{ env.azure_env_use_case }}" == "IT_helpdesk" ]; then | |
| if [ "${{ env.test_suite }}" == "GoldenPath-Testing" ]; then | |
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py --headed --html=report/report.html --self-contained-html | |
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | |
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py tests/test_ithelpdesk_smoke_tc.py --headed --html=report/report.html --self-contained-html | |
| else | |
| echo "Error: Unsupported test_suite '${{ env.test_suite }}' for azure_env_use_case 'IT_helpdesk'." | |
| exit 1 | |
| fi | |
| else | |
| echo "Error: Unsupported azure_env_use_case '${{ env.azure_env_use_case }}' with test_suite '${{ env.test_suite }}'." | |
| exit 1 |
| if [ "${{ env.azure_env_use_case }}" == "telecom" ]; then | ||
| if [ "${{ env.test_suite }}" == "GoldenPath-Testing" ]; then | ||
| xvfb-run pytest tests/test_telecom_gp_tc.py --headed --html=report/report.html --self-contained-html | ||
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | ||
| xvfb-run pytest tests/test_telecom_gp_tc.py tests/test_telecom_smoke_tc.py --headed --html=report/report.html --self-contained-html | ||
| fi | ||
| elif [ "${{ env.azure_env_use_case }}" == "IT_helpdesk" ]; then | ||
| if [ "${{ env.test_suite }}" == "GoldenPath-Testing" ]; then | ||
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py --headed --html=report/report.html --self-contained-html | ||
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | ||
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py tests/test_ithelpdesk_smoke_tc.py --headed --html=report/report.html --self-contained-html | ||
| fi | ||
| fi |
There was a problem hiding this comment.
The test execution logic does not handle the case where neither 'telecom' nor 'IT_helpdesk' use cases are specified. If azure_env_use_case has a different value or is empty, no tests will be executed, but the step will silently succeed. This could lead to false positives where the test step appears to pass but no tests were actually run. Consider adding an else clause to handle unknown use cases or adding validation to ensure a valid use case is provided.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 9 comments.
Comments suppressed due to low confidence (1)
infra/scripts/add_user_scripts/assign_sql_roles.py:125
- SQL injection vulnerability: The
display_nameandrolevariables are directly interpolated into SQL queries using f-strings without proper sanitization or parameterization. Although these values come from Azure AD, they should still be properly escaped or parameterized to prevent potential SQL injection attacks. Lines 90, 113-118, and 125 all use string interpolation with untrusted input. Consider using parameterized queries with placeholders instead of f-strings for SQL statement construction.
check_user_sql = f"SELECT COUNT(*) FROM sys.database_principals WHERE name = '{display_name}'"
cursor.execute(check_user_sql)
user_exists = cursor.fetchone()[0] > 0
if not user_exists:
try:
if is_service_principal:
# For service principals/managed identities, use SID-based approach
# This doesn't require MS Graph permissions on SQL Server
sid = client_id_to_sid(client_id)
create_user_sql = f"CREATE USER [{display_name}] WITH SID = {sid}, TYPE = E"
else:
# For regular users, use standard external provider approach
create_user_sql = f"CREATE USER [{display_name}] FROM EXTERNAL PROVIDER"
cursor.execute(create_user_sql)
conn.commit()
print(f"✓ Created user: {display_name}")
except Exception as e:
print(f"✗ Failed to create user: {e}")
continue
# Check if user already has the role
check_role_sql = f"""
SELECT COUNT(*)
FROM sys.database_role_members rm
JOIN sys.database_principals rp ON rm.role_principal_id = rp.principal_id
JOIN sys.database_principals mp ON rm.member_principal_id = mp.principal_id
WHERE mp.name = '{display_name}' AND rp.name = '{role}'
"""
cursor.execute(check_role_sql)
has_role = cursor.fetchone()[0] > 0
if not has_role:
# Add user to role
add_role_sql = f"ALTER ROLE [{role}] ADD MEMBER [{display_name}]"
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 10 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| str: Hexadecimal SID string for use in CREATE USER statement | ||
| """ | ||
| # Convert the client ID to bytes using UUID | ||
| guid_bytes = uuid.UUID(client_id).bytes_le |
There was a problem hiding this comment.
The client_id_to_sid function uses uuid.UUID(client_id).bytes_le without validating that client_id is a valid UUID format. If an invalid GUID string is passed, this will raise a ValueError. Consider adding try-except error handling or input validation to provide a clearer error message when an invalid client_id is provided, as the current implementation would result in an uncaught exception that might be confusing to debug.
| guid_bytes = uuid.UUID(client_id).bytes_le | |
| try: | |
| guid_bytes = uuid.UUID(client_id).bytes_le | |
| except ValueError as exc: | |
| raise ValueError(f"Invalid client_id '{client_id}': expected a valid UUID/GUID string") from exc |
| if [[ "$BRANCH_NAME" == "main" ]]; then | ||
| IMAGE_TAG="latest" | ||
| echo "Using main branch - image tag: latest" | ||
| IMAGE_TAG="latest_waf" | ||
| echo "Using main branch - image tag: latest_waf" | ||
| elif [[ "$BRANCH_NAME" == "dev" ]]; then | ||
| IMAGE_TAG="dev" | ||
| echo "Using dev branch - image tag: dev" | ||
| elif [[ "$BRANCH_NAME" == "demo" ]]; then | ||
| IMAGE_TAG="demo" | ||
| echo "Using demo branch - image tag: demo" | ||
| elif [[ "$BRANCH_NAME" == "hotfix" ]]; then | ||
| BASE_TAG="hotfix" | ||
| IMAGE_TAG="hotfix" | ||
| echo "Using hotfix branch - image tag: hotfix" | ||
| elif [[ "$BRANCH_NAME" == "dependabotchanges" ]]; then | ||
| BASE_TAG="dependabotchanges" | ||
| IMAGE_TAG="dependabotchanges" | ||
| echo "Using dependabotchanges branch - image tag: dependabotchanges" | ||
| else | ||
| IMAGE_TAG="latest" | ||
| echo "Using default for branch '$BRANCH_NAME' - image tag: latest" | ||
| IMAGE_TAG="latest_waf" | ||
| echo "Using default for branch '$BRANCH_NAME' - image tag: latest_waf" |
There was a problem hiding this comment.
The image tag naming has changed from "latest" to "latest_waf" for main and default branches. This is a breaking change if there are existing deployments or scripts that rely on the "latest" tag. Ensure that all deployment documentation, external scripts, and CI/CD pipelines that reference the "latest" tag are updated to use "latest_waf" instead.
| if ! [[ "18.04 20.04 22.04 24.04 24.10" == *"$(grep VERSION_ID /etc/os-release | cut -d '"' -f 2)"* ]]; | ||
| then | ||
| echo "Ubuntu $(grep VERSION_ID /etc/os-release | cut -d '"' -f 2) is not currently supported."; | ||
| exit 1; | ||
| fi | ||
|
|
||
| # Download the package to configure the Microsoft repo | ||
| curl -fsSL -O https://packages.microsoft.com/config/ubuntu/$(grep VERSION_ID /etc/os-release | cut -d '"' -f 2)/packages-microsoft-prod.deb || { echo "Failed to download Microsoft packages config"; exit 1; } |
There was a problem hiding this comment.
The Ubuntu version check uses string matching with spaces between version numbers. This approach works but could be fragile. If a version string contains extra whitespace or doesn't match exactly, it will fail. Consider using a more robust version comparison approach, such as regex matching or explicit version checks (e.g., VERSION_ID in ("18.04" "20.04" "22.04" "24.04" "24.10")).
| if ! [[ "18.04 20.04 22.04 24.04 24.10" == *"$(grep VERSION_ID /etc/os-release | cut -d '"' -f 2)"* ]]; | |
| then | |
| echo "Ubuntu $(grep VERSION_ID /etc/os-release | cut -d '"' -f 2) is not currently supported."; | |
| exit 1; | |
| fi | |
| # Download the package to configure the Microsoft repo | |
| curl -fsSL -O https://packages.microsoft.com/config/ubuntu/$(grep VERSION_ID /etc/os-release | cut -d '"' -f 2)/packages-microsoft-prod.deb || { echo "Failed to download Microsoft packages config"; exit 1; } | |
| UBUNTU_VERSION=$(grep '^VERSION_ID=' /etc/os-release | cut -d '"' -f 2) | |
| case "$UBUNTU_VERSION" in | |
| 18.04|20.04|22.04|24.04|24.10) | |
| # Supported versions | |
| ;; | |
| *) | |
| echo "Ubuntu $UBUNTU_VERSION is not currently supported."; | |
| exit 1; | |
| ;; | |
| esac | |
| # Download the package to configure the Microsoft repo | |
| curl -fsSL -O https://packages.microsoft.com/config/ubuntu/${UBUNTU_VERSION}/packages-microsoft-prod.deb || { echo "Failed to download Microsoft packages config"; exit 1; } |
| if not user_exists: | ||
| # Create user from external provider | ||
| create_user_sql = f"CREATE USER [{display_name}] FROM EXTERNAL PROVIDER" | ||
| try: | ||
| if is_service_principal: | ||
| # For service principals/managed identities, use SID-based approach | ||
| # This doesn't require MS Graph permissions on SQL Server | ||
| sid = client_id_to_sid(client_id) | ||
| create_user_sql = f"CREATE USER [{display_name}] WITH SID = {sid}, TYPE = E" | ||
| else: | ||
| # For regular users, use standard external provider approach | ||
| create_user_sql = f"CREATE USER [{display_name}] FROM EXTERNAL PROVIDER" | ||
|
|
||
| cursor.execute(create_user_sql) | ||
| conn.commit() | ||
| print(f"✓ Created user: {display_name}") |
There was a problem hiding this comment.
The SQL queries throughout this function use string formatting with user-supplied inputs (display_name, role) which creates SQL injection vulnerabilities. While checking for user existence at line 90-92 and role membership at lines 113-119, as well as in the ALTER ROLE statement at line 125, the display_name and role values are directly interpolated into SQL queries. Since these are DDL statements where parameterization is not supported by pyodbc, you should validate and sanitize the inputs. Consider using a whitelist for role names and validating that display_name doesn't contain special characters like quotes or SQL keywords.
| upload_with_retry() { | ||
| local source_folder="$1" | ||
| local dest_path="$2" | ||
| local description="$3" | ||
| local upload_retries=5 | ||
| local upload_attempt=0 | ||
|
|
||
| while [ $upload_attempt -lt $upload_retries ]; do | ||
| if az storage blob upload-batch \ | ||
| --account-name "$storageAccountName" \ | ||
| --destination "$dest_path" \ | ||
| --source "$source_folder" \ | ||
| --auth-mode login \ | ||
| --pattern '*' \ | ||
| --overwrite \ | ||
| --output none 2>/dev/null; then | ||
| echo "✓ Uploaded $description successfully" | ||
| return 0 | ||
| fi | ||
| upload_attempt=$((upload_attempt + 1)) | ||
| echo " Upload attempt $upload_attempt/$upload_retries failed - waiting 15 seconds..." | ||
| sleep 15 | ||
| done | ||
| echo "✗ Failed to upload $description after $upload_retries attempts" | ||
| return 1 | ||
| } |
There was a problem hiding this comment.
The retry logic uses hardcoded sleep values (15 seconds between upload attempts). While this is functional, consider making this configurable or at least documenting why 15 seconds was chosen. Additionally, the upload errors are suppressed with 2>/dev/null. For better debugging, consider showing the error message on the final failed attempt to help diagnose upload issues.
| # Determine isServicePrincipal based on account type | ||
| # When running as servicePrincipal, use SID-based approach | ||
| # When running as user, use FROM EXTERNAL PROVIDER | ||
| if [ "$account_type" == "servicePrincipal" ]; then | ||
| is_sp="true" | ||
| else | ||
| is_sp="false" | ||
| fi | ||
|
|
||
| # Managed identity role assignments | ||
| roles_json="[{\"clientId\":\"$backendManagedIdentityClientId\",\"displayName\":\"$mi_display_name\",\"role\":\"db_datareader\",\"isServicePrincipal\":$is_sp},{\"clientId\":\"$backendManagedIdentityClientId\",\"displayName\":\"$mi_display_name\",\"role\":\"db_datawriter\",\"isServicePrincipal\":$is_sp}]" |
There was a problem hiding this comment.
The isServicePrincipal flag is set based on the account type of the executing identity (user vs service principal), but it's being applied to the managed identity being created in SQL. Managed identities should always use the SID-based approach regardless of whether the script is executed by a user or service principal. The flag should be set to "true" for managed identity creation since managed identities are a type of service principal. The current logic conflates the execution context with the identity type being created.
| if [ "${{ env.azure_env_use_case }}" == "telecom" ]; then | ||
| if [ "${{ env.test_suite }}" == "GoldenPath-Testing" ]; then | ||
| xvfb-run pytest tests/test_telecom_gp_tc.py --headed --html=report/report.html --self-contained-html | ||
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | ||
| xvfb-run pytest tests/test_telecom_gp_tc.py tests/test_telecom_smoke_tc.py --headed --html=report/report.html --self-contained-html | ||
| fi | ||
| elif [ "${{ env.azure_env_use_case }}" == "IT_helpdesk" ]; then | ||
| if [ "${{ env.test_suite }}" == "GoldenPath-Testing" ]; then | ||
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py --headed --html=report/report.html --self-contained-html | ||
| elif [ "${{ env.test_suite }}" == "Smoke-Testing" ]; then | ||
| xvfb-run pytest tests/test_ithelpdesk_gp_tc.py tests/test_ithelpdesk_smoke_tc.py --headed --html=report/report.html --self-contained-html | ||
| fi | ||
| fi |
There was a problem hiding this comment.
There is significant code duplication in the test execution logic across test1, test2, and test3 steps. The same nested if-elif structure is repeated three times (lines 129-141, 154-166, 179-191). Consider extracting this logic into a reusable script or using a GitHub Actions composite action to reduce duplication and improve maintainability.
| elif [ "$account_type" == "servicePrincipal" ]; then | ||
| # Running as a service principal - get SP object ID and display name | ||
| client_id=$(az account show --query user.name --output tsv 2>/dev/null) | ||
| if [ -n "$client_id" ]; then | ||
| sp_info=$(az ad sp show --id "$client_id" --query "{id:id, displayName:displayName}" -o json 2>&1) | ||
| if [ $? -ne 0 ]; then | ||
| echo "✗ Failed to retrieve service principal information for client ID: $client_id" | ||
| echo "$sp_info" | ||
| exit 1 | ||
| fi | ||
| signed_user_id=$(echo "$sp_info" | grep -o '"id": *"[^"]*"' | head -1 | sed 's/"id": *"\([^"]*\)"/\1/') | ||
| signed_user_display_name=$(echo "$sp_info" | grep -o '"displayName": *"[^"]*"' | sed 's/"displayName": *"\([^"]*\)"/\1/') | ||
| fi | ||
| if [ -z "$signed_user_id" ] || [ -z "$signed_user_display_name" ]; then | ||
| echo "✗ Failed to get service principal information" | ||
| exit 1 | ||
| fi | ||
| echo "✓ Running as service principal: $signed_user_display_name ($signed_user_id)" |
There was a problem hiding this comment.
In the service principal branch (line 52-69), if client_id is empty after retrieving it on line 54, the code proceeds to check if signed_user_id is empty on line 65. However, the inner block that populates sp_info (lines 55-64) only executes if client_id is not empty. This means if client_id is empty initially, the code will reach line 65 with undefined signed_user_id and signed_user_display_name, properly failing with the error message. However, the logic could be clearer by checking for empty client_id immediately after line 54 and exiting early, rather than relying on the nested structure.
| if [ -n "$client_id" ]; then | ||
| signed_user_id=$(az ad sp show --id "$client_id" --query id --output tsv 2>&1) | ||
| # Check if the command failed or returned an empty/erroneous ID | ||
| if [ $? -ne 0 ] || [ -z "$signed_user_id" ] || [[ "$signed_user_id" == *"ERROR"* ]]; then | ||
| echo "✗ Failed to get service principal object ID using client ID: $client_id" | ||
| echo "Azure CLI output:" | ||
| echo "$signed_user_id" | ||
| exit 1 | ||
| fi | ||
| else | ||
| echo "✗ Failed to get service principal client ID" | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
Similar to the issue in run_create_index_scripts.sh, the service principal branch could be clearer. If client_id is empty at line 28, the code proceeds to the outer check at line 32. However, if client_id is not empty but the az ad sp show command fails (line 30), the error handling is appropriate. Consider checking for empty client_id immediately after line 28 and exiting early for better code clarity.
| if [ -n "$client_id" ]; then | |
| signed_user_id=$(az ad sp show --id "$client_id" --query id --output tsv 2>&1) | |
| # Check if the command failed or returned an empty/erroneous ID | |
| if [ $? -ne 0 ] || [ -z "$signed_user_id" ] || [[ "$signed_user_id" == *"ERROR"* ]]; then | |
| echo "✗ Failed to get service principal object ID using client ID: $client_id" | |
| echo "Azure CLI output:" | |
| echo "$signed_user_id" | |
| exit 1 | |
| fi | |
| else | |
| echo "✗ Failed to get service principal client ID" | |
| exit 1 | |
| fi | |
| if [ -z "$client_id" ]; then | |
| echo "✗ Failed to get service principal client ID" | |
| exit 1 | |
| fi | |
| signed_user_id=$(az ad sp show --id "$client_id" --query id --output tsv 2>&1) | |
| # Check if the command failed or returned an empty/erroneous ID | |
| if [ $? -ne 0 ] || [ -z "$signed_user_id" ] || [[ "$signed_user_id" == *"ERROR"* ]]; then | |
| echo "✗ Failed to get service principal object ID using client ID: $client_id" | |
| echo "Azure CLI output:" | |
| echo "$signed_user_id" | |
| exit 1 | |
| fi |
| # Wait for role assignment to propagate by testing storage access | ||
| echo "⏳ Waiting for role assignment to propagate..." | ||
| max_retries=30 | ||
| retry_count=0 | ||
| while [ $retry_count -lt $max_retries ]; do | ||
| if az storage container list --account-name "$storageAccountName" --auth-mode login --output none 2>/dev/null; then | ||
| echo "✓ Role assignment propagated successfully" | ||
| break | ||
| fi | ||
| retry_count=$((retry_count + 1)) | ||
| echo " Attempt $retry_count/$max_retries - waiting 10 seconds..." | ||
| sleep 10 | ||
| done | ||
|
|
||
| if [ $retry_count -eq $max_retries ]; then | ||
| echo "✗ Role assignment did not propagate within expected time" | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
The role assignment propagation wait loop uses a fixed 30-retry limit with 10-second intervals (up to 5 minutes total). While this is reasonable, consider that in Azure, role assignments can occasionally take longer to propagate. Additionally, the error messages from the failed az storage container list command are being suppressed with 2>/dev/null. Consider logging at least the final attempt's output to aid in debugging if the propagation fails.
Purpose
This pull request introduces several improvements and fixes across the deployment workflows, infrastructure scripts, and automation logic. The main focus is on enhancing deployment reliability, supporting both user and service principal scenarios, improving test automation flexibility, and ensuring better output handling and environment configuration.
Key changes include:
Deployment Workflow Enhancements
API_APP_URLandWEB_APP_URL) to use directazd env get-valuecommands for more robust and portable output handling. [1] [2]Infrastructure and Identity Support
Test Automation Improvements
azure_env_use_caseandtest_suite), allowing for targeted test execution for different use cases (e.g., telecom, IT helpdesk). [1] [2] [3]Workflow Logic and Output Adjustments
DOCKER_BUILD_SUMMARYenvironment variable tofalsefor Docker build steps to suppress verbose output. [1] [2]latest_wafinstead oflatestin Azure deploy jobs.Infrastructure Parameterization
usecaseparameter to infrastructure parameter files to support environment-specific configuration. [1] [2]Does this introduce a breaking change?
Golden Path Validation
Deployment Validation