Skip to content

Add new auth function in parallel upgraded to dotnet 8 #67

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 80 commits into from
Mar 24, 2025

Conversation

adamtry
Copy link
Contributor

@adamtry adamtry commented Mar 13, 2025

I modified Adams's deployment strategy for the following reasons:

  1. We cannot redeploy the original 3.1 version, so there is little benefit in keeping the code. By changing the service name, a new lambda will be created and the existing one will remain
  2. I didn't know what code had been modified when it had been copied, making this PR more difficult

Policies

Most of the policies were added manually via the console. I have added these policies to the Serverless template.

The deployment needs to work for both staging and production environments (there is a dev environment, but I don't think it's used). I've included the accountIds for the HousingProd and HousingStaging accounts.

custom:
  housingAccountIds:
    staging: "087586271961"
    production: "282997303675"

We need a policy for the authorizer to be called from each AWS account. The first policy is for the StagingAPIs or ProductionAPIs account. The second is for HousingStaging and HousingProduction. If there are more AWS accounts required, they will need to be added here.

    AllowApisAccountAuthorizer:
      # ProductionApis or StagingApis
      Type: AWS::Lambda::Permission
      Properties:
        FunctionName:
          Fn::GetAtt:
            - ApiAuthVerifyTokenNewLambdaFunction
            - Arn
        Action: lambda:InvokeFunction
        Principal: apigateway.amazonaws.com
        SourceArn:
          Fn::Join:
            - ""
            - - "arn:aws:execute-api:"
              - Ref: AWS::Region
              - ":"
              - Ref: AWS::AccountId
              - ":*"
    AllowHousingCrossAccountAuthorizer:
      # Housing-Production or Housing-Staging
      Type: AWS::Lambda::Permission
      Properties:
        FunctionName:
          Fn::GetAtt:
            - ApiAuthVerifyTokenNewLambdaFunction
            - Arn
        Action: lambda:InvokeFunction
        Principal: apigateway.amazonaws.com
        SourceArn:
          Fn::Join:
            - ""
            - - "arn:aws:execute-api:"
              - Ref: AWS::Region
              - ":"
              - ${self:custom.housingAccountIds.${self:provider.stage}}
              - ":*/authorizers/*"

Additionally, we need a policy for each account to assume the role when calling the GetRestApiAsync method in ApiGateway. This policy might be redundant if we remove the call to API Gateway in a later PR.

- PolicyName: assumeRoleForGettingCredentialsApiAccount
            PolicyDocument:
              Version: "2012-10-17"
              Statement:
                - Effect: Allow
                  Action:
                    - "sts:AssumeRole"
                  Resource:
                    Fn::Join:
                      - ""
                      - - "arn:aws:iam::"
                        - Ref: "AWS::AccountId"
                        - ":role/LBH_Api_Gateway_Allow_GET"
          - PolicyName: assumeRoleForGettingCredentialsHousingAccount
            PolicyDocument:
              Version: "2012-10-17"
              Statement:
                - Effect: Allow
                  Action:
                    - "sts:AssumeRole"
                  Resource:
                    Fn::Join:
                      - ""
                      - - "arn:aws:iam::"
                        - ${self:custom.housingAccountIds.${self:provider.stage}}
                        - ":role/LBH_Api_Gateway_Allow_GET"                        

Updating the ValidateToken Method

When testing the ValidateToken method, I repeatedly encountered the following error IDX10503: Signature validation failed. Token does not have a kid. Keys tried: .... I'm unsure if this change was due to updating the JWT library or upgrading to DOTNET 8, but I could not find a workaround. Our JWT tokens don't contain a KID header, and there was no way to disable the check without turning off signature verification entirely.

For that reason, I've taken the approach to use the JWT library (https://www.nuget.org/packages/JWT). It enables us to verify the token in the same way, but without requiring the missing KID header.

Previous PR Description

Context: We need to update the auth-verify-token function to dotnet 8 due to a deadline where dotnet 6 lambda functions will no longer be deployable on AWS. We will need this function to be deployable in a short time-frame so that we can fix a timeout issue in the VerifyAccessUseCase _awsApiGateway.GetApiName that is suspected to be surpassing account-level usage limits.

This verify-token function is used by many of our APIs across multiple AWS accounts, so any errors could cause serious widespread disruption. For this reason it was decided to deploy the dotnet 8 version of this token verifier in parallel with the existing one, and phasing the old one out more carefully after testing with the new one in a limited way.

The main Files changed will be somewhat difficult to read through because the original ApiAuthVerifyToken and ApiAuthVerifyToken.Tests folders were copied into new folders ApiAuthVerifyTokenNew and ApiAuthVerifyTokenNew.Tests so that we can deploy the dotnet 8 version of this function in parallel and keep the original deployment and tests as-is.

After the dotnet 8 function is in production and well-tested, it's assumed that a task will be taken to remove it from this code base. The new solutions are parallel to the existing ones so that the old should be removable without affecting the new.

See a7619bd for the actual changes to get dotnet 8 and parallel deployment working

The dotnet 8 upgrade steps were mostly copied from an existing dotnet 8 upgrade PR that was before we decided to deploy in parallel: #65 - see this for further comments and explanations

Note: It is expected that the existing 3.1 deployment will fail because deploying 3.1 isn't supported anymore, but at least it shouldn't try to destroy the existing function. The dotnet 8 deployment is in parallel with this in the CircleCI workflow so should not be blocked

steps:
- deploy-lambda:
stage: "staging"
path: "./ApiAuthVerifyTokenNew/"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only deploy the new one to staging for now. I didn't want to risk accidental production deployments when I don't yet know how it will interact with other infrastructure

Environment.SetEnvironmentVariable("jwtSecret", _fixture.Create<string>());
Environment.SetEnvironmentVariable("hackneyUserAuthTokenJwtSecret", _faker.Random.AlphaNumeric(25));
Environment.SetEnvironmentVariable("jwtSecret", _faker.Random.AlphaNumeric(50));
Environment.SetEnvironmentVariable("hackneyUserAuthTokenJwtSecret", _faker.Random.AlphaNumeric(50));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to the new requirements in dotnet 8 that these secrets must be longer. See https://github.com/AzureAD/azure-activedirectory-identitymodel-extensions-for-dotnet/wiki/IDX10720

@adamtry adamtry marked this pull request as ready for review March 13, 2025 15:46
@adamtry adamtry requested a review from a team as a code owner March 13, 2025 15:46
@adamtry adamtry changed the title Upgrade to dotnet8 parallel Add new auth function in parallel with dotnet 8 Mar 13, 2025
@adamtry adamtry changed the title Add new auth function in parallel with dotnet 8 Add new auth function in parallel upgraded to dotnet 8 Mar 13, 2025
LBHCallumM
LBHCallumM previously approved these changes Mar 13, 2025
@martapederiva
Copy link

@LBHCallumM
abraxas-lotr

martapederiva
martapederiva previously approved these changes Mar 13, 2025
@LBHCallumM
Copy link
Member

@LBHCallumM abraxas-lotr abraxas-lotr

:(

Copy link

@Duslerke Duslerke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want a temporary test (because let's face it - that's what it is - a spike to see whether the upgrade would work), you don't need to commit this to master branch source control.

  1. Just do these changes to the existing project files (no need for dupes) on a spike dummy/test branch where you'll tweak the pipeline to deploy from your spike branch instead of master.
  2. Make sure to change the CF stack name in the serverless file so it gets deployed as a separate stack.
  3. Deploy a temp version of dotnet 8 lambda under a new stack.
  4. Once you're done testing, you can raise the PR from your spike branch onto the main branch after you've tweaked back the pipeline to deploy as normal.

This avoids the messy git history, excessive duplicated files, hard-to-read PR diff, and potentially forever lingering confusing suffixes like -new here or -2 in housing-repairs-online-2.

Lastly, if you'll notice the CircleCI pipeline, to deploy to development environment, you need the development branch. Your PR is targeting master. You should fix the development branch first and merge whatever PR to this branch first, otherwise you'd be skipping updating development environment and going straight to staging. Huge nono.

Doing 'request changes' only to pull some attention to the comment given 2 approvals are already present.

Duslerke
Duslerke previously approved these changes Mar 17, 2025
Copy link

@Duslerke Duslerke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to unblock it because of reported lack of time and will for this to be re-done via the proper approach mentioned in the previous review (as this piece of work is taken over due to original author being on leave).

@LBHCallumM LBHCallumM dismissed stale reviews from Duslerke, martapederiva, and themself via 5fdc60d March 17, 2025 10:30
Duslerke
Duslerke previously approved these changes Mar 21, 2025
Copy link

@Duslerke Duslerke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems alright at a glance.
Nice switch to a different JWT.

Comment on lines 152 to 161
check:
jobs:
- check-code-formatting:
filters:
branches:
ignore: master
- build-and-test:
filters:
branches:
ignore: master

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore master and development

Comment on lines 192 to 211

temp-deploy-new-staging-for-testing-purposes:
jobs:
- build-and-test
- assume-role-staging:
context: api-assume-role-staging-context
requires:
- build-and-test
- permit-staging-release:
type: approval
requires:
- assume-role-staging
- terraform-init-and-apply-to-staging:
requires:
- permit-staging-release
- deploy-to-staging:
context:
- "Serverless Framework"
requires:
- terraform-init-and-apply-to-staging

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this PR for this temp deploy, are you planning to move onto live now?
Wondering whether this configuration is something leftover from your testing.

EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ApiAuthVerifyToken.Tests", "ApiAuthVerifyToken.Tests\ApiAuthVerifyToken.Tests.csproj", "{DE8E4E61-EB76-416B-8417-E8F2F6A0C0D3}"
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "ApiAuthVerifyToken.Tests", "ApiAuthVerifyToken.Tests\ApiAuthVerifyToken.Tests.csproj", "{DE8E4E61-EB76-416B-8417-E8F2F6A0C0D3}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The project GUID somehow changed?

@adamtry adamtry changed the base branch from master to development March 24, 2025 12:44
@adamtry adamtry requested a review from a team March 24, 2025 12:56
@adamtry
Copy link
Contributor Author

adamtry commented Mar 24, 2025

We cannot redeploy the original 3.1 version, so there is little benefit in keeping the code. By changing the service name, a new lambda will be created and the existing one will remain
I didn't know what code had been modified when it had been copied, making this PR more difficult
Policies
Most of the policies were added manually via the console. I have added these policies to the Serverless template.

The deployment needs to work for both staging and production environments (there is a dev environment, but I don't think it's used). I've included the accountIds for the HousingProd and HousingStaging accounts.

custom:
housingAccountIds:
staging: "087586271961"
production: "282997303675"
We need a policy for the authorizer to be called from each AWS account. The first policy is for the StagingAPIs or ProductionAPIs account. The second is for HousingStaging and HousingProduction. If there are more AWS accounts required, they will need to be added here.

AllowApisAccountAuthorizer:
  # ProductionApis or StagingApis
  Type: AWS::Lambda::Permission
  Properties:
    FunctionName:
      Fn::GetAtt:
        - ApiAuthVerifyTokenNewLambdaFunction
        - Arn
    Action: lambda:InvokeFunction
    Principal: apigateway.amazonaws.com
    SourceArn:
      Fn::Join:
        - ""
        - - "arn:aws:execute-api:"
          - Ref: AWS::Region
          - ":"
          - Ref: AWS::AccountId
          - ":*"
AllowHousingCrossAccountAuthorizer:
  # Housing-Production or Housing-Staging
  Type: AWS::Lambda::Permission
  Properties:
    FunctionName:
      Fn::GetAtt:
        - ApiAuthVerifyTokenNewLambdaFunction
        - Arn
    Action: lambda:InvokeFunction
    Principal: apigateway.amazonaws.com
    SourceArn:
      Fn::Join:
        - ""
        - - "arn:aws:execute-api:"
          - Ref: AWS::Region
          - ":"
          - ${self:custom.housingAccountIds.${self:provider.stage}}
          - ":*/authorizers/*"

Additionally, we need a policy for each account to assume the role when calling the GetRestApiAsync method in ApiGateway. This policy might be redundant if we remove the call to API Gateway in a later PR.

  • PolicyName: assumeRoleForGettingCredentialsApiAccount
    PolicyDocument:
    Version: "2012-10-17"
    Statement:
    - Effect: Allow
    Action:
    - "sts:AssumeRole"
    Resource:
    Fn::Join:
    - ""
    - - "arn:aws:iam::"
    - Ref: "AWS::AccountId"
    - ":role/LBH_Api_Gateway_Allow_GET"
    - PolicyName: assumeRoleForGettingCredentialsHousingAccount
    PolicyDocument:
    Version: "2012-10-17"
    Statement:
    - Effect: Allow
    Action:
    - "sts:AssumeRole"
    Resource:
    Fn::Join:
    - ""
    - - "arn:aws:iam::"
    - ${self:custom.housingAccountIds.${self:provider.stage}}
    - ":role/LBH_Api_Gateway_Allow_GET"
    Updating the ValidateToken Method
    When testing the ValidateToken method, I repeatedly encountered the following error IDX10503: Signature validation failed. Token does not have a kid. Keys tried: .... I'm unsure if this change was due to updating the JWT library or upgrading to DOTNET 8, but I could not find a workaround. Our JWT tokens don't contain a KID header, and there was no way to disable the check without turning off signature verification entirely.

For that reason, I've taken the approach to use the JWT library (https://www.nuget.org/packages/JWT). It enables us to verify the token in the same way, but without requiring the missing KID header.

Previous PR Description
Context: We need to update the auth-verify-token function to dotnet 8 due to a deadline where dotnet 6 lambda functions will no longer be deployable on AWS. We will need this function to be deployable in a short time-frame so that we can fix a timeout issue in the VerifyAccessUseCase _awsApiGateway.GetApiName that is suspected to be surpassing account-level usage limits.

This verify-token function is used by many of our APIs across multiple AWS accounts, so any errors could cause serious widespread disruption. For this reason it was decided to deploy the dotnet 8 version of this token verifier in parallel with the existing one, and phasing the old one out more carefully after testing with the new one in a limited way.

The main Files changed will be somewhat difficult to read through because the original ApiAuthVerifyToken and ApiAuthVerifyToken.Tests folders were copied into new folders ApiAuthVerifyTokenNew and ApiAuthVerifyTokenNew.Tests so that we can deploy the dotnet 8 version of this function in parallel and keep the original deployment and tests as-is.

After the dotnet 8 function is in production and well-tested, it's assumed that a task will be taken to remove it from this code base. The new solutions are parallel to the existing ones so that the old should be removable without affecting the new.

See a7619bd for the actual changes to get dotnet 8 and parallel deployment working

The dotnet 8 upgrade steps were mostly copied from an existing dotnet 8 upgrade PR that was before we decided to deploy in parallel: #65 - see this for further comments and explanations

Note: It is expected that the existing 3.1 deployment will fail because deploying 3.1 isn't supported anymore, but at least it shouldn't try to destroy the existing function. The dotnet 8 deployment is in parallel with this in the CircleCI workflow so should not be blocked

@adamtry adamtry merged commit 8ffe58d into development Mar 24, 2025
3 checks passed
@adamtry adamtry deleted the upgrade-to-dotnet8-parallel branch March 27, 2025 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants