-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DCJ-386] Register Azure Managed Resource Group with SAM #1697
base: develop
Are you sure you want to change the base?
[DCJ-386] Register Azure Managed Resource Group with SAM #1697
Conversation
91829b7
to
4f259e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable to me. Not sure how to debug the test failure, but once that is sorted out, 👍🏽
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work on this so far! Happy to discuss more, maybe in mobbing this afternoon?
|
||
public void registerManagedResourceGroup( | ||
BillingProfileRequestModel request, AuthenticatedUserRequest user) { | ||
String billingProfileId = request.getBillingAccountId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A BillingProfileRequestModel
only has a billingAccountId
supplied when a GCP billing profile is being created -- it's the ID of the GCP billing account to wrap. For Azure billing profile creation, it is unspecified. I suspect this is why your AzureIntegrationTests
are failing with this error:
https://scans.gradle.com/s/da5olvhqbaima/tests/overview?outcome=FAILED
java.lang.AssertionError: Error validating azure billing profile create. Got response: bio.terra.integration.DataRepoResponse@193cb0e9[response=bio.terra.integration.ObjectOrErrorResponse@5de6abf3[statusCode=500 INTERNAL_SERVER_ERROR,locationHeader=Optional.empty,errorObject=Optional[class ErrorModel { |
-- | --
| message: Message: Missing the required parameter 'billingProfileId' when calling createManagedResourceGroup(Async)
What this endpoint is expecting is the ID of the Sam spend-profile
resource created in this flight. I believe that it needs the Sam resource to already exist. Do you know where that Sam resource creation takes place in the flight?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CreateProfileAuthzIamStep
seems to ultimately call samResourceApi.createResourceV2(IamResourceType.SPEND_PROFILE.toString(), req);
. req
uses request.getId().toString()
as the profileId
. However, .getId()
is deprecated, so this is a bit confusing to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it's a bit confusing!
Here's the past PR where we deprecated this ID in the request, with more context on the history: #1370
The first step in the billing profile creation flight will generate a UUID for the sam-profile
and stores it in the request object if one doesn't already exist:
jade-data-repo/src/main/java/bio/terra/service/profile/flight/create/GetOrCreateProfileIdStep.java
Lines 18 to 21 in d8d53ce
UUID profileId = request.getId(); | |
if (profileId == null) { | |
request.setId(UUID.randomUUID()); | |
} |
But the ID is only supplied to Sam to create the resource in Sam later in the flight, which looks to happen in CreateProfileAuthzIamStep
.
@Override | ||
public StepResult undoStep(FlightContext flightContext) throws InterruptedException { | ||
// Registering the Managed Resource Group has no side effects to clean up | ||
return StepResult.getStepResultSuccess(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There actually is a side effect to clean up: you can call Sam to delete the MRG registration. Here's that endpoint:
https://sam.dsde-prod.broadinstitute.org/#/Azure/deleteManagedResourceGroup
But you could ask in #dsp-identiteam if deleting the Sam profile resource would also delete any MRG registration associated with that billing profile. If that's the case, then a no-op undoStep
here would be acceptable, because undoing the step which deletes the Sam profile resource would take care of this clean-up for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a larger problem with making thedeleteManagedResourceGroup
call, which is that it must be called before deleting the billing profile in Sam (according to Doug). This means that the cleanup to CreateProfileAuthzIamStep
will cause conflicts with Managed Resource Group undo step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a problem, but we do need to change where this step runs.
The createManagedResourceGroup
endpoint must be called after creating the billing profile, and deleteManagedResourceGroup
must be called before deleting the billing profile. So, this step must run after CreateProfileAuthzIamStep
.
Then when a flight is moving forward:
CreateProfileAuthzIamStep.doStep
creates the profileCreateProfileManagedResourceGroupStep.doStep
creates the association between profile and MRG coords
If a flight is rolled back, it walks backwards through the steps to undo:
CreateProfileManagedResourceGroupStep.undoStep
deletes the association between profile and MRG coordsCreateProfileAuthzIamStep.undoStep
deletes the profile
new ManagedResourceGroupCoordinates() | ||
.tenantId(request.getTenantId().toString()) | ||
.subscriptionId(request.getSubscriptionId().toString()) | ||
.managedResourceGroupName(request.getResourceGroupName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had left this remark in the ticket:
The format of the MRG coordinates that TDR billing profile creation expects from the caller may not match exactly the format expected by the Sam endpoint.
I was specifically wondering if TDR's BillingProfileRequestModel.resourceGroupName
is the expected value for managedResourceGroupName
. In TDR Azure architecture, the user creates a Resource Group in which to deploy their TDR managed app. The TDR managed app manages a new Managed Resource Group. I think TDR billing profile creation takes in the user-created Resource Group name, but the Sam endpoint takes in Managed Resource Group name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not really sure how to address this issue. The request
doesn't seem to contain the "real" Managed Resource Group name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Azure service has the following:
var properties = objectMapper.valueToTree(applicationDeployment.properties());
var managedResourceGroupId = properties.get(MANAGED_RESOURCE_GROUP_ID_KEY)
I'm not sure if this is safe to access from the running steps, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are on the right track. The request body may not contain the MRG name, but this is something TDR can obtain through interactions with the Azure SDK… and the snippet you found shows that there is precedence elsewhere in the code for doing just that.
(BTW -- where possible to link to snippets from the existing codebase, that would be much appreciated to make sure that we're looking at the same thing, and leave a trail of breadcrumbs for those perusing this PR in the future.)
The snippet you shared looks to come from AzureApplicationDeploymentService.newApplicationDeployment
which is called only in AzureApplicationDeploymentService.getOrRegisterApplicationDeployment
.
Line 88 in 3198570
return newApplicationDeployment(billingProfile); |
getOrRegisterApplicationDeployment
is presently only called when creating datasets and snapshots, as part of getting or creating their Azure storage accounts.
I may be missing historical reasoning, but it makes more sense to me to register an application deployment as part of creating an Azure billing profile, as a step within the flight that can also delete the application deployment registration in its undoStep
definition. Then the remainder of the flight can rely on having the MRG name available.
Any future attempts to create billing profiles from the same MRG would fail fast at billing profile creation rather than later on at dataset creation, which seems to be the desired effect:
Lines 75 to 81 in 3198570
throw new MismatchedBillingProfilesException( | |
"Cannot reuse existing application deployment " | |
+ applicationResource.getAzureApplicationDeploymentName() | |
+ " from profile " | |
+ applicationResource.getProfileId() | |
+ " with a different profile " | |
+ billingProfile.getId()); |
@Override | ||
public StepResult doStep(FlightContext flightContext) | ||
throws InterruptedException, RetryException { | ||
profileService.registerManagedResourceGroup(request, user); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had left this question in the ticket:
What is the Sam endpoint’s behavior when a billing profile already exists for the MRG coordinates?
Looking at the documented responses, it will return a 409: https://sam.dsde-prod.broadinstitute.org/#/Azure/createManagedResourceGroup
Flagging this as I'm not sure if this will end up proving problematic in the current setup for our AzureIntegrationTest
s. Let's revisit that after the next round of changes.
Quality Gate passedIssues Measures |
Addresses
https://broadworkbench.atlassian.net/browse/DCJ-386
Summary
This adds the step to register the Azure managed resource group in SAM stairway step with unit tests.