Allow markup projects to request x-content-source-authorization #431

Open
auniverseaway opened this issue Jan 18, 2025 · 15 comments

Comments

@auniverseaway
Member

As a markup provider, I would like Sidekick to tell helix-admin to pass the auth header so that I can lock down my source content.

Additional context

helix-admin can accept an x-content-source-authorization header. This header tells helix-admin to pass the auth token to the content source. As of today, Sidekick does not have a flag to tell Helix Admin to do this.

Criteria of acceptance

  1. Sidekick passes the x-content-source-authorization header to helix admin for select projects.

Solution proposal

{
  "project": "My DA Project",
  "contentSourceAuth": true
}

Not married to the name above, so whatever you feel makes sense, @rofe @dylandepass.
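
A minimal sketch of how a client such as Sidekick might act on this flag. It assumes the proposed flag name `contentSourceAuth`, and it assumes the header value is the token that helix-admin should forward to the content source; neither is confirmed behavior of Sidekick or helix-admin. `ProjectConfig` and `buildAdminHeaders` are hypothetical names used only for illustration.

```typescript
// Hypothetical project config shape; `contentSourceAuth` is the flag proposed above.
interface ProjectConfig {
  project: string;
  contentSourceAuth?: boolean;
}

// Build the headers for a request to helix-admin.
// Assumption: the extra header carries the token admin should pass to the content source.
function buildAdminHeaders(
  config: ProjectConfig,
  adminToken: string,
  sourceToken?: string,
): Record<string, string> {
  const headers: Record<string, string> = {
    authorization: `Bearer ${adminToken}`,
  };
  // Only opted-in projects send the content source header.
  if (config.contentSourceAuth === true && sourceToken !== undefined) {
    headers['x-content-source-authorization'] = `Bearer ${sourceToken}`;
  }
  return headers;
}
```

Projects that do not set the flag get the same headers as today, so the change would be opt-in per project.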

@dylandepass
Member

@auniverseaway Where are you proposing the token would come from?

@rofe
Contributor

rofe commented Jan 20, 2025

@dylandepass I think the token is stored in the site config. This flag is just to tell the admin service to send it along.

What I don't quite understand is why we need this flag... can't the admin service just always send the token if configured?

cc @tripodsan

@dylandepass
Member

Ok, I didn't realize we are using secrets for this. I was under the impression that this header was just passed to admin requests and not stored on our end.

@auniverseaway
Member Author

> What I don't quite understand is why we need this flag... can't the admin service just always send the token if configured?

I like this idea. I don't think there is currently a path to do this, but it would make more sense to me if we used config bus to turn this flag on. Then every tool that uses helix-admin does not have to be aware of this header.


More color on the original proposal...

There are two parts:

  1. Token
  2. Header

Token

The markup editor (AEM, DA, etc.) sends a bearer token directly to helix-admin when it requests preview and publish.

Sidekick currently does the same: it asks the user to log in (deferring identity to helix-admin, which then defers to the configured IDP) and then passes the token to helix-admin.

At this point, we have a token, but helix-admin requires the preview request to pass the header x-content-source-authorization which then signals it to send an authorization token to the source.

By adding the header to sidekick, helix-admin will pass the token to us and we can grant viewing the content for ingestion into helix.

Looking at the tests, it almost looks like helix-admin doesn't use the supplied user token and instead uses an agreed upon token. This is less ideal as we want to know the user requesting the preview, but we can work with it.

@dylandepass
Member

dylandepass commented Jan 21, 2025

Ok, that was how I thought things were working today. I agree that having admin automatically pass the x-content-source-authorization header to html2md would make the most sense (if the token is set up in the config service and the user has the correct role).

Given that, I think this issue should start on either helix-config-storage or helix-admin but I will defer to @tripodsan for his thoughts.

One question though, aren't these tokens in some cases (AEM) short lived and therefore don't make sense in long term storage?

@auniverseaway
Member Author

> One question though, aren't these tokens in some cases (AEM) short lived and therefore don't make sense in long term storage?

@dylandepass I would assume so. Our (DA's / Author Bus) preference would be to take the bearer token from the user that was given to helix-admin and round trip it back to the content provider. This solves two problems:

  1. No need to store a (short lived) token in a config.
  2. The content provider can know who is trying to preview the content and it gives us a path to support fine-grained preview permissions.

My hope here would be that we can set a simple flag in our config:

passPreviewUserAuth: true

Which would prompt helix-admin to send the bearer token back to the content provider.
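
A rough sketch of that round trip, assuming the flag is named `passPreviewUserAuth` as above and that helix-admin still has the caller's `Authorization` header in hand when it builds the content source request. `SiteConfig` and `contentSourceHeaders` are hypothetical names, not part of helix-admin.

```typescript
// Hypothetical site config; `passPreviewUserAuth` is the flag suggested above.
interface SiteConfig {
  passPreviewUserAuth?: boolean;
}

// Decide which headers go on the outgoing request to the content provider.
// The user's bearer token is passed through unchanged, and only when the flag is on.
function contentSourceHeaders(
  config: SiteConfig,
  incomingAuthorization: string | undefined,
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (
    config.passPreviewUserAuth === true
    && incomingAuthorization !== undefined
    && /^Bearer\s+\S+$/i.test(incomingAuthorization)
  ) {
    headers['authorization'] = incomingAuthorization;
  }
  return headers;
}
```

Nothing is stored in this sketch: the token lives only for the duration of the request that carried it.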

CC @andreituicu @tripodsan

@rofe
Contributor

rofe commented Jan 22, 2025

@auniverseaway thanks for the additional details.

I'd like to have @tripodsan's opinion on whether we can delegate this entirely to helix-admin. Somehow I feel that the sidekick shouldn't have to concern itself with that...

@auniverseaway
Member Author

auniverseaway commented Jan 23, 2025

> Somehow i feel that the sidekick shouldn't have to concern itself with that...

Agreed. Sending the header to helix admin is currently how XWalk is doing it, but I think for all parties, it would be better as a project config. With the current implementation, there are a lot of surfaces we would have to chase down to add this header.

Thinking about this more, the right place to put this is probably in the fstab...

mountpoints:
  /:
    url: https://content.da.live/adobecom/da-bacom/
    type: markup
    auth: bearer

folders:
  /app/: /apps/custom/shell
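
As an illustration of how a consumer could read that setting, here is a sketch that works on an already parsed fstab. The `auth: bearer` field is the one proposed above, not an existing fstab option, and the longest-prefix matching is an assumption about how mountpoints are resolved. `resolveMountpoint` and `needsBearerAuth` are hypothetical helpers.

```typescript
// Shape of one parsed mountpoint; `auth` is the proposed (hypothetical) field.
interface Mountpoint {
  url: string;
  type?: string;
  auth?: string;
}

type Mountpoints = Record<string, Mountpoint>;

// Pick the mountpoint with the longest prefix that matches the resource path.
function resolveMountpoint(mountpoints: Mountpoints, path: string): Mountpoint | undefined {
  let best: string | undefined;
  for (const prefix of Object.keys(mountpoints)) {
    if (path.startsWith(prefix) && (best === undefined || prefix.length > best.length)) {
      best = prefix;
    }
  }
  return best === undefined ? undefined : mountpoints[best];
}

// True when the content source behind this path asks for a bearer token.
function needsBearerAuth(mountpoints: Mountpoints, path: string): boolean {
  const mountpoint = resolveMountpoint(mountpoints, path);
  return mountpoint !== undefined && mountpoint.auth === 'bearer';
}
```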

@dylandepass
Member

Since overlays will also need this, and overlays are not supported with fstab, it makes sense to place it in the config service. I would also say that we need to do it in a way that allows different tokens to be used for the primary content source and the overlay.
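
One possible shape for that, sketched under the assumption that the config keeps separate entries for the primary source and the overlay, each with its own optional auth. Every name here (`SiteContentConfig`, `source`, `overlay`, `auth`, `authorizationFor`) is hypothetical and not the actual config service schema.

```typescript
// Hypothetical auth entry for a single content source.
interface SourceAuth {
  type: 'bearer';
  token: string;
}

interface ContentSource {
  url: string;
  auth?: SourceAuth;
}

// Hypothetical site content config with a primary source and an optional overlay.
interface SiteContentConfig {
  source: ContentSource;
  overlay?: ContentSource;
}

// Return the Authorization header value for the requested source, if it has one.
function authorizationFor(
  config: SiteContentConfig,
  which: 'source' | 'overlay',
): string | undefined {
  const target = which === 'source' ? config.source : config.overlay;
  if (target === undefined || target.auth === undefined) {
    return undefined;
  }
  return `Bearer ${target.auth.token}`;
}
```

Keeping the two entries independent means the overlay can use a different token, or none at all, without affecting the primary source.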

@tripodsan
Contributor

the problem is that the admin doesn't store the token received during the admin login process. we would need to set it as additional cookie. eg:

  1. request to admin/login
  2. idp roundtrip -> idp token (eg IMS)
  3. admin validates token, checks access
    3.1 admin generates own token, exp. 24h, hands it to sidekick + sets same-site cookie
    3.2 new: admin also sets lax cookie x-hlx-idp-token, set expiration to token expiration
  4. requests to admin will include the authorization header of the helix token, and also the x-hlx-idp-token cookie
  5. if configured, it will be passed to the html2md as authorization header (which then will pass it to the content source)

the biggest problem i see is the potential mismatch of token lifetime, i.e. the sidekick is still "logged in" while the idp token already expired. which leads to weird problems. one solution could be to downgrade the admin token lifetime to the one of the idp token.
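
The lifetime downgrade could be as small as the sketch below. It assumes the 24h admin token lifetime from step 3.1 and that the IdP token expiry is known as an epoch timestamp in milliseconds; `adminTokenExpiry` is a hypothetical helper, not existing admin code.

```typescript
// Default admin token lifetime from step 3.1 above (24 hours).
const ADMIN_TOKEN_TTL_MS = 24 * 60 * 60 * 1000;

// Expiry (epoch ms) for the admin token: the earlier of "24h from now"
// and the IdP token expiry, so the admin session never outlives the IdP token.
function adminTokenExpiry(nowMs: number, idpTokenExpiryMs: number): number {
  return Math.min(nowMs + ADMIN_TOKEN_TTL_MS, idpTokenExpiryMs);
}
```

The same value could be used for the cookie expiration in step 3.2, which would keep both in step.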

@auniverseaway
Member Author

> the problem is that the admin doesn't store the token received during the admin login process. we would need to set it as additional cookie.

That's what I was trying to avoid.

What I pictured:

  1. I log in to my IDP (IMS, MS, Google) from any surface (DA, Sidekick, AEM)
  2. Hours go by (but not 24 hours)
  3. I send a request to helix admin with my header of choice (Authorization, X-Auth-Token)
  4. Helix Admin receives that token and validates that it is good.
  5. Helix Admin stores (in memory?) the token for potential later use.
  6. When Helix Admin sends the request to the content provider, it uses the fstab config to see if the source needs the same auth token.
  7. If it does, it passes the in-memory token it just received to the content provider.

@tripodsan
Contributor

> 7. If it does, it passes the in-memory token it just received to the content provider.

but what happens after the IDP token is expired?

@tripodsan
Contributor

> 3. I send a request to helix admin with my header of choice (Authorization, X-Auth-Token)
> 4. Helix Admin receives that token and validates that it is good.

what is sending the request? curl? DA client? sidekick?

@andreituicu
Collaborator

Just want to start by saying that, as a principle, I like the idea of using user tokens when accessing different content providers, but I think that it is not really simple, especially if it needs to be generic to accommodate all content providers.

Does this work with xWalk when you click preview in the sidekick (not in AEM itself)? Does it send an IMS token?

Besides the problems that @tripodsan mentioned with the token expiration mismatch, there is also the problem of token scope mismatch. The IMS token that Helix has when logging into IMS does not have the same scopes as the DA token. So even if Helix were to give that token to the sidekick or put it in a cookie to use at preview time, it would not be enough for DA to assert the fine-grained permissions. Calls to IMS that need additional scopes, like the one for retrieving groups, would fail.

{"error":"invalid_scope","error_description":"Token does not have the read_organizations scope."}

For this to work, Helix would either need to keep its scopes in sync with the DA scopes and update them every time it is needed, or have a separate client with DA scopes.
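
To make the mismatch concrete, here is a small sketch of a scope check a content source could run before calling IMS. It assumes the token's scopes are available as a single string separated by commas or spaces; `missingScopes` is a hypothetical helper, and `read_organizations` is taken from the error above.

```typescript
// Return the required scopes that the token does not carry.
// Assumption: scopes arrive as one string, separated by commas and/or whitespace.
function missingScopes(tokenScope: string, required: string[]): string[] {
  const granted = new Set(
    tokenScope.split(/[\s,]+/).filter((scope) => scope.length > 0),
  );
  return required.filter((scope) => !granted.has(scope));
}
```

A non-empty result would mean the forwarded token cannot be used for the group lookup, regardless of whether it is still valid.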

I see a number of options how this can be solved, but unfortunately none that I can say work in every scenario and are generic for overlays and different content sources.

1. Every time you log in to DA, DA makes a call to Helix to "register" a token for the current user and content provider. (IIUC this is what @auniverseaway is proposing, so that's why I'm starting with it).

This means there would be a file or a number of files somewhere in Helix Storage (S3?) containing user tokens, which is centralised. Yes, it has the advantage that an operation triggered from anywhere (DA, Sidekick, curl, Helix Cron, etc.) would be able to make use of that token, but it also means that anyone who gets access to those files gets hold of a lot of user access tokens.

2. Every time you log in to DA, DA makes a call to the sidekick to "register" a token for the current user + project + content provider.

This makes it so the user token remains in the user's browser (so not centralised) and would work for preview requests sent by the sidekick, because the sidekick can include both the Helix Token and the DA token.

It needs careful protection and handling on the sidekick side, so in multitenant environments customer code can't do the same call to inject tokens (e.g. if there can be customer code on the da.live origin).

The problem is that Helix internally triggered operations (like a scheduled publishing) would not work.

3. Helix sends the IMS token with lower scope/Helix token. DA validates it and obtains the profile with an internal service token.

Helix can send a token that it has, either the IMS token with a lower scope or even the Helix token. This is enough for DA to validate it and extract the user_id claim from one of them, which will correspond to the IMS claim.

DA can have an internal more powerful service token that can obtain any user profile information (like groups) directly from IMS, based on that user id.

The problem is that I don't know how open IMS is to approving the usage of such a token, especially outside Adobe's own infra, because it needs special approval.

4. Sidekick knows to obtain directly from IMS a token with the scopes that DA needs by doing a login flow

5. Helix knows to obtain directly from IMS a token with scopes that DA needs by doing a login flow

I think there might be more patterns, but the point remains that a random IMS token may not be enough for each content source to do its job with it. It depends on where the content source sends it later and which scopes are in it.

And here we are just talking about 1 content source (DA), without trying to make it generic to work with all overlays.

Maybe to add another point: it is great that the content source can make fine-grained permission assertions, but Helix permissions are global. Even if a user is not able to read a page in DA, as long as they have access to the site they can still see the page as previewed by someone who did have those permissions, so there can be some weird situations like this.

I have been discussing these options with different people from the security team, but I think the best person to check with is Jose and he's on PTO for the moment. I'd love to get his view too before trying to start an implementation.

Sorry for the wall of text!

@auniverseaway
Member Author

> but what happens after the IDP token is expired?

If you accepted it, it's not expired. The one issue could be long running tasks. You get the token when the job is kicked off, but if it takes a long time... we have issues.
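
For the long running case, one option is to re-check the remaining lifetime right before each use of the token. A minimal sketch, assuming the token expiry is known as an epoch timestamp in milliseconds; `tokenStillUsable` and the 30 second safety margin are illustrative assumptions.

```typescript
// True when the token will still be valid for at least `marginMs` from now.
// Intended to be called right before each content source request in a long job.
function tokenStillUsable(expiresAtMs: number, nowMs: number, marginMs: number = 30000): boolean {
  return nowMs + marginMs < expiresAtMs;
}
```

A job that gets `false` back could stop early and report the expired token instead of failing partway through with an auth error from the content source.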

> what is sending the request? curl? DA client? sidekick?

Yes. All the above. If I request something of admin, I'm always sending a token... regardless of my client.
