Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(webhook): Safer defaults and more config for webhook URLs #4333

Merged
merged 6 commits into from
Sep 27, 2024

Conversation

jcavanagh
Copy link
Contributor

@jcavanagh jcavanagh commented Nov 11, 2022

fix(webhook): Safer defaults and more config for webhook URLs

Exclude by default: single-word hostnames (https://orca, https://spin-orca), the .spinnaker domain (a common k8s deployment namespace), common internal-name suffixes (.local, .internal, .localdomain), and all verbatim IP addresses.

Add new configuration to specify a list of exclusion patterns. This greatly simplifies configuration, as it is not easy to do complex filtering in a single allow expression.

Add new configuration to dynamically exclude domains based on the values of specified environment variables. For example, this can always exclude the k8s namespace Spinnaker is currently running in, long as there is some variable set that specifies what that is. POD_NAMESPACE is commonly set by providers, and is included by default along with ISTIO_META_MESH_ID, as names in that domain are also resolvable.

Also allows localhost in all cases if the rejectLocalhost flag is false, disregarding the name filter. This avoids the need to change the name filter to include all forms of local names while developing.

Copy link
Contributor Author

@jcavanagh jcavanagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't expect any breakage from these default filters, since it would be pretty unusual to be using a name or address within these patterns.

Feedback on additional default exclusions is welcome!


public UserConfiguredUrlRestrictions(
protected UserConfiguredUrlRestrictions(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The visibility of this constructor was reduced to mandate the use of the Builder

@@ -144,37 +237,42 @@ public URI validateURI(String url) throws IllegalArgumentException {
"Allowed Hostnames are not set, external HTTP requests are not enabled. Please configure 'user-configured-url-restrictions.allowedHostnamesRegex' in your orca config.");
}

if (!allowedHostnames.matcher(host).matches()) {
throw new IllegalArgumentException(
"Host not allowed " + host + ". Host much match " + allowedHostnames.toString() + ".");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of these messages were trimmed down, since there is no need to expose that configuration.

@@ -5,16 +5,21 @@ import spock.lang.Unroll

class UserConfiguredUrlRestrictionsSpec extends Specification {

// Don't try to actually resolve hosts, and control the result of determining whether a host is localhost or link local
def spyOn(UserConfiguredUrlRestrictions subject, isLocalhost = false, isLinkLocal = false) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the builder arguments in these tests were to dodge code paths that attempted to resolve names, so that has been mocked out and the non-relevant builder arguments removed from the below tests. Those code paths are now more exercised than before.

'https://192.168.0.1',
'http://172.16.0.1',
'http://10.0.0.1',
'https://fd12:3456:789a:1::1',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ipv6 standard is pretty weird, since the : is rather overloaded - in any case, the code within can now exclude them in both forms.

@jasonmcintosh
Copy link
Member

Note assuming merge... we'll want to call out this change in release notes, as it'd cause any hook to an internal call in another namespace to likely break. Can ALSO impact hooks to other private CIDRs like private VPCs on a TGW/peer... so LIKELY a breaking change for most.

@jcavanagh
Copy link
Contributor Author

as it'd cause any hook to an internal call in another namespace to likely break

Other k8s namespaces should work just fine, so long as they are not named spinnaker, local or internal.

Can ALSO impact hooks to other private CIDRs like private VPCs on a TGW/peer... so LIKELY a breaking change for most.

Unfortunately, specifying private IPs verbatim in the webhook bypasses domain filtering, and could allow resolution of internal Spinnaker services. Alternatively, we could just disallow any literal IP address in the webhook configuration.

Copy link
Member

@jasonmcintosh jasonmcintosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SO after further looking - this breaks the IP restriction where if a "PASS" host lands in a "REJECT" Ip range, the request would fail. This PR makes that no longer happen :(

@jasonmcintosh
Copy link
Member

@dbyron-sf or @kskewes-sf FYI on this one...

@jcavanagh jcavanagh force-pushed the upstream-webhook-mitigations branch 2 times, most recently from f1a47b1 to d0b44da Compare November 11, 2022 22:17
@jcavanagh
Copy link
Contributor Author

jcavanagh commented Nov 11, 2022

@jasonmcintosh I've made the following changes to the implementation:

  • Verbatim IP addresses are now rejected by default
  • IP filtering is again considered against the resolved address
  • No IP ranges are excluded by default

Trying to filter internal requests by IP is an unwinnable game of whack-a-mole. I'll give some thought to how we might recharacterize user-initiated requests in general, so that they can be rejected by internal APIs without any special configuration.

@dbyron-sf
Copy link
Contributor

In general I support this. I'm a little hesitant because I get the feeling it changes existing behavior. Could you @jcavanagh perhaps write the release note for it so we can see how it affects users? Are we releasing versions of spinnaker fast enough to do this in two stages -- one adds the knobs, but doesn't change behavior, and the second one changes the defaults?

// Useful for dynamically excluding all webhook requests to the current k8s namespace, for
// example
private List<String> excludedDomainsFromEnvironment =
List.of("POD_NAMESPACE", "ISTIO_META_MESH_ID");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reflex is to leave these defaults out...I guess they're no-ops for people outside of k8s / not using istio...but somebody somewhere is gonna have these set and we're gonna mess them up. If they do stay in, I'd love links to docs about them in comments.

@dbyron-sf
Copy link
Contributor

The description implies this, but I want to go ahead and write it out -- the idea here is that someone making calls into spinnaker itself in a webhook stage is probably doing something unsavory, and so let's forbid it by default, right?

@jcavanagh
Copy link
Contributor Author

The description implies this, but I want to go ahead and write it out -- the idea here is that someone making calls into spinnaker itself in a webhook stage is probably doing something unsavory, and so let's forbid it by default, right?

Yes, that's correct. This is not a complete fix, but does enhance the ability of operators to mitigate the most probable vector (Webhook stages) specifically. The main issues here are toning down the flexibility of k8s DNS, and disallowing exhaustive attacks within a generally-constrained IP space.

perhaps write the release note for it so we can see how it affects users? Are we releasing versions of spinnaker fast enough to do this in two stages -- one adds the knobs, but doesn't change behavior, and the second one changes the defaults?

I'm not opposed to that - we could move those env-var values to the release notes as recommendations instead.

Disallowing any IP address by default could certainly be a breaking change, but I don't think there's any real justification for such a configuration. If an operator wishes, the option can be disabled.

Exclude by default: single-word hostnames (`https://orca`, `https://spin-orca`), the `.spinnaker` domain (a common k8s deployment namespace), common internal-name suffixes (`.local`, `.internal`, `.localdomain`), and all verbatim IP addresses.

Add new configuration to specify a list of exclusion patterns.  This greatly simplifies configuration, as it is not easy to do complex filtering in a single allow expression.

Add new configuration to dynamically exclude domains based on the values of specified environment variables.  For example, this can always exclude the k8s namespace Spinnaker is currently running in, long as there is some variable set that specifies what that is.  `POD_NAMESPACE` is commonly set by providers, and is included by default along with `ISTIO_META_MESH_ID`, as names in that domain are also resolvable.

Also allows `localhost` in all cases if the `rejectLocalhost` flag is `false`, disregarding the name filter.  This avoids the need to change the name filter to include all forms of local names while developing.
@jasonmcintosh
Copy link
Member

ORIGINALLY the plan was "ok let's break this and make it a DENY ALL by default, then REQUIRE end users to whitelist any supported webhook targets". AKA you'd want to manually add either a list or a regex. BUT make it secure default vs. not.

The current implemetnation allows calls to even internal API's which it really really shouldn't as this lets you bypass some restrictions. Link-local blocks help a TON (aka blocking ec2 metadata endpoints), but... ideally should block cluster addresses as well for these, and only allow cluster-external calls.

@jcavanagh
Copy link
Contributor Author

ORIGINALLY the plan was "ok let's break this and make it a DENY ALL by default, then REQUIRE end users to whitelist any supported webhook targets". AKA you'd want to manually add either a list or a regex. BUT make it secure default vs. not.

When is "originally"? This change was never a deny-all implementation, nor is that something that I think operators would want. I might be missing some context here.

The current implemetnation allows calls to even internal API's which it really really shouldn't as this lets you bypass some restrictions. Link-local blocks help a TON (aka blocking ec2 metadata endpoints), but... ideally should block cluster addresses as well for these, and only allow cluster-external calls.

I think you may have missed the most recent changes - see my other comment here.

@jasonmcintosh
Copy link
Member

#4214 was a PR to discuss changing the IP/DNS restrictions. However, the "Break it and require overrides to be set" was backed out to prevent a breaking change. Via commit: 4e25319#diff-6d7eea546c2be1a74db8daa4f32860c679584686c8cc7cd1c3f6fe6a199f8918R40

@dbyron-sf
Copy link
Contributor

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Dec 13, 2022

update

✅ Branch has been successfully updated

Copy link
Member

@jasonmcintosh jasonmcintosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SO conversing with the various teams, I don't see rejectVerbatimIps being remotely useful since you can use various domains to get any IP you want. E.g. http://1u.ms/ kinda stuff. That said, will go ahead let this go through to get at least a little better.

@dbyron-sf dbyron-sf added the ready to merge Approved and ready for merge label Sep 27, 2024
@mergify mergify bot added the auto merged Merged automatically by a bot label Sep 27, 2024
@dbyron-sf
Copy link
Contributor

@Mergifyio update

Copy link
Contributor

mergify bot commented Sep 27, 2024

update

☑️ Nothing to do

  • #commits-behind > 0 [📌 update requirement]
  • -closed [📌 update requirement]
  • -conflict [📌 update requirement]
  • queue-position = -1 [📌 update requirement]

@mergify mergify bot merged commit 3e98d0a into spinnaker:master Sep 27, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto merged Merged automatically by a bot ready to merge Approved and ready for merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants