-
Notifications
You must be signed in to change notification settings - Fork 513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BackendTLSPolicy's Service attachment is problematic #3554
Comments
Thanks for writing this down, @howardjohn, it does make sense. Do you have any suggestions here? How is DestinationRule different that it doesn't have these problems? Is it just that it is effectively at the equivalent of the HTTPRoute level? |
DestinationRule actually sort of has the same problem but has an escape hatch. You can provide the TLS config in the top level to make it apply to the service generally, which has the possibility of the same issue, but you can also scope it to a subset and then route directly to that: apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
name: originate-tls
spec:
host: foo
subsets:
- name: originate-tls
tls:
mode: SIMPLE
sni: foo.com
---
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
name: originate-tls
spec:
hosts:
- foo.com
http:
- route:
- destination:
host: foo
subset: originate-tls Note: "subset" is usually used to filter a service into a smaller set; here its used in a less common way where the set of endpoints is all of them, and we just customizing the configuration. One way you could actually do the same today is to make two services ( Another way would be to allow the BackendTLSPolicy to be scoped to a specific route in some way, rather than the Service directly. Like maybe: backendRefs:
filters: # "filter" name is awkward here...
- extensionRef:
kind: BackendTlsPolicy
name: foo-policy
name: foo but then we have the same policy be used as an extensionRef and targetRef which is probably not good. You could targetRef the HTTPRoute (and further scope with sectionName?), but its still not quite the same since you cannot say "attach it to this backend in this rule in this route" |
This is pretty close to what I was trying to achieve with a |
@howardjohn I'm not sure I understand the problem here.
Does the service mesh also add TLS to the request that targets a route with edge termination? If not, can't you use that same treatment to the backend target? If not, is it possible to reuse SectionName within the |
yes the mesh would have to use it since it's bound to the service - so anything calling the service should use it. There is nothing saying "X gateway should use it". Using an alternate port is a very rough UX for users to expose an additional port on their application solely to work around gateway API limits |
Isn't it already clear that we don't want to use BackendTLSPolicy for services in a mesh, since they are generally automatically configured for mTLS? Could we go a step further and say we don't serve egress from a service mesh via BackendTLSPolicy? Or once the traffic comes off the mesh, don't use the mTLS (or other tls config that was pertinent within the mesh)? |
No, the two have zero correlation. Whether any application uses HTTPS is a property of the application. It's rare that users modify their applications to service mesh (or they can never, as they are third party). It's extremely common to have application TLS over mesh TLS |
Coming back to this, I think the problem is here. The API semantic I thought we were implying here is "The app that's running inside the containers inside the Pods this service is referencing expects TLS connections to arrive at the process running inside the Pod". So whether or not the mesh does transparent encryption is not important, as the whole intent is allow the application owner to say "the thing that's running in my pods expects TLS on this port". It's not a request to ensure that the traffic is protected with TLS as a way for an application owner to inform the Gateway implementation that non-TLS requests won't succeed. |
In theory I think its a good a good semantic and aligns well with the personas. But that doesn't really solve the issues raised here -- that this cannot actually be properly implemented. For example, you might try to derive "Gateway is TLS PASSTHROUGH, so its already TLS, so I can ignore |
I agree that those problems are not solved by this, but I also don't really see how they are solvable at all tbh. I understand you have a solution with DestinationRule in Istio, so if you can come up with something that will work better here, I'd love to hear it. |
Conceptually the solution is simple - allow the client or chose to chose the TLS policy not (just) the server. @mikemorris is dealing with the same issues with retry budgets. How we want to expose that in the API is debatable - lots of plausible options. Istio does it by allowing you to specify it in the route backend ref (more or less) |
So, the idea here is that if you specify some TLS policy in the backendRef then it will take precedence? Practically, I don't see how that will work in cases where, for example, the app owner says "my app requires TLS on this port" and then the backendRef says something different. Seems like it will always need to be a negotiation or a last-common-denominator style calculation, which may mean that TLS connections can't proceed - like if there's some policy that says "only TLS 1.3", and the BackendTLSPolicy says "only use TLS 1.1". |
You probably also need to be able to declare the lack of a TLS policy. Like if I'm in a tlsroute doing passthrough that is probably what I want . In Istio how it works is the "BackendPolicy" type (destination rule) is really a combination of targeting a Service but also applying to specific clients. This has (like GAMMA routes) consumer-specific revisions. On top of these, each policy can have "subsets" (this is a terrible name, for the purpose here pretend they were named "variants") that can be routed to. So I can have a DR with variant "TLS" and "plaintext" (adding TLS vs not adding it),and then explicitly select a variant from a route. You don't necessarily need both of those to achieve the goal though |
It feels like using a TLSRoute with Passthrough should imply that routing to a Service that also has a BackendTLSPolicy can probably ignore that BackendTLSPolicy by default, since the BackendTLSPolicy is saying "connections here must be TLS", and so is the TLS Passthrough... I understand and can see the value of the variant config for more complex use cases, but I think that it's better to leave that to workarounds like separate Services or implementationSpecific things in upstream. |
FWIW this problem is going to come up with every single "backend" policy. And it seems 2025 is the year of backend policies - TLS, load balancing, retry budgets, and I am sure more to come. |
I think the phrasing here is getting mixed up again a bit, and you were more precise earlier @youngnick with:
I do agree that a BackendTLSPolicy is likely not applicable for Passthrough connections though, because in those cases the client (which may be external to the cluster and unaware of BackendTLSPolicy configuration) is responsible for configuring TLS settings as needed to reach the application rather than a Gateway. |
These comments were made during the initial design, but I wanted to persist them into an issue.
BackendTLSPolicy currently attaches to a Service.
In the simple case, this works fine. For example, I may have:
and I will terminate TLS and encrypt it on the upstream. This works great.
Where things fall apart is when we have multiple paths.
For example, imagine I have the above config, but then also have a service mesh. An application in the service mesh calls
curl https://foo
.The
foo
service has the TLS config saying "Add TLS to this request". The service mesh adds TLS and now it has 2 layers of TLS which is clearly broken.If the API semantic is supposed to be "make sure this request is TLS" not "Add TLS to the this request" (which I don't think it is, nor do I think it should be, just covering the bases here) that is infeasible to implement since it would require any implementation to know whether a request is already TLS encrypted or not.
This same issue can occur without mesh as well. For example, if in addition to my configs above I have:
then I have the same problem of TLS being added again. Again, we could say the impl should detect this is already TLS and it should not be added. However, this falls apart in other cases:
As it this is opaque TCP, we are not aware if it is TLS or not.
Overall, these issues prevent BackendTLSPolicy from being a viable replacement for DestinationRule for Istio
The text was updated successfully, but these errors were encountered: