In the agent, we have Node and Workload Attestors.
Today we conflate two different things: the current TrustBundle used for the TrustDomain, and Server Attestation.
This creates a problem for sysadmins.
Sysadmins generally don't want to maintain CAs. One of the great features of SPIRE over something like ACME is that it manages very short-lived CAs in addition to short-lived certificates. So it's possible to argue to sysadmins that they don't have to worry about managing CAs, because SPIRE does it for them. The CAs are just an implementation detail.
Machines go down unexpectedly. If they are to recover without a lot of painful effort from a sysadmin, today they must come back up within the validity window of the TrustBundle they held when they went down. But conflating the lifecycle of the spire-server's certs and Server Attestation with the TrustDomain's CAs means the sysadmin has a tough choice to make:
A. Make the TrustDomain's CA TTL long enough to allow automatic recovery of machines that are down for a while (maybe a TTL of 1 month or more?), and then worry about all the negative effects of having a longish-lived CA.
B. Live with the Sword of Damocles over their heads: a power outage or other issue at the wrong time or place could create a huge amount of work for them.
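The tradeoff between A and B comes down to a simple comparison of dates. Below is a minimal sketch of it, assuming the cached TrustBundle can be reduced to a single expiry time (a real bundle may hold several CAs with different lifetimes). The function name and the example TTLs are hypothetical and only illustrate the argument.

```python
from datetime import datetime, timedelta


def can_recover_automatically(bundle_not_after: datetime, back_online_at: datetime) -> bool:
    """True if the cached bundle is still valid when the machine comes back up."""
    return back_online_at <= bundle_not_after


went_down = datetime(2025, 1, 1)
back_up = went_down + timedelta(days=10)   # machine was down for 10 days

# Hypothetical remaining CA lifetimes at the moment the machine went down.
short_ca_expiry = went_down + timedelta(days=2)    # short-lived CA (option B)
long_ca_expiry = went_down + timedelta(days=30)    # ~1 month CA (option A)

print(can_recover_automatically(short_ca_expiry, back_up))
print(can_recover_automatically(long_ca_expiry, back_up))
```

With the short-lived CA the machine needs manual re-bootstrapping after the outage; with the long-lived CA it recovers on its own, at the cost of a longer-lived CA for the whole TrustDomain.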
But it doesn't have to be this way. If we had a new type of Attestor, ServerAttestor, we could decouple the Server Attestation process from the Workload's trust, allowing longer-lived CAs for Server Attestation while keeping very short-lived CAs for workloads.
This would require two changes to the SPIRE Agent:
1. Add a mechanism for the agent to bootstrap again if it connects to the spire-server and the server's x509 cert no longer matches the cached trust bundle.
2. Add some kind of pluggable mechanism to get a trust bundle for the spire-server, which is used to attest/reattest and fetch the current workload trust bundle.
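To make the two changes concrete, here is a rough model of the flow in Python. It is a sketch under heavy assumptions: certificates and bundles are reduced to sets of CA identifier strings, and `ServerAttestor`, `fetch_server_bundle`, `StaticServerAttestor`, and `Agent` are hypothetical names, not the interface proposed in spiffe/spire-plugin-sdk#58 or anything in the SPIRE codebase.

```python
from dataclasses import dataclass
from typing import Protocol


class ServerAttestor(Protocol):
    """Hypothetical plugin interface for Server Attestation."""

    def fetch_server_bundle(self) -> frozenset[str]:
        """Return the CA identifiers trusted for verifying the spire-server."""
        ...


@dataclass
class StaticServerAttestor:
    """Toy attestor that returns a fixed, longer-lived bundle."""
    bundle: frozenset[str]

    def fetch_server_bundle(self) -> frozenset[str]:
        return self.bundle


@dataclass
class Agent:
    cached_bundle: frozenset[str]   # workload trust bundle cached by the agent
    attestor: ServerAttestor
    rebootstrapped: bool = False

    def connect(self, server_cert_issuer: str, current_workload_bundle: frozenset[str]) -> bool:
        # Normal path: the server's cert chains to the cached trust bundle.
        if server_cert_issuer in self.cached_bundle:
            return True
        # Change 1: the cached bundle no longer matches, so bootstrap again.
        # Change 2: ask the pluggable attestor for a bundle to verify the server.
        if server_cert_issuer not in self.attestor.fetch_server_bundle():
            return False  # server could not be attested; keep the old cache
        # Server attested: adopt the current workload trust bundle.
        self.cached_bundle = current_workload_bundle
        self.rebootstrapped = True
        return True


agent = Agent(frozenset({"old-workload-ca"}), StaticServerAttestor(frozenset({"server-ca"})))
print(agent.connect("server-ca", frozenset({"new-workload-ca"})))
```

The point of the split is that the long-lived `server-ca` is consulted only on the re-bootstrap path, while workloads keep trusting only the short-lived bundle.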
For the user, it may make sense to create a new plugin type, say ServerAttestor, for Server Attestation.
An initial exploratory stab at that was made here: spiffe/spire-plugin-sdk#58 and some discussion in this issue: #5881
Externalizing the attestor via plugins would allow multiple ways of attesting, and more advanced policy-based plugins to be written, while keeping this policy out of the main spire-server so it can be easily extended by the end user.
There are levels of safety in the SPIRE system when checking the spire-server's cert:
- server cert in the established trust bundle: very, very unlikely that there is a security issue.
- bootstrapping: happens very infrequently, and often with a sysadmin involved in the process, so someone is actively looking for funny business.
- rebootstrapping: can happen at any time, and is harder to protect against badness. It is up to each organization to decide the tradeoff between system unavailability and the risk of recovering from a compromised server.
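As an illustration of how a policy-based plugin might treat these three levels differently, here is a small sketch. The enum, the `allow` function, and both flags are hypothetical; the actual policy would be whatever each organization decides.

```python
from enum import Enum


class Check(Enum):
    ESTABLISHED_BUNDLE = "server cert verified by the cached trust bundle"
    BOOTSTRAP = "initial bootstrap"
    REBOOTSTRAP = "re-bootstrap after the cached bundle stopped matching"


def allow(check: Check, *, allow_rebootstrap: bool, operator_present: bool = False) -> bool:
    """Decide whether to proceed, given how the server's cert was checked."""
    if check is Check.ESTABLISHED_BUNDLE:
        return True  # safest case: the normal steady-state path
    if check is Check.BOOTSTRAP:
        return True  # rare, and usually supervised by a sysadmin
    # Re-bootstrap: the organization trades availability against risk.
    return allow_rebootstrap or operator_present


# An availability-first organization permits unattended re-bootstrap.
print(allow(Check.REBOOTSTRAP, allow_rebootstrap=True))
# A risk-averse organization requires an operator to be involved.
print(allow(Check.REBOOTSTRAP, allow_rebootstrap=False))
```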
As an alternative to plugins, the mechanism could be outsourced via the existing trust_bundle_url mechanism, if unix socket support were added and some additional metadata passed. That option is explored a bit here: #5932
This approach has some drawbacks:
- It would have a separate lifecycle from the rest of the system. Plugins fork off with the agent, so they are a bit easier to manage.
- We have plugins for everything else. It's odd to use a completely different mechanism for this type of extension, and this will lead to user confusion.
- Its configuration would be completely different and not stored in the spire-agent config like everything else.
- For policy-based plugins, a formal plugin API for ServerAttestors would let a plugin call out to other ServerAttestor plugins to share code. The URL mechanism does not offer that.
A first stab at change 1 (having the agent bootstrap again) is being done here: #5892