Future Challenges
The goal of building a globally distributed Kubernetes cluster presents us with a great many challenges. Not all of them will be immediately pertinent to the stage of the project we are actively focused on. This page serves as a brainstorming ground for potential problems we may want to think about now, but won't face until the mid-to-distant future.
The default ansible playbook deployment includes several performance monitoring and log collection add-ons which run in the kube-system namespace. While these services give good insight into how to operate such add-ons throughout the cluster, they are not configured in a manner particularly suited to our purposes. This section describes a set of services to be stood up in their place.
So long as the docker engine has the json-file logging driver enabled, kubelet will automatically create symlinks for each container's JSON log on the host under /var/log/containers. This puts all container logs under a single directory where they can be easily collected.
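To illustrate what a collector sees on disk, here is a minimal Python sketch that walks those symlinks and parses each line; the `log`, `stream`, and `time` fields follow the json-file driver's record format, and everything else here is illustrative only.

```python
import glob
import json

# Each file under /var/log/containers is a symlink to a container's json-file
# log; every line is a JSON object with "log", "stream", and "time" fields.
for path in glob.glob("/var/log/containers/*.log"):
    with open(path) as f:
        for line in f:
            try:
                record = json.loads(line)
            except ValueError:
                continue  # skip partially written or rotated lines
            print(path, record.get("time"), record.get("stream"),
                  record.get("log", "").rstrip())
```

A fluentd DaemonSet would do effectively the same thing with its tail input; the sketch is only meant to show the shape of the data being shipped.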
The included addons use fluentd to collect these logs, which is a good option. The cluster should run a fluentd DaemonSet whose only role is to collect container logs and ship them to a fluentd aggregator service.
The fluentd aggregator service should receive shipped log data and in turn forward it to an elasticsearch instance to be persisted. For simplicity's sake, fluentd should ship to elasticsearch using the logstash format.
On the elasticsearch instance, a daily curator job should be run to clean up old indexes, and to create per-project filtered aliases against each new or existing index to provide projects access only to the documents they will need.
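A rough sketch of what that daily job might look like using the elasticsearch-py client is below; the retention window, the project list, and the `kubernetes.namespace_name` field used for filtering are all assumptions rather than settled decisions.

```python
from datetime import datetime, timedelta

from elasticsearch import Elasticsearch

RETENTION_DAYS = 30                                # assumed retention window
PROJECTS = ["project-a", "project-b"]              # hypothetical project list
es = Elasticsearch(["http://elasticsearch:9200"])  # assumed service address

cutoff = datetime.utcnow() - timedelta(days=RETENTION_DAYS)
for index in es.indices.get(index="logstash-*"):
    # logstash-style index names embed the date, e.g. logstash-2016.09.01
    day = datetime.strptime(index.split("-", 1)[1], "%Y.%m.%d")
    if day < cutoff:
        es.indices.delete(index=index)
        continue
    # Give each project a filtered alias scoped to its own namespace's documents.
    for project in PROJECTS:
        es.indices.put_alias(
            index=index,
            name="{0}-{1}".format(project, index),
            body={"filter": {"term": {"kubernetes.namespace_name": project}}},
        )
```

Curator itself could handle the index cleanup step; the sketch just makes the per-project alias step concrete.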
Log data may be consumed by users through a Grafana interface: each filtered alias can be provided as a data source to an organization, whose members can then freely create and destroy dashboards and analyses against those sources.
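Provisioning those data sources could be scripted against Grafana's HTTP API. The sketch below is only illustrative: the service URL, credentials, organization id, alias name, and the use of the X-Grafana-Org-Id header to target an organization are all assumptions about how we would wire this up.

```python
import requests

GRAFANA = "http://grafana:3000"            # assumed Grafana service URL
AUTH = ("admin", "admin")                  # hypothetical admin credentials
ORG_ID = 2                                 # hypothetical per-project organization
ALIAS = "project-a-logstash-2016.09.01"    # a filtered alias from the curator job

# Register the filtered alias as an elasticsearch data source for the project's org.
resp = requests.post(
    "{0}/api/datasources".format(GRAFANA),
    auth=AUTH,
    headers={"X-Grafana-Org-Id": str(ORG_ID)},  # assumed way to scope to the org
    json={
        "name": ALIAS,
        "type": "elasticsearch",
        "url": "http://elasticsearch:9200",
        "access": "proxy",
        "database": ALIAS,
        "jsonData": {"timeField": "@timestamp"},
    },
)
resp.raise_for_status()
```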
Per-node metrics can be provided by running a privileged netdata container as a DaemonSet. These containers can be configured to ship metrics to the fluentd aggregator service for eventual persistence in an elasticsearch index (ideally in logstash-style named indexes as well). These metrics would then be viewable via dashboards in Grafana.
Per-application metrics can be provided by shipping directly from the application containers to the fluentd aggregator service for eventual persistence in an elasticsearch index. Some thought still needs to be given to whether each application would be given its own index, or whether all application metrics should be stored in the same index, with per-project access granted by way of filtered aliases. Also, should it be undesirable to give application containers the ability to connect directly to the fluentd shipper, a separate fluentd instance or something like a statsd service could be set up as an intermediary. These metrics could be added to the per-project organization data sources in Grafana.
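For the direct-shipping option, a minimal sketch of what an application container might do with the fluent-logger Python library is below; the aggregator hostname, port, tag, and metric fields are all hypothetical.

```python
from fluent import sender

# Assumed aggregator service name, port, and tag prefix; adjust per deployment.
metrics = sender.FluentSender("app.metrics", host="fluentd-aggregator", port=24224)

# Emit a small metrics document; the aggregator would forward it on to
# elasticsearch alongside the container log and node metric streams.
metrics.emit("requests", {"handled": 123, "errors": 4, "latency_ms": 37.5})
metrics.close()
```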
We ultimately want an overlay network that supports both network policy and node-to-node TLS. As of September 2016, three such backends support network policy:
- Romana
- Calico
- Canal
Amongst these three:
- Romana DOES NOT APPEAR to support TLS
- Calico DOES NOT support TLS
- Canal is a special snowflake; it is not a unique overlay solution itself, but simply a deployment pattern for using the network policy capability of Calico with the transport capability of flannel. This should provide inter-node TLS and policy support, but leaves us to figure out the question of etcd communication.
A known backend that DOES support TLS (and is the default backend configured by the kubernetes contrib ansible playbooks) is Flannel, though it DOES NOT support network policy.
In an eventual cluster where compute resources can be donated from anywhere, we will inevitably need to address the problem of ensuring nodes can join the cluster from behind NAT gateways. Some deeper research into the Kubernetes architecture is needed to identify the potential problem points and possible solutions.
One of the forward-looking challenges with the leading networked storage solutions (namely Ceph and NFS) is that they are designed only to solve the problems within their own problem domain, i.e., storage. Securing networked communication between the storage backends and their clients is not part of that problem domain.
- So far, the only thing I can think of is that we would need some sort of node -> node overlay network which handles encrypting packets.
- This would inevitably translate to poor throughput, which is potentially a very bad look for storage backends.