Commit 82cb691

copy md files to static folder
1 parent 7b3f6a5 commit 82cb691

82 files changed

+17743
-4
lines changed

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
1+
# Managing Multi-Node Clusters with Embedded Cluster
2+
3+
This topic describes how to manage nodes in clusters created with Replicated Embedded Cluster, including how to add nodes and enable high availability for multi-node clusters.
4+
5+
## Limitations
6+
7+
Multi-node clusters with Embedded Cluster have the following limitations:
8+
9+
* Support for multi-node clusters with Embedded Cluster is Beta. Only single-node embedded clusters are Generally Available (GA).
10+
11+
* High availability for Embedded Cluster is an Alpha feature. This feature is subject to change, including breaking changes. For more information about this feature, reach out to Alex Parker at [[email protected]](mailto:[email protected]).
12+
13+
* The same Embedded Cluster data directory used at installation is used for all nodes joined to the cluster. This is either the default `/var/lib/embedded-cluster` directory or the directory set with the [`--data-dir`](/reference/embedded-cluster-install#flags) flag. You cannot choose a different data directory for Embedded Cluster when joining nodes.
14+
15+
* Do not join more than one controller node at the same time. When you join a controller node, a warning is printed explaining that you should not attempt to join another node until the controller node joins successfully.
16+
17+
## Add Nodes to a Cluster (Beta) {#add-nodes}
18+
19+
You can add nodes to create a multi-node cluster in online (internet-connected) and air-gapped (limited or no outbound internet access) environments. The Admin Console provides the join command that you use to join nodes to the cluster.
20+
21+
:::note
22+
Multi-node clusters are not highly available by default. For information about enabling high availability, see [Enable High Availability for Multi-Node Clusters (Alpha)](#ha) below.
23+
:::
24+
25+
To add nodes to a cluster:
26+
27+
1. (Optional) In the Embedded Cluster Config, configure the `roles` key to customize node roles. For more information, see [roles](/reference/embedded-config#roles) in _Embedded Cluster Config_. When you are done, create and promote a new release with the updated Config.
28+
29+
1. Do one of the following to get the join command from the Admin Console:
30+
31+
1. To add nodes during the application installation process, follow the steps in [Online Installation with Embedded Cluster](/enterprise/installing-embedded) or [Air Gap Installation with Embedded Cluster](/enterprise/installing-embedded-air-gap) to install. A **Nodes** screen is displayed as part of the installation flow in the Admin Console that allows you to choose a node role and copy the relevant join command.
32+
33+
1. Otherwise, if you have already installed the application:
34+
35+
1. Log in to the Admin Console.
36+
37+
1. If you promoted a new release that configures the `roles` key in the Embedded Cluster Config, update the instance to the new version. See [Performing Updates in Embedded Clusters](/enterprise/updating-embedded).
38+
39+
1. Go to **Cluster Management > Add node** at the top of the page.
40+
41+
<img alt="Add node page in the Admin Console" src="/images/admin-console-add-node.png" width="600px"/>
42+
43+
[View a larger version of this image](/images/admin-console-add-node.png)
44+
45+
1. Either on the Admin Console **Nodes** screen that is displayed during installation or in the **Add a Node** dialog, select one or more roles for the new node that you will join. Copy the join command.
46+
47+
Note the following:
48+
49+
* If the Embedded Cluster Config [roles](/reference/embedded-config#roles) key is not configured, all new nodes joined to the cluster are assigned the `controller` role by default. The `controller` role designates nodes that run the Kubernetes control plane. Controller nodes can also run other workloads, such as application or Replicated KOTS workloads.
50+
51+
* Roles are not updated or changed after a node is added. If you need to change a node’s role, reset the node and add it again with the new role.
52+
53+
* For multi-node clusters with high availability (HA), at least three `controller` nodes are required. You can assign both the `controller` role and one or more `custom` roles to the same node. For more information about creating HA clusters with Embedded Cluster, see [Enable High Availability for Multi-Node Clusters (Alpha)](#ha) below.
54+
55+
* To add non-controller or _worker_ nodes that do not run the Kubernetes control plane, select one or more `custom` roles for the node and deselect the `controller` role.
56+
57+
1. Do one of the following to make the Embedded Cluster installation assets available on the machine that you will join to the cluster:
58+
59+
* **For online (internet-connected) installations**: SSH onto the machine that you will join. Then, use the same commands that you ran during installation to download and untar the Embedded Cluster installation assets on the machine. See [Online Installation with Embedded Cluster](/enterprise/installing-embedded).
60+
61+
* **For air gap installations with limited or no outbound internet access**: On a machine that has internet access, download the Embedded Cluster installation assets (including the air gap bundle) using the same command that you ran during installation. See [Air Gap Installation with Embedded Cluster](/enterprise/installing-embedded-air-gap). Then, move the downloaded assets to the air-gapped machine that you will join, and untar.
62+
63+
:::important
64+
The Embedded Cluster installation assets on each node must all be the same version. If you use a different version than what is installed elsewhere in the cluster, the cluster will not be stable. To download a specific version of the Embedded Cluster assets, select a version in the **Embedded cluster install instructions** dialog.
65+
:::
66+
67+
1. On the machine that you will join to the cluster, run the join command that you copied from the Admin Console.
68+
69+
**Example:**
70+
71+
```bash
sudo ./APP_SLUG join 10.128.0.32:30000 TxXboDstBAamXaPdleSK7Lid
```
74+
**Air Gap Example:**
75+
76+
```bash
sudo ./APP_SLUG join --airgap-bundle APP_SLUG.airgap 10.128.0.32:30000 TxXboDstBAamXaPdleSK7Lid
```
79+
80+
1. In the Admin Console, either on the installation **Nodes** screen or on the **Cluster Management** page, verify that the node appears. Wait for the node's status to change to Ready.
81+
82+
1. Repeat these steps for each node you want to add.
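
The two join command forms shown in the steps above differ only in the `--airgap-bundle` flag. As a rough sketch, the following hypothetical helper (`build_join_cmd` is not part of Embedded Cluster) prints the command for review instead of running it. The address and token are the example values from this topic. In practice, copy the exact command from the Admin Console.

```bash
# Hypothetical helper: print the join command for review instead of running it.
# Only the `join` subcommand and the `--airgap-bundle` flag come from the examples above.
build_join_cmd() {
  # $1 = binary name, $2 = address:port, $3 = join token, $4 = optional air gap bundle
  if [ -n "${4:-}" ]; then
    printf 'sudo ./%s join --airgap-bundle %s %s %s\n' "$1" "$4" "$2" "$3"
  else
    printf 'sudo ./%s join %s %s\n' "$1" "$2" "$3"
  fi
}

# Online form, then air gap form.
build_join_cmd APP_SLUG 10.128.0.32:30000 TxXboDstBAamXaPdleSK7Lid
build_join_cmd APP_SLUG 10.128.0.32:30000 TxXboDstBAamXaPdleSK7Lid APP_SLUG.airgap
```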
83+
84+
## Enable High Availability for Multi-Node Clusters (Alpha) {#ha}
85+
86+
Multi-node clusters are not highly available by default. Without HA, the first node of the cluster holds important data for Kubernetes and KOTS, so the loss of this node would be catastrophic for the cluster. Enabling high availability (HA) requires that at least three controller nodes are present in the cluster. You can enable HA when joining the third controller node.
87+
88+
:::important
89+
High availability for Embedded Cluster is an Alpha feature. This feature is subject to change, including breaking changes. For more information about this feature, reach out to Alex Parker at [[email protected]](mailto:[email protected]).
90+
:::
91+
92+
### HA Architecture
93+
94+
The following diagram shows the architecture of an HA multi-node Embedded Cluster installation:
95+
96+
![Embedded Cluster multi-node architecture with high availability](/images/embedded-architecture-multi-node-ha.png)
97+
98+
[View a larger version of this image](/images/embedded-architecture-multi-node-ha.png)
99+
100+
As shown in the diagram above, in HA installations with Embedded Cluster:
101+
* A single replica of the Embedded Cluster Operator is deployed and runs on a controller node.
102+
* A single replica of the KOTS Admin Console is deployed and runs on a controller node.
103+
* Three replicas of rqlite are deployed in the kotsadm namespace. Rqlite is used by KOTS to store information such as support bundles, version history, application metadata, and other small amounts of data needed to manage the application.
104+
* For installations that include disaster recovery, the Velero pod is deployed on one node. The Velero Node Agent runs on each node in the cluster. The Node Agent is a Kubernetes DaemonSet that performs backup and restore tasks such as creating snapshots and transferring data during restores.
105+
* For air gap installations, two replicas of the air gap image registry are deployed.
106+
107+
Any Helm [`extensions`](/reference/embedded-config#extensions) that you include in the Embedded Cluster Config are installed in the cluster depending on the given chart and whether or not it is configured to be deployed with high availability.
108+
109+
For more information about the Embedded Cluster built-in extensions, see [Built-In Extensions](/vendor/embedded-overview#built-in-extensions) in _Embedded Cluster Overview_.
110+
111+
### Requirements
112+
113+
Enabling high availability has the following requirements:
114+
115+
* High availability is supported with Embedded Cluster 1.4.1 or later.
116+
117+
* High availability is supported only for clusters where at least three nodes with the `controller` role are present.
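
The version requirement above can be checked with a short comparison. This is a minimal sketch, assuming plain `MAJOR.MINOR.PATCH` version strings and a `sort` that supports `-V` (version sort, as in GNU coreutils). The `version_at_least` function name and the `1.8.0` value are illustrative only.

```bash
# Returns success when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example value only; replace 1.8.0 with the Embedded Cluster version you are running.
if version_at_least "1.8.0" "1.4.1"; then
  echo "version supports HA"
else
  echo "version too old for HA"
fi
```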
118+
119+
### Limitations
120+
121+
Enabling high availability has the following limitations:
122+
123+
* High availability for Embedded Cluster is an Alpha feature. This feature is subject to change, including breaking changes. For more information about this feature, reach out to Alex Parker at [[email protected]](mailto:[email protected]).
124+
125+
* The `--enable-ha` flag serves as a feature flag during the Alpha phase. In the future, the prompt about migrating to high availability will display automatically if the cluster is not yet HA and you are adding a third or later controller node.
126+
127+
* HA multi-node clusters use rqlite to store support bundles up to 100 MB in size. Bundles over 100 MB can cause rqlite to crash and restart.
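
One way to guard against the 100 MB limit above is to check the size of a bundle file before handing it to KOTS. The following is a sketch only: the `bundle_within_limit` function and the file name are hypothetical, and it assumes 100 MB means 100 * 1024 * 1024 bytes, which this topic does not specify.

```bash
# Assumption: 100 MB is read as 100 * 1024 * 1024 bytes.
MAX_BUNDLE_BYTES=$((100 * 1024 * 1024))

# Returns success when the file at $1 is no larger than the limit.
bundle_within_limit() {
  # tr strips the padding that some wc implementations add around the count.
  size=$(wc -c < "$1" | tr -d '[:space:]')
  [ "$size" -le "$MAX_BUNDLE_BYTES" ]
}

# Hypothetical file name, for illustration only.
bundle="support-bundle.tar.gz"
if [ -f "$bundle" ] && ! bundle_within_limit "$bundle"; then
  echo "warning: $bundle is larger than 100 MB" >&2
fi
```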
128+
129+
### Best Practices for High Availability
130+
131+
Consider the following best practices and recommendations for creating HA clusters:
132+
133+
* At least three _controller_ nodes that run the Kubernetes control plane are required for HA. This is because clusters use a quorum system, in which more than half the nodes must be up and reachable. In clusters with three controller nodes, the Kubernetes control plane can continue to operate if one node fails because a quorum can still be reached by the remaining two nodes. By default, with Embedded Cluster, all new nodes added to a cluster are controller nodes. For information about customizing the `controller` node role, see [roles](/reference/embedded-config#roles) in _Embedded Cluster Config_.
134+
135+
* Always use an odd number of controller nodes in HA clusters. Because a quorum requires more than half of the controller nodes, an even number adds no fault tolerance over the next-lower odd number: a cluster with four controller nodes tolerates one failure, the same as a cluster with three. Clusters with an odd number of controller nodes also avoid split-brain scenarios where the cluster runs as two independent groups of nodes, resulting in inconsistencies and conflicts.
136+
137+
* You can have any number of _worker_ nodes in HA clusters. Worker nodes do not run the Kubernetes control plane, but can run workloads such as application or Replicated KOTS workloads.
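
The quorum arithmetic behind these recommendations can be worked through in a few lines. This sketch uses the standard majority rule (more than half of the controller nodes). The function names are illustrative.

```bash
# Quorum is a strict majority of controller nodes: floor(n / 2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of controller nodes that can fail while a quorum is still reachable.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  echo "controllers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

With three controller nodes the quorum is two, so one node can fail. Four controller nodes raise the quorum to three and still tolerate only one failure, which is why odd counts are recommended.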
138+
139+
### Create a Multi-Node HA Cluster
140+
141+
To create a multi-node HA cluster:
142+
143+
1. Set up a cluster with at least two controller nodes. You can do an online (internet-connected) or air gap installation. For more information, see [Online Installation with Embedded Cluster](/enterprise/installing-embedded) or [Air Gap Installation with Embedded Cluster](/enterprise/installing-embedded-air-gap).
144+
145+
1. SSH onto a third node that you want to join to the cluster as a controller.
146+
147+
1. Run the join command provided in the Admin Console **Cluster Management** tab and pass the `--enable-ha` flag. For example:
148+
149+
```bash
sudo ./APP_SLUG join --enable-ha 10.128.0.80:30000 tI13KUWITdIerfdMcWTA4Hpf
```
152+
153+
1. After the third node joins the cluster, type `y` in response to the prompt asking if you want to enable high availability.
154+
155+
![high availability command line prompt](/images/embedded-cluster-ha-prompt.png)
156+
[View a larger version of this image](/images/embedded-cluster-ha-prompt.png)
157+
158+
1. Wait for the migration to complete.
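
The rule for when to pass `--enable-ha` in the steps above can be sketched as a small decision helper. The `ha_flag_for_join` function is hypothetical. It assumes the cluster is not yet HA and that you supply the current number of controller nodes yourself, for example by counting them on the **Cluster Management** page.

```bash
# Hypothetical helper: decide whether the next controller join should pass --enable-ha.
# $1 = number of controller nodes already in the cluster.
# Assumes the cluster is not yet HA; the joining node is then the third or later controller.
ha_flag_for_join() {
  if [ "$1" -ge 2 ]; then
    echo "--enable-ha"
  fi
}

existing_controllers=2
# Prints the command for review; the address and token are the example values from this topic.
echo "sudo ./APP_SLUG join $(ha_flag_for_join "$existing_controllers") 10.128.0.80:30000 tI13KUWITdIerfdMcWTA4Hpf"
```

With fewer than two existing controller nodes, the helper prints nothing and the join command runs without the flag.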
