From dde020734ddaff962e8ec5db375a125ed3d7ca1d Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 21:39:48 +0000 Subject: [PATCH 01/25] Initial plan Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> From 9c17c6d89da87785a8215604f415bd57b69b41b4 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 21:47:07 +0000 Subject: [PATCH 02/25] Replace redis-rb-cluster examples with Valkey GLIDE Node.js examples Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 179 +++++++++++++++++++++---------------- 1 file changed, 102 insertions(+), 77 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index f8bc909ca..d487b2892 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -193,7 +193,7 @@ To create and use a Valkey Cluster, follow these steps: * [Create a Valkey Cluster](#create-a-valkey-cluster) * [Interact with the cluster](#interact-with-the-cluster) -* [Write an example app with redis-rb-cluster](#write-an-example-app-with-redis-rb-cluster) +* [Write an example app with Valkey GLIDE](#write-an-example-app-with-valkey-glide) * [Reshard the cluster](#reshard-the-cluster) * [A more interesting example application](#a-more-interesting-example-application) * [Test the failover](#test-the-failover) @@ -350,7 +350,7 @@ right node. The map is refreshed only when something changed in the cluster configuration, for example after a failover or after the system administrator changed the cluster layout by adding or removing nodes. -#### Write an example app with redis-rb-cluster +#### Write an example app with Valkey GLIDE Before going forward showing how to operate the Valkey Cluster, doing things like a failover, or a resharding, we need to create some example application @@ -363,91 +363,114 @@ world conditions. It is not very helpful to see what happens while nobody is writing to the cluster. This section explains some basic usage of -[redis-rb-cluster](https://github.com/antirez/redis-rb-cluster) showing two -examples. -The first is the following, and is the -[`example.rb`](https://github.com/antirez/redis-rb-cluster/blob/master/example.rb) -file inside the redis-rb-cluster distribution: - -```ruby -require './cluster' - -if ARGV.length != 2 - startup_nodes = [ - {:host => "127.0.0.1", :port => 7000}, - {:host => "127.0.0.1", :port => 7001} - ] -else - startup_nodes = [ - {:host => ARGV[0], :port => ARGV[1].to_i} - ] -end - -rc = RedisCluster.new(startup_nodes,32,:timeout => 0.1) - -last = false - -while not last - begin - last = rc.get("__last__") - last = 0 if !last - rescue => e - puts "error #{e.to_s}" - sleep 1 - end -end - -((last.to_i+1)..1000000000).each{|x| - begin - rc.set("foo#{x}",x) - puts rc.get("foo#{x}") - rc.set("__last__",x) - rescue => e - puts "error #{e.to_s}" - end - sleep 0.1 +[Valkey GLIDE for Node.js](https://github.com/valkey-io/valkey-glide/tree/main/node), the official +Valkey client library, showing a simple example application. + +The following example demonstrates how to connect to a Valkey cluster and perform +basic operations. First, install the Valkey GLIDE client: + +```bash +npm install @valkey/valkey-glide +``` + +Here's the example code: + +```javascript +import { GlideClusterClient, ConnectionError } from "@valkey/valkey-glide"; + +async function runExample() { + // Define startup nodes - you only need one reachable node + let startupNodes = [ + { host: "127.0.0.1", port: 7000 }, + { host: "127.0.0.1", port: 7001 } + ]; + + // Handle command line arguments + if (process.argv.length === 4) { + const [,, host, port] = process.argv; + startupNodes = [{ host, port: parseInt(port) }]; + } + + let client; + + try { + // Create cluster client + client = await GlideClusterClient.createClient({ + addresses: startupNodes, + clientConfiguration: { + requestTimeout: 100 + } + }); + + console.log("Connected to Valkey cluster"); + + // Get the last counter value, or start from 0 + let last = await client.get("__last__"); + last = last ? parseInt(last) : 0; + + console.log(`Starting from counter: ${last}`); + + // Write keys in sequence + for (let x = last + 1; x <= 1000000000; x++) { + try { + await client.set(`foo${x}`, x.toString()); + const value = await client.get(`foo${x}`); + console.log(value); + await client.set("__last__", x.toString()); + } catch (error) { + console.log(`Error: ${error.message}`); + } + + // Sleep for 100ms to make output readable + await new Promise(resolve => setTimeout(resolve, 100)); + } + } catch (error) { + if (error instanceof ConnectionError) { + console.log(`Connection error: ${error.message}`); + } else { + console.log(`Unexpected error: ${error.message}`); + } + } finally { + if (client) { + client.close(); + } + } } + +runExample().catch(console.error); ``` The application does a very simple thing, it sets keys in the form `foo` to `number`, one after the other. So if you run the program the result is the following stream of commands: * SET foo0 0 -* SET foo1 1 +* SET foo1 1 * SET foo2 2 * And so forth... -The program looks more complex than it should usually as it is designed to -show errors on the screen instead of exiting with an exception, so every -operation performed with the cluster is wrapped by `begin` `rescue` blocks. +The program includes comprehensive error handling to display errors instead of +crashing, so all cluster operations are wrapped in try-catch blocks. -The **line 14** is the first interesting line in the program. It creates the -Valkey Cluster object, using as argument a list of *startup nodes*, the maximum -number of connections this object is allowed to take against different nodes, -and finally the timeout after a given operation is considered to be failed. +The client creation on **line 18** is the first key part of the program. It creates the +Valkey cluster client using a list of *startup nodes* and configuration options +including a request timeout. The startup nodes don't need to be all the nodes of the cluster. The important -thing is that at least one node is reachable. Also note that redis-rb-cluster -updates this list of startup nodes as soon as it is able to connect with the -first node. You should expect such a behavior with any other serious client. - -Now that we have the Valkey Cluster object instance stored in the **rc** variable, -we are ready to use the object like if it was a normal Valkey object instance. +thing is that at least one node is reachable. Valkey GLIDE automatically +discovers the complete cluster topology once it connects to any node. -This is exactly what happens in **line 18 to 26**: when we restart the example -we don't want to start again with `foo0`, so we store the counter inside -Valkey itself. The code above is designed to read this counter, or if the -counter does not exist, to assign it the value of zero. +Now that we have the cluster client instance, we can use it like any other +Valkey client to perform operations across the cluster. -However note how it is a while loop, as we want to try again and again even -if the cluster is down and is returning errors. Normal applications don't need -to be so careful. +The code reads a counter from **line 27 to 29** so that when we restart the example +we don't start again with `foo0`, but continue from where we left off. +The counter is stored in Valkey itself using the key `__last__`. -**Lines between 28 and 37** start the main loop where the keys are set or -an error is displayed. +The main loop from **line 33 to 42** sets the keys sequentially and +displays either the value or any error that occurs. -Note the `sleep` call at the end of the loop. In your tests you can remove -the sleep if you want to write to the cluster as fast as possible (relatively +Note the `setTimeout` call at the end of the loop. In your tests you can remove +the sleep if you want to write to the cluster as fast as possible (though to the fact that this is a busy loop without real parallelism of course, so you'll get the usually 10k ops/second in the best of the conditions). @@ -457,7 +480,9 @@ easier to follow by humans. Starting the application produces the following output: ``` -ruby ./example.rb +node example.js +Connected to Valkey cluster +Starting from counter: 0 1 2 3 @@ -477,8 +502,8 @@ is running. #### Reshard the cluster Now we are ready to try a cluster resharding. To do this, please -keep the example.rb program running, so that you can see if there is some -impact on the program running. Also, you may want to comment the `sleep` +keep the example.js program running, so that you can see if there is some +impact on the program running. Also, you may want to comment the `setTimeout` call to have some more serious write load during resharding. Resharding basically means to move hash slots from a set of nodes to another @@ -564,8 +589,8 @@ From our point of view the cluster receiving the writes could just always write the key `foo` to `42` to every operation, and we would not notice at all. -So in the `redis-rb-cluster` repository, there is a more interesting application -that is called `consistency-test.rb`. It uses a set of counters, by default 1000, and sends `INCR` commands in order to increment the counters. +Now we can write a more interesting application for testing cluster behavior. +A simple consistency checking application that uses a set of counters, by default 1000, and sends `INCR` commands to increment the counters. However instead of just writing, the application does two additional things: @@ -578,11 +603,11 @@ a write that we did not receive acknowledgment for. In the first case we'll see a counter having a value that is smaller than the one we remember, while in the second case the value will be greater. -Running the consistency-test application produces a line of output every +Running a consistency testing application produces a line of output every second: ``` -$ ruby consistency-test.rb +node consistency-test.js 925 R (0 err) | 925 W (0 err) | 5030 R (0 err) | 5030 W (0 err) | 9261 R (0 err) | 9261 W (0 err) | From 450a79bf99d61841d4494fe10630dbcad7d30ad3 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 21:56:01 +0000 Subject: [PATCH 03/25] Update Valkey GLIDE client configuration to match official API Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index d487b2892..3c2132346 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -397,9 +397,8 @@ async function runExample() { // Create cluster client client = await GlideClusterClient.createClient({ addresses: startupNodes, - clientConfiguration: { - requestTimeout: 100 - } + requestTimeout: 500, // 500ms timeout + clientName: "valkey_cluster_example" }); console.log("Connected to Valkey cluster"); @@ -451,9 +450,9 @@ following stream of commands: The program includes comprehensive error handling to display errors instead of crashing, so all cluster operations are wrapped in try-catch blocks. -The client creation on **line 18** is the first key part of the program. It creates the +The client creation on **line 398** is the first key part of the program. It creates the Valkey cluster client using a list of *startup nodes* and configuration options -including a request timeout. +including a request timeout and client name. The startup nodes don't need to be all the nodes of the cluster. The important thing is that at least one node is reachable. Valkey GLIDE automatically @@ -462,11 +461,11 @@ discovers the complete cluster topology once it connects to any node. Now that we have the cluster client instance, we can use it like any other Valkey client to perform operations across the cluster. -The code reads a counter from **line 27 to 29** so that when we restart the example +The code reads a counter from **line 407 to 408** so that when we restart the example we don't start again with `foo0`, but continue from where we left off. The counter is stored in Valkey itself using the key `__last__`. -The main loop from **line 33 to 42** sets the keys sequentially and +The main loop from **line 413 to 425** sets the keys sequentially and displays either the value or any error that occurs. Note the `setTimeout` call at the end of the loop. In your tests you can remove From ede5dbf7694db8ea8f83c41a46f13d5d753cc5dd Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 22:03:36 +0000 Subject: [PATCH 04/25] Fix Valkey GLIDE Node.js example based on API feedback Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 87 +++++++++++++++++++------------------- 1 file changed, 43 insertions(+), 44 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 3c2132346..3c8abea3e 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -380,23 +380,17 @@ import { GlideClusterClient, ConnectionError } from "@valkey/valkey-glide"; async function runExample() { // Define startup nodes - you only need one reachable node - let startupNodes = [ + const addresses = [ { host: "127.0.0.1", port: 7000 }, { host: "127.0.0.1", port: 7001 } ]; - // Handle command line arguments - if (process.argv.length === 4) { - const [,, host, port] = process.argv; - startupNodes = [{ host, port: parseInt(port) }]; - } - let client; try { // Create cluster client client = await GlideClusterClient.createClient({ - addresses: startupNodes, + addresses: addresses, requestTimeout: 500, // 500ms timeout clientName: "valkey_cluster_example" }); @@ -409,19 +403,33 @@ async function runExample() { console.log(`Starting from counter: ${last}`); - // Write keys in sequence - for (let x = last + 1; x <= 1000000000; x++) { + // Write keys in batches using mset for better performance + const batchSize = 100; + for (let start = last + 1; start <= 1000000000; start += batchSize) { try { - await client.set(`foo${x}`, x.toString()); - const value = await client.get(`foo${x}`); - console.log(value); - await client.set("__last__", x.toString()); + const batch = {}; + const end = Math.min(start + batchSize - 1, 1000000000); + + // Prepare batch of key-value pairs + for (let x = start; x <= end; x++) { + batch[`foo${x}`] = x.toString(); + } + + // Execute batch mset + await client.mset(batch); + + // Update counter and display progress + await client.set("__last__", end.toString()); + console.log(`Batch completed: ${start} to ${end}`); + + // Verify a sample key from the batch + const sampleKey = `foo${start}`; + const value = await client.get(sampleKey); + console.log(`Sample verification - ${sampleKey}: ${value}`); + } catch (error) { - console.log(`Error: ${error.message}`); + console.log(`Error in batch starting at ${start}: ${error.message}`); } - - // Sleep for 100ms to make output readable - await new Promise(resolve => setTimeout(resolve, 100)); } } catch (error) { if (error instanceof ConnectionError) { @@ -439,22 +447,21 @@ async function runExample() { runExample().catch(console.error); ``` -The application does a very simple thing, it sets keys in the form `foo` to `number`, one after the other. So if you run the program the result is the -following stream of commands: +The application does a very simple thing, it sets keys in the form `foo` to `number`, using batched MSET operations for better performance. So if you run the program the result is batches of +MSET commands: -* SET foo0 0 -* SET foo1 1 -* SET foo2 2 +* MSET foo1 1 foo2 2 foo3 3 ... (batch of 100 keys) +* MSET foo101 101 foo102 102 ... (next batch) * And so forth... The program includes comprehensive error handling to display errors instead of crashing, so all cluster operations are wrapped in try-catch blocks. The client creation on **line 398** is the first key part of the program. It creates the -Valkey cluster client using a list of *startup nodes* and configuration options +Valkey cluster client using a list of cluster *addresses* and configuration options including a request timeout and client name. -The startup nodes don't need to be all the nodes of the cluster. The important +The addresses don't need to be all the nodes of the cluster. The important thing is that at least one node is reachable. Valkey GLIDE automatically discovers the complete cluster topology once it connects to any node. @@ -465,16 +472,12 @@ The code reads a counter from **line 407 to 408** so that when we restart the ex we don't start again with `foo0`, but continue from where we left off. The counter is stored in Valkey itself using the key `__last__`. -The main loop from **line 413 to 425** sets the keys sequentially and -displays either the value or any error that occurs. - -Note the `setTimeout` call at the end of the loop. In your tests you can remove -the sleep if you want to write to the cluster as fast as possible (though -to the fact that this is a busy loop without real parallelism of course, so +The main loop from **line 413 onwards** sets keys in batches using MSET operations +for better performance, processing 100 keys at a time and displaying progress or +any errors that occur. you'll get the usually 10k ops/second in the best of the conditions). -Normally writes are slowed down in order for the example application to be -easier to follow by humans. +you'll get optimal performance). Starting the application produces the following output: @@ -482,15 +485,12 @@ Starting the application produces the following output: node example.js Connected to Valkey cluster Starting from counter: 0 -1 -2 -3 -4 -5 -6 -7 -8 -9 +Batch completed: 1 to 100 +Sample verification - foo1: 1 +Batch completed: 101 to 200 +Sample verification - foo101: 101 +Batch completed: 201 to 300 +Sample verification - foo201: 201 ^C (I stopped the program here) ``` @@ -502,8 +502,7 @@ is running. Now we are ready to try a cluster resharding. To do this, please keep the example.js program running, so that you can see if there is some -impact on the program running. Also, you may want to comment the `setTimeout` -call to have some more serious write load during resharding. +impact on the program running. Resharding basically means to move hash slots from a set of nodes to another set of nodes. From 428e42533a33fb4b09ada2764988a85d18a7001c Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 22:08:41 +0000 Subject: [PATCH 05/25] Use ES6 shorthand property syntax for addresses parameter Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 3c8abea3e..d5984e693 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -390,7 +390,7 @@ async function runExample() { try { // Create cluster client client = await GlideClusterClient.createClient({ - addresses: addresses, + addresses, requestTimeout: 500, // 500ms timeout clientName: "valkey_cluster_example" }); From d290d0860a43c1ad33708b1488a6f284b806a18b Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 22:15:45 +0000 Subject: [PATCH 06/25] Replace line number references with descriptive section names in cluster tutorial Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index d5984e693..14594feee 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -457,7 +457,7 @@ MSET commands: The program includes comprehensive error handling to display errors instead of crashing, so all cluster operations are wrapped in try-catch blocks. -The client creation on **line 398** is the first key part of the program. It creates the +The **client creation section** is the first key part of the program. It creates the Valkey cluster client using a list of cluster *addresses* and configuration options including a request timeout and client name. @@ -468,11 +468,11 @@ discovers the complete cluster topology once it connects to any node. Now that we have the cluster client instance, we can use it like any other Valkey client to perform operations across the cluster. -The code reads a counter from **line 407 to 408** so that when we restart the example +The **counter initialization section** reads a counter so that when we restart the example we don't start again with `foo0`, but continue from where we left off. The counter is stored in Valkey itself using the key `__last__`. -The main loop from **line 413 onwards** sets keys in batches using MSET operations +The **main processing loop** sets keys in batches using MSET operations for better performance, processing 100 keys at a time and displaying progress or any errors that occur. you'll get the usually 10k ops/second in the best of the conditions). From 910843d09bd6fb372cede719eefcb30248ad1468 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 22:19:51 +0000 Subject: [PATCH 07/25] Validate and fix Valkey GLIDE Node.js API usage and text accuracy Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 37 +++++++++++++++++-------------------- 1 file changed, 17 insertions(+), 20 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 14594feee..ab0ee0400 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -376,10 +376,10 @@ npm install @valkey/valkey-glide Here's the example code: ```javascript -import { GlideClusterClient, ConnectionError } from "@valkey/valkey-glide"; +import { GlideClusterClient } from "@valkey/valkey-glide"; async function runExample() { - // Define startup nodes - you only need one reachable node + // Define cluster addresses - you only need one reachable node const addresses = [ { host: "127.0.0.1", port: 7000 }, { host: "127.0.0.1", port: 7001 } @@ -388,11 +388,13 @@ async function runExample() { let client; try { - // Create cluster client + // Create cluster client with configuration client = await GlideClusterClient.createClient({ - addresses, - requestTimeout: 500, // 500ms timeout - clientName: "valkey_cluster_example" + addresses: addresses, + clientConfiguration: { + requestTimeout: 500, // 500ms timeout + clientName: "valkey_cluster_example" + } }); console.log("Connected to Valkey cluster"); @@ -407,16 +409,16 @@ async function runExample() { const batchSize = 100; for (let start = last + 1; start <= 1000000000; start += batchSize) { try { - const batch = {}; + const keyValuePairs = []; const end = Math.min(start + batchSize - 1, 1000000000); - // Prepare batch of key-value pairs + // Prepare batch of key-value pairs as array for (let x = start; x <= end; x++) { - batch[`foo${x}`] = x.toString(); + keyValuePairs.push(`foo${x}`, x.toString()); } - // Execute batch mset - await client.mset(batch); + // Execute batch mset with array format + await client.mset(keyValuePairs); // Update counter and display progress await client.set("__last__", end.toString()); @@ -432,11 +434,7 @@ async function runExample() { } } } catch (error) { - if (error instanceof ConnectionError) { - console.log(`Connection error: ${error.message}`); - } else { - console.log(`Unexpected error: ${error.message}`); - } + console.log(`Connection error: ${error.message}`); } finally { if (client) { client.close(); @@ -447,11 +445,10 @@ async function runExample() { runExample().catch(console.error); ``` -The application does a very simple thing, it sets keys in the form `foo` to `number`, using batched MSET operations for better performance. So if you run the program the result is batches of -MSET commands: +The application does a very simple thing, it sets keys in the form `foo` to `number`, using batched MSET operations for better performance. The MSET command accepts an array of alternating keys and values. So if you run the program the result is batches of MSET commands: -* MSET foo1 1 foo2 2 foo3 3 ... (batch of 100 keys) -* MSET foo101 101 foo102 102 ... (next batch) +* MSET foo1 1 foo2 2 foo3 3 ... foo100 100 (batch of 100 keys) +* MSET foo101 101 foo102 102 ... foo200 200 (next batch) * And so forth... The program includes comprehensive error handling to display errors instead of From 83d43fd1bfa364666ea4e5e53a6149b6161e1393 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 22:29:59 +0000 Subject: [PATCH 08/25] Update Valkey GLIDE client configuration to match official API format Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- _test/cluster-tutorial-final.html | 1283 ++++++++++++++++++++++++++++ _test/cluster-tutorial.html | 1284 +++++++++++++++++++++++++++++ topics/cluster-tutorial.md | 31 +- wordlist | 2 +- 4 files changed, 2582 insertions(+), 18 deletions(-) create mode 100644 _test/cluster-tutorial-final.html create mode 100644 _test/cluster-tutorial.html diff --git a/_test/cluster-tutorial-final.html b/_test/cluster-tutorial-final.html new file mode 100644 index 000000000..98e95dad7 --- /dev/null +++ b/_test/cluster-tutorial-final.html @@ -0,0 +1,1283 @@ + + + + + + + + Cluster tutorial + + + +
+

Cluster tutorial

+
+

Valkey scales horizontally with a deployment topology called Valkey +Cluster. This topic will teach you how to set up, test, and operate +Valkey Cluster in production. You will learn about the availability and +consistency characteristics of Valkey Cluster from the end user’s point +of view.

+

If you plan to run a production Valkey Cluster deployment or want to +understand better how Valkey Cluster works internally, consult the Valkey Cluster specification.

+

Valkey Cluster 101

+

Valkey Cluster provides a way to run a Valkey installation where data +is automatically sharded across multiple Valkey nodes. Valkey Cluster +also provides some degree of availability during partitions—in practical +terms, the ability to continue operations when some nodes fail or are +unable to communicate. However, the cluster will become unavailable in +the event of larger failures (for example, when the majority of +primaries are unavailable).

+

So, with Valkey Cluster, you get the ability to:

+
    +
  • Automatically split your dataset among multiple nodes.
  • +
  • Continue operations when a subset of the nodes are experiencing +failures or are unable to communicate with the rest of the cluster.
  • +
+

Valkey Cluster TCP ports

+

Every Valkey Cluster node requires two open TCP connections: a Valkey +TCP port used to serve clients, e.g., 6379, and second port known as the +cluster bus port. By default, the cluster bus port is set by +adding 10000 to the data port (e.g., 16379); however, you can override +this in the cluster-port configuration.

+

Cluster bus is a node-to-node communication channel that uses a +binary protocol, which is more suited to exchanging information between +nodes due to little bandwidth and processing time. Nodes use the cluster +bus for failure detection, configuration updates, failover +authorization, and so forth. Clients should never try to communicate +with the cluster bus port, but rather use the Valkey command port. +However, make sure you open both ports in your firewall, otherwise +Valkey cluster nodes won’t be able to communicate.

+

For a Valkey Cluster to work properly you need, for each node:

+
    +
  1. The client communication port (usually 6379) used to communicate +with clients and be open to all the clients that need to reach the +cluster, plus all the other cluster nodes that use the client port for +key migrations.
  2. +
  3. The cluster bus port must be reachable from all the other cluster +nodes.
  4. +
+

If you don’t open both TCP ports, your cluster will not work as +expected.

+

Valkey Cluster and Docker

+

Currently, Valkey Cluster does not support NATted environments and in +general environments where IP addresses or TCP ports are remapped.

+

Docker uses a technique called port mapping: programs +running inside Docker containers may be exposed with a different port +compared to the one the program believes to be using. This is useful for +running multiple containers using the same ports, at the same time, in +the same server.

+

To make Docker compatible with Valkey Cluster, you need to use +Docker’s host networking mode. Please see the +--net=host option in the Docker +documentation for more information.

+

Valkey Cluster data sharding

+

Valkey Cluster does not use consistent hashing, but a different form +of sharding where every key is conceptually part of what we call a +hash slot.

+

There are 16384 hash slots in Valkey Cluster, and to compute the hash +slot for a given key, we simply take the CRC16 of the key modulo +16384.

+

Every node in a Valkey Cluster is responsible for a subset of the +hash slots, so, for example, you may have a cluster with 3 nodes, +where:

+
    +
  • Node A contains hash slots from 0 to 5500.
  • +
  • Node B contains hash slots from 5501 to 11000.
  • +
  • Node C contains hash slots from 11001 to 16383.
  • +
+

This makes it easy to add and remove cluster nodes. For example, if I +want to add a new node D, I need to move some hash slots from nodes A, +B, C to D. Similarly, if I want to remove node A from the cluster, I can +just move the hash slots served by A to B and C. Once node A is empty, I +can remove it from the cluster completely.

+

Moving hash slots from a node to another does not require stopping +any operations; therefore, adding and removing nodes, or changing the +percentage of hash slots held by a node, requires no downtime.

+

Valkey Cluster supports multiple key operations as long as all of the +keys involved in a single command execution (or whole transaction, or +Lua script execution) belong to the same hash slot. The user can force +multiple keys to be part of the same hash slot by using a feature called +hash tags.

+

Hash tags are documented in the Valkey Cluster specification, but the +gist is that if there is a substring between {} brackets in a key, only +what is inside the string is hashed. For example, the keys +user:{123}:profile and user:{123}:account are +guaranteed to be in the same hash slot because they share the same hash +tag. As a result, you can operate on these two keys in the same +multi-key operation.

+

Valkey Cluster +primary-replica model

+

To remain available when a subset of primary nodes are failing or are +not able to communicate with the majority of nodes, Valkey Cluster uses +a primary-replica model where every hash slot has from 1 (the primary +itself) to N replicas (N-1 additional replica nodes).

+

In our example cluster with nodes A, B, C, if node B fails the +cluster is not able to continue, since we no longer have a way to serve +hash slots in the range 5501-11000.

+

However, when the cluster is created (or at a later time), we add a +replica node to every primary, so that the final cluster is composed of +A, B, C that are primary nodes, and A1, B1, C1 that are replica nodes. +This way, the system can continue if node B fails.

+

Node B1 replicates B, and B fails, the cluster will promote node B1 +as the new primary and will continue to operate correctly.

+

However, note that if nodes B and B1 fail at the same time, Valkey +Cluster will not be able to continue to operate.

+

Valkey Cluster +consistency guarantees

+

Valkey Cluster does not guarantee strong +consistency. In practical terms this means that under certain +conditions it is possible that Valkey Cluster will lose writes that were +acknowledged by the system to the client.

+

The first reason why Valkey Cluster can lose writes is because it +uses asynchronous replication. This means that during writes the +following happens:

+
    +
  • Your client writes to the primary B.
  • +
  • The primary B replies OK to your client.
  • +
  • The primary B propagates the write to its replicas B1, B2 and +B3.
  • +
+

As you can see, B does not wait for an acknowledgement from B1, B2, +B3 before replying to the client, since this would be a prohibitive +latency penalty for Valkey, so if your client writes something, B +acknowledges the write, but crashes before being able to send the write +to its replicas, one of the replicas (that did not receive the write) +can be promoted to primary, losing the write forever.

+

This is very similar to what happens with most databases that are +configured to flush data to disk every second, so it is a scenario you +are already able to reason about because of past experiences with +traditional database systems not involving distributed systems. +Similarly you can improve consistency by forcing the database to flush +data to disk before replying to the client, but this usually results in +prohibitively low performance. That would be the equivalent of +synchronous replication in the case of Valkey Cluster.

+

Basically, there is a trade-off to be made between performance and +consistency.

+

Valkey Cluster has support for synchronous writes when absolutely +needed, implemented via the WAIT command. This makes losing +writes a lot less likely. However, note that Valkey Cluster does not +implement strong consistency even when synchronous replication is used: +it is always possible, under more complex failure scenarios, that a +replica that was not able to receive the write will be elected as +primary.

+

There is another notable scenario where Valkey Cluster will lose +writes, that happens during a network partition where a client is +isolated with a minority of instances including at least a primary.

+

Take as an example our 6 nodes cluster composed of A, B, C, A1, B1, +C1, with 3 primaries and 3 replicas. There is also a client, that we +will call Z1.

+

After a partition occurs, it is possible that in one side of the +partition we have A, C, A1, B1, C1, and in the other side we have B and +Z1.

+

Z1 is still able to write to B, which will accept its writes. If the +partition heals in a very short time, the cluster will continue +normally. However, if the partition lasts enough time for B1 to be +promoted to primary on the majority side of the partition, the writes +that Z1 has sent to B in the meantime will be lost.

+

Note: There is a maximum window to +the amount of writes Z1 will be able to send to B: if enough time has +elapsed for the majority side of the partition to elect a replica as +primary, every primary node in the minority side will have stopped +accepting writes.

+

This amount of time is a very important configuration directive of +Valkey Cluster, and is called the node timeout.

+

After node timeout has elapsed, a primary node is considered to be +failing, and can be replaced by one of its replicas. Similarly, after +node timeout has elapsed without a primary node to be able to sense the +majority of the other primary nodes, it enters an error state and stops +accepting writes.

+

Valkey Cluster +configuration parameters

+

We are about to create an example cluster deployment. Before we +continue, let’s introduce the configuration parameters that Valkey +Cluster introduces in the valkey.conf file.

+
    +
  • cluster-enabled <yes/no>: If +yes, enables Valkey Cluster support in a specific Valkey instance. +Otherwise the instance starts as a standalone instance as usual.
  • +
  • cluster-config-file <filename>: +Note that despite the name of this option, this is not a user editable +configuration file, but the file where a Valkey Cluster node +automatically persists the cluster configuration (the state, basically) +every time there is a change, in order to be able to re-read it at +startup. The file lists things like the other nodes in the cluster, +their state, persistent variables, and so forth. Often this file is +rewritten and flushed on disk as a result of some message +reception.
  • +
  • cluster-node-timeout +<milliseconds>: The maximum amount of time a +Valkey Cluster node can be unavailable, without it being considered as +failing. If a primary node is not reachable for more than the specified +amount of time, it will be failed over by its replicas. This parameter +controls other important things in Valkey Cluster. Notably, every node +that can’t reach the majority of primary nodes for the specified amount +of time, will stop accepting queries.
  • +
  • cluster-replica-validity-factor +<factor>: If set to zero, a replica will +always consider itself valid, and will therefore always try to failover +a primary, regardless of the amount of time the link between the primary +and the replica remained disconnected. If the value is positive, a +maximum disconnection time is calculated as the node timeout +value multiplied by the factor provided with this option, and if the +node is a replica, it will not try to start a failover if the primary +link was disconnected for more than the specified amount of time. For +example, if the node timeout is set to 5 seconds and the validity factor +is set to 10, a replica disconnected from the primary for more than 50 +seconds will not try to failover its primary. Note that any value +different than zero may result in Valkey Cluster being unavailable after +a primary failure if there is no replica that is able to failover it. In +that case the cluster will return to being available only when the +original primary rejoins the cluster.
  • +
  • cluster-migration-barrier +<count>: Minimum number of replicas a +primary will remain connected with, for another replica to migrate to a +primary which is no longer covered by any replica. See the appropriate +section about replica migration in this tutorial for more +information.
  • +
  • cluster-require-full-coverage +<yes/no>: If this is set to yes, as it is by +default, the cluster stops accepting writes if some percentage of the +key space is not covered by any node. If the option is set to no, the +cluster will still serve queries even if only requests about a subset of +keys can be processed.
  • +
  • cluster-allow-reads-when-down +<yes/no>: If this is set to no, as it is by +default, a node in a Valkey Cluster will stop serving all traffic when +the cluster is marked as failed, either when a node can’t reach a quorum +of primaries or when full coverage is not met. This prevents reading +potentially inconsistent data from a node that is unaware of changes in +the cluster. This option can be set to yes to allow reads from a node +during the fail state, which is useful for applications that want to +prioritize read availability but still want to prevent inconsistent +writes. It can also be used for when using Valkey Cluster with only one +or two shards, as it allows the nodes to continue serving writes when a +primary fails but automatic failover is impossible.
  • +
+

Create and use a Valkey +Cluster

+

To create and use a Valkey Cluster, follow these steps:

+ +

But, first, familiarize yourself with the requirements for creating a +cluster.

+

Requirements to create +a Valkey Cluster

+

To create a cluster, the first thing you need is to have a few empty +Valkey instances running in cluster mode.

+

At minimum, set the following directives in the +valkey.conf file:

+
port 7000
+cluster-enabled yes
+cluster-config-file nodes.conf
+cluster-node-timeout 5000
+appendonly yes
+

To enable cluster mode, set the cluster-enabled +directive to yes. Every instance also contains the path of +a file where the configuration for this node is stored, which by default +is nodes.conf. This file is never touched by humans; it is +simply generated at startup by the Valkey Cluster instances, and updated +every time it is needed.

+

Note that the minimal cluster that works as expected +must contain at least three primary nodes. For deployment, we strongly +recommend a six-node cluster, with three primaries and three +replicas.

+

You can test this locally by creating the following directories named +after the port number of the instance you’ll run inside any given +directory.

+

For example:

+
mkdir cluster-test
+cd cluster-test
+mkdir 7000 7001 7002 7003 7004 7005
+

Create a valkey.conf file inside each of the +directories, from 7000 to 7005. As a template for your configuration +file just use the small example above, but make sure to replace the port +number 7000 with the right port number according to the +directory name.

+

You can start each instance as follows, each running in a separate +terminal tab:

+
cd 7000
+valkey-server ./valkey.conf
+

You’ll see from the logs that every node assigns itself a new ID:

+
[82462] 26 Nov 11:56:55.329 * No cluster configuration found, I'm 97a3a64667477371c4479320d683e4c8db5858b1
+

This ID will be used forever by this specific instance in order for +the instance to have a unique name in the context of the cluster. Every +node remembers every other node using this IDs, and not by IP or port. +IP addresses and ports may change, but the unique node identifier will +never change for all the life of the node. We call this identifier +simply Node ID.

+

Create a Valkey Cluster

+

Now that we have a number of instances running, you need to create +your cluster by writing some meaningful configuration to the nodes.

+

You can configure and execute individual instances manually or use +the create-cluster script. Let’s go over how you do it manually.

+

To create the cluster, run:

+
valkey-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \
+127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
+--cluster-replicas 1
+

The command used here is create, since we want to +create a new cluster. The option --cluster-replicas 1 means +that we want a replica for every primary created.

+

The other arguments are the list of addresses of the instances I want +to use to create the new cluster.

+

valkey-cli will propose a configuration. Accept the +proposed configuration by typing yes. The cluster will +be configured and joined, which means that instances will be +bootstrapped into talking with each other. Finally, if everything has +gone well, you’ll see a message like this:

+
[OK] All 16384 slots covered
+

This means that there is at least one primary instance serving each +of the 16384 available slots.

+

If you don’t want to create a Valkey Cluster by configuring and +executing individual instances manually as explained above, there is a +much simpler system (but you’ll not learn the same amount of operational +details).

+

Find the utils/create-cluster directory in the Valkey +distribution. There is a script called create-cluster +inside (same name as the directory it is contained into), it’s a simple +bash script. In order to start a 6 nodes cluster with 3 primaries and 3 +replicas just type the following commands:

+
    +
  1. create-cluster start
  2. +
  3. create-cluster create
  4. +
+

Reply to yes in step 2 when the valkey-cli +utility wants you to accept the cluster layout.

+

You can now interact with the cluster, the first node will start at +port 30001 by default. When you are done, stop the cluster with:

+
    +
  1. create-cluster stop
  2. +
+

Please read the README inside this directory for more +information on how to run the script.

+

Interact with the cluster

+

To connect to Valkey Cluster, you’ll need a cluster-aware Valkey +client. See the documentation for your client of +choice to determine its cluster support.

+

You can also test your Valkey Cluster using the +valkey-cli command line utility:

+
$ valkey-cli -c -p 7000
+127.0.0.1:7000> set foo bar
+-> Redirected to slot [12182] located at 127.0.0.1:7002
+OK
+127.0.0.1:7002> set hello world
+-> Redirected to slot [866] located at 127.0.0.1:7000
+OK
+127.0.0.1:7000> get foo
+-> Redirected to slot [12182] located at 127.0.0.1:7002
+"bar"
+127.0.0.1:7002> get hello
+-> Redirected to slot [866] located at 127.0.0.1:7000
+"world"
+

Note: If you created the cluster using the script, +your nodes may listen on different ports, starting from 30001 by +default.

+

The valkey-cli cluster support is very basic, so it +always uses the fact that Valkey Cluster nodes are able to redirect a +client to the right node. A serious client is able to do better than +that, and cache the map between hash slots and nodes addresses, to +directly use the right connection to the right node. The map is +refreshed only when something changed in the cluster configuration, for +example after a failover or after the system administrator changed the +cluster layout by adding or removing nodes.

+

Write an example app +with Valkey GLIDE

+

Before going forward showing how to operate the Valkey Cluster, doing +things like a failover, or a resharding, we need to create some example +application or at least to be able to understand the semantics of a +simple Valkey Cluster client interaction.

+

In this way we can run an example and at the same time try to make +nodes failing, or start a resharding, to see how Valkey Cluster behaves +under real world conditions. It is not very helpful to see what happens +while nobody is writing to the cluster.

+

This section explains some basic usage of Valkey +GLIDE for Node.js, the official Valkey client library, showing a +simple example application.

+

The following example demonstrates how to connect to a Valkey cluster +and perform basic operations. First, install the Valkey GLIDE +client:

+
npm install @valkey/valkey-glide
+

Here’s the example code:

+
import { GlideClusterClient } from "@valkey/valkey-glide";
+
+async function runExample() {
+    const addresses = [
+        {
+            host: "localhost",
+            port: 6379,
+        },
+    ];
+    // Check `GlideClientConfiguration/GlideClusterClientConfiguration` for additional options.
+    const client = await GlideClusterClient.createClient({
+        addresses: addresses,
+        // if the cluster nodes use TLS, you'll need to enable it. Otherwise the connection attempt will time out silently.
+        // useTLS: true,
+        // It is recommended to set a timeout for your specific use case
+        requestTimeout: 500, // 500ms timeout
+        clientName: "test_cluster_client",
+    });
+
+    try {
+        console.log("Connected to Valkey cluster");
+
+        // Get the last counter value, or start from 0
+        let last = await client.get("__last__");
+        last = last ? parseInt(last) : 0;
+
+        console.log(`Starting from counter: ${last}`);
+
+        // Write keys in batches using mset for better performance
+        const batchSize = 100;
+        for (let start = last + 1; start <= 1000000000; start += batchSize) {
+            try {
+                const keyValuePairs = [];
+                const end = Math.min(start + batchSize - 1, 1000000000);
+                
+                // Prepare batch of key-value pairs as array
+                for (let x = start; x <= end; x++) {
+                    keyValuePairs.push(`foo${x}`, x.toString());
+                }
+                
+                // Execute batch mset with array format
+                await client.mset(keyValuePairs);
+                
+                // Update counter and display progress
+                await client.set("__last__", end.toString());
+                console.log(`Batch completed: ${start} to ${end}`);
+                
+                // Verify a sample key from the batch
+                const sampleKey = `foo${start}`;
+                const value = await client.get(sampleKey);
+                console.log(`Sample verification - ${sampleKey}: ${value}`);
+                
+            } catch (error) {
+                console.log(`Error in batch starting at ${start}: ${error.message}`);
+            }
+        }
+    } catch (error) {
+        console.log(`Connection error: ${error.message}`);
+    } finally {
+        client.close();
+    }
+}
+
+runExample().catch(console.error);
+

The application does a very simple thing, it sets keys in the form +foo<number> to number, using batched +MSET operations for better performance. The MSET command accepts an +array of alternating keys and values. So if you run the program the +result is batches of MSET commands:

+
    +
  • MSET foo1 1 foo2 2 foo3 3 … foo100 100 (batch of 100 keys)
  • +
  • MSET foo101 101 foo102 102 … foo200 200 (next batch)
  • +
  • And so forth…
  • +
+

The program includes comprehensive error handling to display errors +instead of crashing, so all cluster operations are wrapped in try-catch +blocks.

+

The client creation section is the first key part of +the program. It creates the Valkey cluster client using a list of +cluster addresses and configuration options including a request +timeout and client name.

+

The addresses don’t need to be all the nodes of the cluster. The +important thing is that at least one node is reachable. Valkey GLIDE +automatically discovers the complete cluster topology once it connects +to any node.

+

Now that we have the cluster client instance, we can use it like any +other Valkey client to perform operations across the cluster.

+

The counter initialization section reads a counter +so that when we restart the example we don’t start again with +foo0, but continue from where we left off. The counter is +stored in Valkey itself using the key __last__.

+

The main processing loop sets keys in batches using +MSET operations for better performance, processing 100 keys at a time +and displaying progress or any errors that occur. you’ll get the usually +10k ops/second in the best of the conditions).

+

you’ll get optimal performance).

+

Starting the application produces the following output:

+
node example.js
+Connected to Valkey cluster
+Starting from counter: 0
+Batch completed: 1 to 100
+Sample verification - foo1: 1
+Batch completed: 101 to 200
+Sample verification - foo101: 101
+Batch completed: 201 to 300
+Sample verification - foo201: 201
+^C (I stopped the program here)
+

This is not a very interesting program and we’ll use a better one in +a moment but we can already see what happens during a resharding when +the program is running.

+

Reshard the cluster

+

Now we are ready to try a cluster resharding. To do this, please keep +the example.js program running, so that you can see if there is some +impact on the program running.

+

Resharding basically means to move hash slots from a set of nodes to +another set of nodes. Like cluster creation, it is accomplished using +the valkey-cli utility.

+

To start a resharding, just type:

+
valkey-cli --cluster reshard 127.0.0.1:7000
+

You only need to specify a single node, valkey-cli will find the +other nodes automatically.

+

Currently valkey-cli is only able to reshard with the administrator +support, you can’t just say move 5% of slots from this node to the other +one (but this is pretty trivial to implement). So it starts with +questions. The first is how much of a resharding do you want to do:

+
How many slots do you want to move (from 1 to 16384)?
+

We can try to reshard 1000 hash slots, that should already contain a +non trivial amount of keys if the example is still running without the +sleep call.

+

Then valkey-cli needs to know what is the target of the resharding, +that is, the node that will receive the hash slots. I’ll use the first +primary node, that is, 127.0.0.1:7000, but I need to specify the Node ID +of the instance. This was already printed in a list by valkey-cli, but I +can always find the ID of a node with the following command if I +need:

+
$ valkey-cli -p 7000 cluster nodes | grep myself
+97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5460
+

Ok so my target node is 97a3a64667477371c4479320d683e4c8db5858b1.

+

Now you’ll get asked from what nodes you want to take those keys. +I’ll just type all in order to take a bit of hash slots +from all the other primary nodes.

+

After the final confirmation you’ll see a message for every slot that +valkey-cli is going to move from a node to another, and a dot will be +printed for every actual key moved from one side to the other.

+

While the resharding is in progress you should be able to see your +example program running unaffected. You can stop and restart it multiple +times during the resharding if you want.

+

At the end of the resharding, you can test the health of the cluster +with the following command:

+
valkey-cli --cluster check 127.0.0.1:7000
+

All the slots will be covered as usual, but this time the primary at +127.0.0.1:7000 will have more hash slots, something around 6461.

+

Resharding can be performed automatically without the need to +manually enter the parameters in an interactive way. This is possible +using a command line like the following:

+
valkey-cli --cluster reshard <host>:<port> --cluster-from <node-id> --cluster-to <node-id> --cluster-slots <number of slots> --cluster-yes
+

This allows to build some automatism if you are likely to reshard +often, however currently there is no way for valkey-cli to +automatically rebalance the cluster checking the distribution of keys +across the cluster nodes and intelligently moving slots as needed. This +feature will be added in the future.

+

The --cluster-yes option instructs the cluster manager +to automatically answer “yes” to the command’s prompts, allowing it to +run in a non-interactive mode. Note that this option can also be +activated by setting the REDISCLI_CLUSTER_YES environment +variable.

+

A more interesting +example application

+

The example application we wrote early is not very good. It writes to +the cluster in a simple way without even checking if what was written is +the right thing.

+

From our point of view the cluster receiving the writes could just +always write the key foo to 42 to every +operation, and we would not notice at all.

+

Now we can write a more interesting application for testing cluster +behavior. A simple consistency checking application that uses a set of +counters, by default 1000, and sends INCR commands to +increment the counters.

+

However instead of just writing, the application does two additional +things:

+
    +
  • When a counter is updated using INCR, the application +remembers the write.
  • +
  • It also reads a random counter before every write, and check if the +value is what we expected it to be, comparing it with the value it has +in memory.
  • +
+

What this means is that this application is a simple +consistency checker, and is able to tell you if the +cluster lost some write, or if it accepted a write that we did not +receive acknowledgment for. In the first case we’ll see a counter having +a value that is smaller than the one we remember, while in the second +case the value will be greater.

+

Running a consistency testing application produces a line of output +every second:

+
node consistency-test.js
+925 R (0 err) | 925 W (0 err) |
+5030 R (0 err) | 5030 W (0 err) |
+9261 R (0 err) | 9261 W (0 err) |
+13517 R (0 err) | 13517 W (0 err) |
+17780 R (0 err) | 17780 W (0 err) |
+22025 R (0 err) | 22025 W (0 err) |
+25818 R (0 err) | 25818 W (0 err) |
+

The line shows the number of Reads and +Writes performed, and the number of errors (query not +accepted because of errors since the system was not available).

+

If some inconsistency is found, new lines are added to the output. +This is what happens, for example, if I reset a counter manually while +the program is running:

+
$ valkey-cli -h 127.0.0.1 -p 7000 set key_217 0
+OK
+
+(in the other tab I see...)
+
+94774 R (0 err) | 94774 W (0 err) |
+98821 R (0 err) | 98821 W (0 err) |
+102886 R (0 err) | 102886 W (0 err) | 114 lost |
+107046 R (0 err) | 107046 W (0 err) | 114 lost |
+

When I set the counter to 0 the real value was 114, so the program +reports 114 lost writes (INCR commands that are not +remembered by the cluster).

+

This program is much more interesting as a test case, so we’ll use it +to test the Valkey Cluster failover.

+

Test the failover

+

To trigger the failover, the simplest thing we can do (that is also +the semantically simplest failure that can occur in a distributed +system) is to crash a single process, in our case a single primary.

+

Note: During this test, you should take a tab open +with the consistency test application running.

+

We can identify a primary and crash it with the following +command:

+
$ valkey-cli -p 7000 cluster nodes | grep master
+3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385482984082 0 connected 5960-10921
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 master - 0 1385482983582 0 connected 11423-16383
+97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
+

Ok, so 7000, 7001, and 7002 are primaries. Let’s crash node 7002 with +the DEBUG SEGFAULT command:

+
$ valkey-cli -p 7002 debug segfault
+Error: Server closed the connection
+

Now we can look at the output of the consistency test to see what it +reported.

+
18849 R (0 err) | 18849 W (0 err) |
+23151 R (0 err) | 23151 W (0 err) |
+27302 R (0 err) | 27302 W (0 err) |
+
+... many error warnings here ...
+
+29659 R (578 err) | 29660 W (577 err) |
+33749 R (578 err) | 33750 W (577 err) |
+37918 R (578 err) | 37919 W (577 err) |
+42077 R (578 err) | 42078 W (577 err) |
+

As you can see during the failover the system was not able to accept +578 reads and 577 writes, however no inconsistency was created in the +database. This may sound unexpected as in the first part of this +tutorial we stated that Valkey Cluster can lose writes during the +failover because it uses asynchronous replication. What we did not say +is that this is not very likely to happen because Valkey sends the reply +to the client, and the commands to replicate to the replicas, about at +the same time, so there is a very small window to lose data. However the +fact that it is hard to trigger does not mean that it is impossible, so +this does not change the consistency guarantees provided by Valkey +cluster.

+

We can now check what is the cluster setup after the failover (note +that in the meantime I restarted the crashed instance so that it rejoins +the cluster as a replica):

+
$ valkey-cli -p 7000 cluster nodes
+3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385503418521 0 connected
+a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385503419023 0 connected
+97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
+3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385503419023 3 connected 11423-16383
+3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385503417005 0 connected 5960-10921
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385503418016 3 connected
+

Now the primaries are running on ports 7000, 7001 and 7005. What was +previously a primary, that is the Valkey instance running on port 7002, +is now a replica of 7005.

+

The output of the CLUSTER NODES command may look +intimidating, but it is actually pretty simple, and is composed of the +following tokens:

+
    +
  • Node ID
  • +
  • ip:port
  • +
  • flags: master, replica, myself, fail, …
  • +
  • if it is a replica, the Node ID of the master
  • +
  • Time of the last pending PING still waiting for a reply.
  • +
  • Time of the last PONG received.
  • +
  • Configuration epoch for this node (see the Cluster +specification).
  • +
  • Status of the link to this node.
  • +
  • Slots served…
  • +
+

Manual failover

+

Sometimes it is useful to force a failover without actually causing +any problem on a primary. For example, to upgrade the Valkey process of +one of the primary nodes it is a good idea to failover it to turn it +into a replica with minimal impact on availability.

+

Manual failovers are supported by Valkey Cluster using the +CLUSTER FAILOVER command, that must be executed in one of +the replicas of the primary you want to failover.

+

Manual failovers are special and are safer compared to failovers +resulting from actual primary failures. They occur in a way that avoids +data loss in the process, by switching clients from the original primary +to the new primary only when the system is sure that the new primary +processed all the replication stream from the old one.

+

This is what you see in the replica log when you perform a manual +failover:

+
# Manual failover user request accepted.
+# Received replication offset for paused primary manual failover: 347540
+# All primary replication stream processed, manual failover can start.
+# Start of election delayed for 0 milliseconds (rank #0, offset 347540).
+# Starting a failover election for epoch 7545.
+# Failover election won: I'm the new primary.
+

Clients sending write commands to the primary are blocked during the +failover. When the primary sends its replication offset to the replica, +the replica waits to reach the offset on its side. When the replication +offset is reached, the failover starts, and the old primary is informed +about the configuration switch. When the switch is complete, the clients +are unblocked on the old primary and they are redirected to the new +primary.

+

Note: To promote a replica to primary, it must first +be known as a replica by a majority of the primaries in the cluster. +Otherwise, it cannot win the failover election. If the replica has just +been added to the cluster (see Add a new node as a replica), +you may need to wait a while before sending the +CLUSTER FAILOVER command, to make sure the primaries in +cluster are aware of the new replica.

+

Add a new node

+

Adding a new node is basically the process of adding an empty node +and then moving some data into it, in case it is a new primary, or +telling it to setup as a replica of a known node, in case it is a +replica.

+

We’ll show both, starting with the addition of a new primary +instance.

+

In both cases the first step to perform is adding an empty +node.

+

This is as simple as to start a new node in port 7006 (we already +used from 7000 to 7005 for our existing 6 nodes) with the same +configuration used for the other nodes, except for the port number, so +what you should do in order to conform with the setup we used for the +previous nodes:

+
    +
  • Create a new tab in your terminal application.
  • +
  • Enter the cluster-test directory.
  • +
  • Create a directory named 7006.
  • +
  • Create a valkey.conf file inside, similar to the one used for the +other nodes but using 7006 as port number.
  • +
  • Finally start the server with +../valkey-server ./valkey.conf
  • +
+

At this point the server should be running.

+

Now we can use valkey-cli as usual in order to add +the node to the existing cluster.

+
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000
+

As you can see I used the add-node command +specifying the address of the new node as first argument, and the +address of a random existing node in the cluster as second argument.

+

In practical terms valkey-cli here did very little to help us, it +just sent a CLUSTER MEET message to the node, something +that is also possible to accomplish manually. However valkey-cli also +checks the state of the cluster before to operate, so it is a good idea +to perform cluster operations always via valkey-cli even when you know +how the internals work.

+

Now we can connect to the new node to see if it really joined the +cluster:

+
valkey 127.0.0.1:7006> cluster nodes
+3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385543178575 0 connected 5960-10921
+3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385543179583 0 connected
+f093c80dde814da99c5cf72a7dd01590792b783b :0 myself,master - 0 0 0 connected
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543178072 3 connected
+a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385543178575 0 connected
+97a3a64667477371c4479320d683e4c8db5858b1 127.0.0.1:7000 master - 0 1385543179080 0 connected 0-5959 10922-11422
+3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385543177568 3 connected 11423-16383
+

Note that since this node is already connected to the cluster it is +already able to redirect client queries correctly and is generally +speaking part of the cluster. However it has two peculiarities compared +to the other primaries:

+
    +
  • It holds no data as it has no assigned hash slots.
  • +
  • Because it is a primary without assigned slots, it does not +participate in the election process when a replica wants to become a +primary.
  • +
+

Now it is possible to assign hash slots to this node using the +resharding feature of valkey-cli. It is basically useless +to show this as we already did in a previous section, there is no +difference, it is just a resharding having as a target the empty +node.

+
Add a new node as a replica
+

Adding a new replica can be performed in two ways. The obvious one is +to use valkey-cli again, but with the –cluster-replica option, like +this:

+
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica
+

Note that the command line here is exactly like the one we used to +add a new primary, so we are not specifying to which primary we want to +add the replica. In this case, what happens is that valkey-cli will add +the new node as replica of a random primary among the primaries with +fewer replicas.

+

However you can specify exactly what primary you want to target with +your new replica with the following command line:

+
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
+

This way we assign the new replica to a specific primary.

+

A more manual way to add a replica to a specific primary is to add +the new node as an empty primary, and then turn it into a replica using +the CLUSTER REPLICATE command. This also works if the node +was added as a replica but you want to move it as a replica of a +different primary.

+

For example in order to add a replica for the node 127.0.0.1:7005 +that is currently serving hash slots in the range 11423-16383, that has +a Node ID 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e, all I need to do is +to connect with the new node (already added as empty primary) and send +the command:

+
valkey 127.0.0.1:7006> cluster replicate 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
+

That’s it. Now we have a new replica for this set of hash slots, and +all the other nodes in the cluster already know (after a few seconds +needed to update their config). We can verify with the following +command:

+
$ valkey-cli -p 7000 cluster nodes | grep slave | grep 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
+f093c80dde814da99c5cf72a7dd01590792b783b 127.0.0.1:7006 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617702 3 connected
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617198 3 connected
+

The node 3c3a0c… now has two replicas, running on ports 7002 (the +existing one) and 7006 (the new one).

+

Remove a node

+

To remove a replica node just use the del-node command +of valkey-cli:

+
valkey-cli --cluster del-node 127.0.0.1:7000 `<node-id>`
+

The first argument is just a random node in the cluster, the second +argument is the ID of the node you want to remove.

+

You can remove a primary node in the same way as well, +however in order to remove a primary node it must be +empty. If the primary is not empty you need to reshard data +away from it to all the other primary nodes before.

+

An alternative to remove a primary node is to perform a manual +failover of it over one of its replicas and remove the node after it +turned into a replica of the new primary. Obviously this does not help +when you want to reduce the actual number of primaries in your cluster, +in that case, a resharding is needed.

+

There is a special scenario where you want to remove a failed node. +You should not use the del-node command because it tries to +connect to all nodes and you will encounter a “connection refused” +error. Instead, you can use the call command:

+
valkey-cli --cluster call 127.0.0.1:7000 cluster forget `<node-id>`
+

This command will execute CLUSTER FORGET command on +every node.

+

Replica migration

+

In Valkey Cluster, you can reconfigure a replica to replicate with a +different primary at any time just using this command:

+
CLUSTER REPLICATE <master-node-id>
+

However there is a special scenario where you want replicas to move +from one primary to another one automatically, without the help of the +system administrator. The automatic reconfiguration of replicas is +called replicas migration and is able to improve the +reliability of a Valkey Cluster.

+

Note: You can read the details of replicas migration +in the Valkey Cluster Specification, here +we’ll only provide some information about the general idea and what you +should do in order to benefit from it.

+

The reason why you may want to let your cluster replicas to move from +one primary to another under certain condition, is that usually the +Valkey Cluster is as resistant to failures as the number of replicas +attached to a given primary.

+

For example a cluster where every primary has a single replica can’t +continue operations if the primary and its replica fail at the same +time, simply because there is no other instance to have a copy of the +hash slots the primary was serving. However while net-splits are likely +to isolate a number of nodes at the same time, many other kind of +failures, like hardware or software failures local to a single node, are +a very notable class of failures that are unlikely to happen at the same +time, so it is possible that in your cluster where every primary has a +replica, the replica is killed at 4am, and the primary is killed at 6am. +This still will result in a cluster that can no longer operate.

+

To improve reliability of the system we have the option to add +additional replicas to every primary, but this is expensive. Replica +migration allows to add more replicas to just a few primaries. So you +have 10 primaries with 1 replica each, for a total of 20 instances. +However you add, for example, 3 instances more as replicas of some of +your primaries, so certain primaries will have more than a single +replica.

+

With replicas migration what happens is that if a primary is left +without replicas, a replica from a primary that has multiple replicas +will migrate to the orphaned primary. So after your replica +goes down at 4am as in the example we made above, another replica will +take its place, and when the primary will fail as well at 5am, there is +still a replica that can be elected so that the cluster can continue to +operate.

+

So what you should know about replicas migration in short?

+
    +
  • The cluster will try to migrate a replica from the primary that has +the greatest number of replicas in a given moment.
  • +
  • To benefit from replica migration you have just to add a few more +replicas to a single primary in your cluster, it does not matter what +primary.
  • +
  • There is a configuration parameter that controls the replica +migration feature that is called cluster-migration-barrier: +you can read more about it in the example valkey.conf file +provided with Valkey Cluster.
  • +
+

Upgrade nodes in a Valkey +Cluster

+

Upgrading replica nodes is easy since you just need to stop the node +and restart it with an updated version of Valkey. If there are clients +scaling reads using replica nodes, they should be able to reconnect to a +different replica if a given one is not available.

+

Upgrading primaries is a bit more complex. The suggested procedure is +to trigger a manual failover to turn the old primary into a replica and +then upgrading it.

+

A complete rolling upgrade of all nodes in a cluster can be performed +by repeating the following procedure for each shard (a primary and its +replicas):

+
    +
  1. Add one or more upgraded nodes as new replicas to the primary. +This step is optional but it ensures that the number of replicas is not +compromised during the rolling upgrade. To add a new node, use CLUSTER MEET and +CLUSTER REPLICATE +or use valkey-cli as described under Add a new node as a replica.

    +

    An alternative is to upgrade one replica at a time and have fewer +replicas online during the upgrade.

  2. +
  3. Upgrade the old replicas you want to keep by restarting them with +the updated version of Valkey. If you’re replacing all the old nodes +with new nodes, you can skip this step.

  4. +
  5. Select one of the upgraded replicas to be the new primary. Wait +until this replica has caught up the replication offset with the +primary. You can use INFO REPLICATION and check +for the line master_link_status:up to be present. This +indicates that the initial sync with the primary is complete.

    +

    After the initial full sync, the replica might still lag behind in +replication. Send INFO REPLICATION to the primary and the +replica and compare the field master_repl_offset returned +by both nodes. If the offsets match, it means that all writes have been +replicated. However, if the primary receives a constant stream of +writes, it’s possible that the offsets will never be equal. In this +step, you can accept a small difference. It’s usually enough to wait for +some seconds to minimize the difference.

  6. +
  7. Check that the new replica is known by all nodes in the cluster, +or at least by the primaries in the cluster. You can send CLUSTER NODES to +each of the nodes in the cluster and check that they all are aware of +the new node. Wait for some time and repeat the check if +necessary.

  8. +
  9. Trigger a manual failover by sending CLUSTER FAILOVER +to the replica node selected to become the new primary. See the Manual failover section in this document for +more information.

  10. +
  11. Wait for the failover to complete. To check, you can use ROLE, INFO REPLICATION (which +indicates role:master after successful failover) or CLUSTER NODES to +verify that the state of the cluster has changed shortly after the +command was sent.

  12. +
  13. Take the old primary (now a replica) out of service, or upgrade +it and add it again as a replica. Remove additional replicas kept for +redundancy during the upgrade, if any.

  14. +
+

Repeat this sequence for each shard (each primary and its replicas) +until all nodes in the cluster have been upgraded.

+

Migrate to Valkey Cluster

+

Users willing to migrate to Valkey Cluster may have just a single +primary, or may already using a preexisting sharding setup, where keys +are split among N nodes, using some in-house algorithm or a sharding +algorithm implemented by their client library or Valkey proxy.

+

In both cases it is possible to migrate to Valkey Cluster easily, +however what is the most important detail is if multiple-keys operations +are used by the application, and how. There are three different +cases:

+
    +
  1. Multiple keys operations, or transactions, or Lua scripts involving +multiple keys, are not used. Keys are accessed independently (even if +accessed via transactions or Lua scripts grouping multiple commands, +about the same key, together).
  2. +
  3. Multiple keys operations, or transactions, or Lua scripts involving +multiple keys are used but only with keys having the same hash +tag, which means that the keys used together all have a +{...} sub-string that happens to be identical. For example +the following multiple keys operation is defined in the context of the +same hash tag: SUNION {user:1000}.foo {user:1000}.bar.
  4. +
  5. Multiple keys operations, or transactions, or Lua scripts involving +multiple keys are used with key names not having an explicit, or the +same, hash tag.
  6. +
+

The third case is not handled by Valkey Cluster: the application +requires to be modified in order to not use multi keys operations or +only use them in the context of the same hash tag.

+

Case 1 and 2 are covered, so we’ll focus on those two cases, that are +handled in the same way, so no distinction will be made in the +documentation.

+

Assuming you have your preexisting data set split into N primaries, +where N=1 if you have no preexisting sharding, the following steps are +needed in order to migrate your data set to Valkey Cluster:

+
    +
  1. Stop your clients. No automatic live-migration to Valkey Cluster is +currently possible. You may be able to do it orchestrating a live +migration in the context of your application / environment.
  2. +
  3. Generate an append only file for all of your N primaries using the +BGREWRITEAOF command, and waiting for the AOF file to be +completely generated.
  4. +
  5. Save your AOF files from aof-1 to aof-N somewhere. At this point you +can stop your old instances if you wish (this is useful since in +non-virtualized deployments you often need to reuse the same +computers).
  6. +
  7. Create a Valkey Cluster composed of N primaries and zero replicas. +You’ll add replicas later. Make sure all your nodes are using the append +only file for persistence.
  8. +
  9. Stop all the cluster nodes, substitute their append only file with +your pre-existing append only files, aof-1 for the first node, aof-2 for +the second node, up to aof-N.
  10. +
  11. Restart your Valkey Cluster nodes with the new AOF files. They’ll +complain that there are keys that should not be there according to their +configuration.
  12. +
  13. Use valkey-cli --cluster fix command in order to fix +the cluster so that keys will be migrated according to the hash slots +each node is authoritative or not.
  14. +
  15. Use valkey-cli --cluster check at the end to make sure +your cluster is ok.
  16. +
  17. Restart your clients modified to use a Valkey Cluster aware client +library.
  18. +
+

There is an alternative way to import data from external instances to +a Valkey Cluster, which is to use the +valkey-cli --cluster import command.

+

The command moves all the keys of a running instance (deleting the +keys from the source instance) to the specified pre-existing Valkey +Cluster.

+

Note: If not for backward compatibility, the Valkey +project no longer uses the words “master” and “slave”. Unfortunately in +this command these words are part of the protocol, so we’ll be able to +remove such occurrences only when this API will be naturally +deprecated.

+

Learn more

+ + + diff --git a/_test/cluster-tutorial.html b/_test/cluster-tutorial.html new file mode 100644 index 000000000..4b48a41a4 --- /dev/null +++ b/_test/cluster-tutorial.html @@ -0,0 +1,1284 @@ + + + + + + + + Cluster tutorial + + + +
+

Cluster tutorial

+
+

Valkey scales horizontally with a deployment topology called Valkey +Cluster. This topic will teach you how to set up, test, and operate +Valkey Cluster in production. You will learn about the availability and +consistency characteristics of Valkey Cluster from the end user’s point +of view.

+

If you plan to run a production Valkey Cluster deployment or want to +understand better how Valkey Cluster works internally, consult the Valkey Cluster specification.

+

Valkey Cluster 101

+

Valkey Cluster provides a way to run a Valkey installation where data +is automatically sharded across multiple Valkey nodes. Valkey Cluster +also provides some degree of availability during partitions—in practical +terms, the ability to continue operations when some nodes fail or are +unable to communicate. However, the cluster will become unavailable in +the event of larger failures (for example, when the majority of +primaries are unavailable).

+

So, with Valkey Cluster, you get the ability to:

+
    +
  • Automatically split your dataset among multiple nodes.
  • +
  • Continue operations when a subset of the nodes are experiencing +failures or are unable to communicate with the rest of the cluster.
  • +
+

Valkey Cluster TCP ports

+

Every Valkey Cluster node requires two open TCP connections: a Valkey +TCP port used to serve clients, e.g., 6379, and second port known as the +cluster bus port. By default, the cluster bus port is set by +adding 10000 to the data port (e.g., 16379); however, you can override +this in the cluster-port configuration.

+

Cluster bus is a node-to-node communication channel that uses a +binary protocol, which is more suited to exchanging information between +nodes due to little bandwidth and processing time. Nodes use the cluster +bus for failure detection, configuration updates, failover +authorization, and so forth. Clients should never try to communicate +with the cluster bus port, but rather use the Valkey command port. +However, make sure you open both ports in your firewall, otherwise +Valkey cluster nodes won’t be able to communicate.

+

For a Valkey Cluster to work properly you need, for each node:

+
    +
  1. The client communication port (usually 6379) used to communicate +with clients and be open to all the clients that need to reach the +cluster, plus all the other cluster nodes that use the client port for +key migrations.
  2. +
  3. The cluster bus port must be reachable from all the other cluster +nodes.
  4. +
+

If you don’t open both TCP ports, your cluster will not work as +expected.

+

Valkey Cluster and Docker

+

Currently, Valkey Cluster does not support NATted environments and in +general environments where IP addresses or TCP ports are remapped.

+

Docker uses a technique called port mapping: programs +running inside Docker containers may be exposed with a different port +compared to the one the program believes to be using. This is useful for +running multiple containers using the same ports, at the same time, in +the same server.

+

To make Docker compatible with Valkey Cluster, you need to use +Docker’s host networking mode. Please see the +--net=host option in the Docker +documentation for more information.

+

Valkey Cluster data sharding

+

Valkey Cluster does not use consistent hashing, but a different form +of sharding where every key is conceptually part of what we call a +hash slot.

+

There are 16384 hash slots in Valkey Cluster, and to compute the hash +slot for a given key, we simply take the CRC16 of the key modulo +16384.

+

Every node in a Valkey Cluster is responsible for a subset of the +hash slots, so, for example, you may have a cluster with 3 nodes, +where:

+
    +
  • Node A contains hash slots from 0 to 5500.
  • +
  • Node B contains hash slots from 5501 to 11000.
  • +
  • Node C contains hash slots from 11001 to 16383.
  • +
+

This makes it easy to add and remove cluster nodes. For example, if I +want to add a new node D, I need to move some hash slots from nodes A, +B, C to D. Similarly, if I want to remove node A from the cluster, I can +just move the hash slots served by A to B and C. Once node A is empty, I +can remove it from the cluster completely.

+

Moving hash slots from a node to another does not require stopping +any operations; therefore, adding and removing nodes, or changing the +percentage of hash slots held by a node, requires no downtime.

+

Valkey Cluster supports multiple key operations as long as all of the +keys involved in a single command execution (or whole transaction, or +Lua script execution) belong to the same hash slot. The user can force +multiple keys to be part of the same hash slot by using a feature called +hash tags.

+

Hash tags are documented in the Valkey Cluster specification, but the +gist is that if there is a substring between {} brackets in a key, only +what is inside the string is hashed. For example, the keys +user:{123}:profile and user:{123}:account are +guaranteed to be in the same hash slot because they share the same hash +tag. As a result, you can operate on these two keys in the same +multi-key operation.

+

Valkey Cluster +primary-replica model

+

To remain available when a subset of primary nodes are failing or are +not able to communicate with the majority of nodes, Valkey Cluster uses +a primary-replica model where every hash slot has from 1 (the primary +itself) to N replicas (N-1 additional replica nodes).

+

In our example cluster with nodes A, B, C, if node B fails the +cluster is not able to continue, since we no longer have a way to serve +hash slots in the range 5501-11000.

+

However, when the cluster is created (or at a later time), we add a +replica node to every primary, so that the final cluster is composed of +A, B, C that are primary nodes, and A1, B1, C1 that are replica nodes. +This way, the system can continue if node B fails.

+

Node B1 replicates B, and B fails, the cluster will promote node B1 +as the new primary and will continue to operate correctly.

+

However, note that if nodes B and B1 fail at the same time, Valkey +Cluster will not be able to continue to operate.

+

Valkey Cluster +consistency guarantees

+

Valkey Cluster does not guarantee strong +consistency. In practical terms this means that under certain +conditions it is possible that Valkey Cluster will lose writes that were +acknowledged by the system to the client.

+

The first reason why Valkey Cluster can lose writes is because it +uses asynchronous replication. This means that during writes the +following happens:

+
    +
  • Your client writes to the primary B.
  • +
  • The primary B replies OK to your client.
  • +
  • The primary B propagates the write to its replicas B1, B2 and +B3.
  • +
+

As you can see, B does not wait for an acknowledgement from B1, B2, +B3 before replying to the client, since this would be a prohibitive +latency penalty for Valkey, so if your client writes something, B +acknowledges the write, but crashes before being able to send the write +to its replicas, one of the replicas (that did not receive the write) +can be promoted to primary, losing the write forever.

+

This is very similar to what happens with most databases that are +configured to flush data to disk every second, so it is a scenario you +are already able to reason about because of past experiences with +traditional database systems not involving distributed systems. +Similarly you can improve consistency by forcing the database to flush +data to disk before replying to the client, but this usually results in +prohibitively low performance. That would be the equivalent of +synchronous replication in the case of Valkey Cluster.

+

Basically, there is a trade-off to be made between performance and +consistency.

+

Valkey Cluster has support for synchronous writes when absolutely +needed, implemented via the WAIT command. This makes losing +writes a lot less likely. However, note that Valkey Cluster does not +implement strong consistency even when synchronous replication is used: +it is always possible, under more complex failure scenarios, that a +replica that was not able to receive the write will be elected as +primary.

+

There is another notable scenario where Valkey Cluster will lose +writes, that happens during a network partition where a client is +isolated with a minority of instances including at least a primary.

+

Take as an example our 6 nodes cluster composed of A, B, C, A1, B1, +C1, with 3 primaries and 3 replicas. There is also a client, that we +will call Z1.

+

After a partition occurs, it is possible that in one side of the +partition we have A, C, A1, B1, C1, and in the other side we have B and +Z1.

+

Z1 is still able to write to B, which will accept its writes. If the +partition heals in a very short time, the cluster will continue +normally. However, if the partition lasts enough time for B1 to be +promoted to primary on the majority side of the partition, the writes +that Z1 has sent to B in the meantime will be lost.

+

Note: There is a maximum window to +the amount of writes Z1 will be able to send to B: if enough time has +elapsed for the majority side of the partition to elect a replica as +primary, every primary node in the minority side will have stopped +accepting writes.

+

This amount of time is a very important configuration directive of +Valkey Cluster, and is called the node timeout.

+

After node timeout has elapsed, a primary node is considered to be +failing, and can be replaced by one of its replicas. Similarly, after +node timeout has elapsed without a primary node to be able to sense the +majority of the other primary nodes, it enters an error state and stops +accepting writes.

+

Valkey Cluster +configuration parameters

+

We are about to create an example cluster deployment. Before we +continue, let’s introduce the configuration parameters that Valkey +Cluster introduces in the valkey.conf file.

+
    +
  • cluster-enabled <yes/no>: If +yes, enables Valkey Cluster support in a specific Valkey instance. +Otherwise the instance starts as a standalone instance as usual.
  • +
  • cluster-config-file <filename>: +Note that despite the name of this option, this is not a user editable +configuration file, but the file where a Valkey Cluster node +automatically persists the cluster configuration (the state, basically) +every time there is a change, in order to be able to re-read it at +startup. The file lists things like the other nodes in the cluster, +their state, persistent variables, and so forth. Often this file is +rewritten and flushed on disk as a result of some message +reception.
  • +
  • cluster-node-timeout +<milliseconds>: The maximum amount of time a +Valkey Cluster node can be unavailable, without it being considered as +failing. If a primary node is not reachable for more than the specified +amount of time, it will be failed over by its replicas. This parameter +controls other important things in Valkey Cluster. Notably, every node +that can’t reach the majority of primary nodes for the specified amount +of time, will stop accepting queries.
  • +
  • cluster-replica-validity-factor +<factor>: If set to zero, a replica will +always consider itself valid, and will therefore always try to failover +a primary, regardless of the amount of time the link between the primary +and the replica remained disconnected. If the value is positive, a +maximum disconnection time is calculated as the node timeout +value multiplied by the factor provided with this option, and if the +node is a replica, it will not try to start a failover if the primary +link was disconnected for more than the specified amount of time. For +example, if the node timeout is set to 5 seconds and the validity factor +is set to 10, a replica disconnected from the primary for more than 50 +seconds will not try to failover its primary. Note that any value +different than zero may result in Valkey Cluster being unavailable after +a primary failure if there is no replica that is able to failover it. In +that case the cluster will return to being available only when the +original primary rejoins the cluster.
  • +
  • cluster-migration-barrier +<count>: Minimum number of replicas a +primary will remain connected with, for another replica to migrate to a +primary which is no longer covered by any replica. See the appropriate +section about replica migration in this tutorial for more +information.
  • +
  • cluster-require-full-coverage +<yes/no>: If this is set to yes, as it is by +default, the cluster stops accepting writes if some percentage of the +key space is not covered by any node. If the option is set to no, the +cluster will still serve queries even if only requests about a subset of +keys can be processed.
  • +
  • cluster-allow-reads-when-down +<yes/no>: If this is set to no, as it is by +default, a node in a Valkey Cluster will stop serving all traffic when +the cluster is marked as failed, either when a node can’t reach a quorum +of primaries or when full coverage is not met. This prevents reading +potentially inconsistent data from a node that is unaware of changes in +the cluster. This option can be set to yes to allow reads from a node +during the fail state, which is useful for applications that want to +prioritize read availability but still want to prevent inconsistent +writes. It can also be used for when using Valkey Cluster with only one +or two shards, as it allows the nodes to continue serving writes when a +primary fails but automatic failover is impossible.
  • +
+

Create and use a Valkey +Cluster

+

To create and use a Valkey Cluster, follow these steps:

+ +

But, first, familiarize yourself with the requirements for creating a +cluster.

+

Requirements to create +a Valkey Cluster

+

To create a cluster, the first thing you need is to have a few empty +Valkey instances running in cluster mode.

+

At minimum, set the following directives in the +valkey.conf file:

+
port 7000
+cluster-enabled yes
+cluster-config-file nodes.conf
+cluster-node-timeout 5000
+appendonly yes
+

To enable cluster mode, set the cluster-enabled +directive to yes. Every instance also contains the path of +a file where the configuration for this node is stored, which by default +is nodes.conf. This file is never touched by humans; it is +simply generated at startup by the Valkey Cluster instances, and updated +every time it is needed.

+

Note that the minimal cluster that works as expected +must contain at least three primary nodes. For deployment, we strongly +recommend a six-node cluster, with three primaries and three +replicas.

+

You can test this locally by creating the following directories named +after the port number of the instance you’ll run inside any given +directory.

+

For example:

+
mkdir cluster-test
+cd cluster-test
+mkdir 7000 7001 7002 7003 7004 7005
+

Create a valkey.conf file inside each of the +directories, from 7000 to 7005. As a template for your configuration +file just use the small example above, but make sure to replace the port +number 7000 with the right port number according to the +directory name.

+

You can start each instance as follows, each running in a separate +terminal tab:

+
cd 7000
+valkey-server ./valkey.conf
+

You’ll see from the logs that every node assigns itself a new ID:

+
[82462] 26 Nov 11:56:55.329 * No cluster configuration found, I'm 97a3a64667477371c4479320d683e4c8db5858b1
+

This ID will be used forever by this specific instance in order for +the instance to have a unique name in the context of the cluster. Every +node remembers every other node using this IDs, and not by IP or port. +IP addresses and ports may change, but the unique node identifier will +never change for all the life of the node. We call this identifier +simply Node ID.

+

Create a Valkey Cluster

+

Now that we have a number of instances running, you need to create +your cluster by writing some meaningful configuration to the nodes.

+

You can configure and execute individual instances manually or use +the create-cluster script. Let’s go over how you do it manually.

+

To create the cluster, run:

+
valkey-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \
+127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
+--cluster-replicas 1
+

The command used here is create, since we want to +create a new cluster. The option --cluster-replicas 1 means +that we want a replica for every primary created.

+

The other arguments are the list of addresses of the instances I want +to use to create the new cluster.

+

valkey-cli will propose a configuration. Accept the +proposed configuration by typing yes. The cluster will +be configured and joined, which means that instances will be +bootstrapped into talking with each other. Finally, if everything has +gone well, you’ll see a message like this:

+
[OK] All 16384 slots covered
+

This means that there is at least one primary instance serving each +of the 16384 available slots.

+

If you don’t want to create a Valkey Cluster by configuring and +executing individual instances manually as explained above, there is a +much simpler system (but you’ll not learn the same amount of operational +details).

+

Find the utils/create-cluster directory in the Valkey +distribution. There is a script called create-cluster +inside (same name as the directory it is contained into), it’s a simple +bash script. In order to start a 6 nodes cluster with 3 primaries and 3 +replicas just type the following commands:

+
    +
  1. create-cluster start
  2. +
  3. create-cluster create
  4. +
+

Reply to yes in step 2 when the valkey-cli +utility wants you to accept the cluster layout.

+

You can now interact with the cluster, the first node will start at +port 30001 by default. When you are done, stop the cluster with:

+
    +
  1. create-cluster stop
  2. +
+

Please read the README inside this directory for more +information on how to run the script.

+

Interact with the cluster

+

To connect to Valkey Cluster, you’ll need a cluster-aware Valkey +client. See the documentation for your client of +choice to determine its cluster support.

+

You can also test your Valkey Cluster using the +valkey-cli command line utility:

+
$ valkey-cli -c -p 7000
+127.0.0.1:7000> set foo bar
+-> Redirected to slot [12182] located at 127.0.0.1:7002
+OK
+127.0.0.1:7002> set hello world
+-> Redirected to slot [866] located at 127.0.0.1:7000
+OK
+127.0.0.1:7000> get foo
+-> Redirected to slot [12182] located at 127.0.0.1:7002
+"bar"
+127.0.0.1:7002> get hello
+-> Redirected to slot [866] located at 127.0.0.1:7000
+"world"
+

Note: If you created the cluster using the script, +your nodes may listen on different ports, starting from 30001 by +default.

+

The valkey-cli cluster support is very basic, so it +always uses the fact that Valkey Cluster nodes are able to redirect a +client to the right node. A serious client is able to do better than +that, and cache the map between hash slots and nodes addresses, to +directly use the right connection to the right node. The map is +refreshed only when something changed in the cluster configuration, for +example after a failover or after the system administrator changed the +cluster layout by adding or removing nodes.

+

Write an example app +with Valkey GLIDE

+

Before going forward showing how to operate the Valkey Cluster, doing +things like a failover, or a resharding, we need to create some example +application or at least to be able to understand the semantics of a +simple Valkey Cluster client interaction.

+

In this way we can run an example and at the same time try to make +nodes failing, or start a resharding, to see how Valkey Cluster behaves +under real world conditions. It is not very helpful to see what happens +while nobody is writing to the cluster.

+

This section explains some basic usage of Valkey +GLIDE for Node.js, the official Valkey client library, showing a +simple example application.

+

The following example demonstrates how to connect to a Valkey cluster +and perform basic operations. First, install the Valkey GLIDE +client:

+
npm install @valkey/valkey-glide
+

Here’s the example code:

+
import { GlideClusterClient } from "@valkey/valkey-glide";
+
+async function runExample() {
+    const addresses = [
+        {
+            host: "localhost",
+            port: 6379,
+        },
+    ];
+    // Check `GlideClientConfiguration/GlideClusterClientConfiguration` for additional options.
+    const client = await GlideClusterClient.createClient({
+        addresses: addresses,
+        // if the cluster nodes use TLS, you'll need to enable it. Otherwise the connection attempt will time out silently.
+        // useTLS: true,
+        // It is recommended to set a timeout for your specific use case
+        requestTimeout: 500, // 500ms timeout
+        clientName: "test_cluster_client",
+    });
+
+    try {
+
+        console.log("Connected to Valkey cluster");
+
+        // Get the last counter value, or start from 0
+        let last = await client.get("__last__");
+        last = last ? parseInt(last) : 0;
+
+        console.log(`Starting from counter: ${last}`);
+
+        // Write keys in batches using mset for better performance
+        const batchSize = 100;
+        for (let start = last + 1; start <= 1000000000; start += batchSize) {
+            try {
+                const keyValuePairs = [];
+                const end = Math.min(start + batchSize - 1, 1000000000);
+                
+                // Prepare batch of key-value pairs as array
+                for (let x = start; x <= end; x++) {
+                    keyValuePairs.push(`foo${x}`, x.toString());
+                }
+                
+                // Execute batch mset with array format
+                await client.mset(keyValuePairs);
+                
+                // Update counter and display progress
+                await client.set("__last__", end.toString());
+                console.log(`Batch completed: ${start} to ${end}`);
+                
+                // Verify a sample key from the batch
+                const sampleKey = `foo${start}`;
+                const value = await client.get(sampleKey);
+                console.log(`Sample verification - ${sampleKey}: ${value}`);
+                
+            } catch (error) {
+                console.log(`Error in batch starting at ${start}: ${error.message}`);
+            }
+        }
+    } catch (error) {
+        console.log(`Connection error: ${error.message}`);
+    } finally {
+        client.close();
+    }
+}
+
+runExample().catch(console.error);
+

The application does a very simple thing, it sets keys in the form +foo<number> to number, using batched +MSET operations for better performance. The MSET command accepts an +array of alternating keys and values. So if you run the program the +result is batches of MSET commands:

+
    +
  • MSET foo1 1 foo2 2 foo3 3 … foo100 100 (batch of 100 keys)
  • +
  • MSET foo101 101 foo102 102 … foo200 200 (next batch)
  • +
  • And so forth…
  • +
+

The program includes comprehensive error handling to display errors +instead of crashing, so all cluster operations are wrapped in try-catch +blocks.

+

The client creation section is the first key part of +the program. It creates the Valkey cluster client using a list of +cluster addresses and configuration options including a request +timeout and client name.

+

The addresses don’t need to be all the nodes of the cluster. The +important thing is that at least one node is reachable. Valkey GLIDE +automatically discovers the complete cluster topology once it connects +to any node.

+

Now that we have the cluster client instance, we can use it like any +other Valkey client to perform operations across the cluster.

+

The counter initialization section reads a counter +so that when we restart the example we don’t start again with +foo0, but continue from where we left off. The counter is +stored in Valkey itself using the key __last__.

+

The main processing loop sets keys in batches using +MSET operations for better performance, processing 100 keys at a time +and displaying progress or any errors that occur. you’ll get the usually +10k ops/second in the best of the conditions).

+

you’ll get optimal performance).

+

Starting the application produces the following output:

+
node example.js
+Connected to Valkey cluster
+Starting from counter: 0
+Batch completed: 1 to 100
+Sample verification - foo1: 1
+Batch completed: 101 to 200
+Sample verification - foo101: 101
+Batch completed: 201 to 300
+Sample verification - foo201: 201
+^C (I stopped the program here)
+

This is not a very interesting program and we’ll use a better one in +a moment but we can already see what happens during a resharding when +the program is running.

+

Reshard the cluster

+

Now we are ready to try a cluster resharding. To do this, please keep +the example.js program running, so that you can see if there is some +impact on the program running.

+

Resharding basically means to move hash slots from a set of nodes to +another set of nodes. Like cluster creation, it is accomplished using +the valkey-cli utility.

+

To start a resharding, just type:

+
valkey-cli --cluster reshard 127.0.0.1:7000
+

You only need to specify a single node, valkey-cli will find the +other nodes automatically.

+

Currently valkey-cli is only able to reshard with the administrator +support, you can’t just say move 5% of slots from this node to the other +one (but this is pretty trivial to implement). So it starts with +questions. The first is how much of a resharding do you want to do:

+
How many slots do you want to move (from 1 to 16384)?
+

We can try to reshard 1000 hash slots, that should already contain a +non trivial amount of keys if the example is still running without the +sleep call.

+

Then valkey-cli needs to know what is the target of the resharding, +that is, the node that will receive the hash slots. I’ll use the first +primary node, that is, 127.0.0.1:7000, but I need to specify the Node ID +of the instance. This was already printed in a list by valkey-cli, but I +can always find the ID of a node with the following command if I +need:

+
$ valkey-cli -p 7000 cluster nodes | grep myself
+97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5460
+

Ok so my target node is 97a3a64667477371c4479320d683e4c8db5858b1.

+

Now you’ll get asked from what nodes you want to take those keys. +I’ll just type all in order to take a bit of hash slots +from all the other primary nodes.

+

After the final confirmation you’ll see a message for every slot that +valkey-cli is going to move from a node to another, and a dot will be +printed for every actual key moved from one side to the other.

+

While the resharding is in progress you should be able to see your +example program running unaffected. You can stop and restart it multiple +times during the resharding if you want.

+

At the end of the resharding, you can test the health of the cluster +with the following command:

+
valkey-cli --cluster check 127.0.0.1:7000
+

All the slots will be covered as usual, but this time the primary at +127.0.0.1:7000 will have more hash slots, something around 6461.

+

Resharding can be performed automatically without the need to +manually enter the parameters in an interactive way. This is possible +using a command line like the following:

+
valkey-cli --cluster reshard <host>:<port> --cluster-from <node-id> --cluster-to <node-id> --cluster-slots <number of slots> --cluster-yes
+

This allows to build some automatism if you are likely to reshard +often, however currently there is no way for valkey-cli to +automatically rebalance the cluster checking the distribution of keys +across the cluster nodes and intelligently moving slots as needed. This +feature will be added in the future.

+

The --cluster-yes option instructs the cluster manager +to automatically answer “yes” to the command’s prompts, allowing it to +run in a non-interactive mode. Note that this option can also be +activated by setting the REDISCLI_CLUSTER_YES environment +variable.

+

A more interesting +example application

+

The example application we wrote early is not very good. It writes to +the cluster in a simple way without even checking if what was written is +the right thing.

+

From our point of view the cluster receiving the writes could just +always write the key foo to 42 to every +operation, and we would not notice at all.

+

Now we can write a more interesting application for testing cluster +behavior. A simple consistency checking application that uses a set of +counters, by default 1000, and sends INCR commands to +increment the counters.

+

However instead of just writing, the application does two additional +things:

+
    +
  • When a counter is updated using INCR, the application +remembers the write.
  • +
  • It also reads a random counter before every write, and check if the +value is what we expected it to be, comparing it with the value it has +in memory.
  • +
+

What this means is that this application is a simple +consistency checker, and is able to tell you if the +cluster lost some write, or if it accepted a write that we did not +receive acknowledgment for. In the first case we’ll see a counter having +a value that is smaller than the one we remember, while in the second +case the value will be greater.

+

Running a consistency testing application produces a line of output +every second:

+
node consistency-test.js
+925 R (0 err) | 925 W (0 err) |
+5030 R (0 err) | 5030 W (0 err) |
+9261 R (0 err) | 9261 W (0 err) |
+13517 R (0 err) | 13517 W (0 err) |
+17780 R (0 err) | 17780 W (0 err) |
+22025 R (0 err) | 22025 W (0 err) |
+25818 R (0 err) | 25818 W (0 err) |
+

The line shows the number of Reads and +Writes performed, and the number of errors (query not +accepted because of errors since the system was not available).

+

If some inconsistency is found, new lines are added to the output. +This is what happens, for example, if I reset a counter manually while +the program is running:

+
$ valkey-cli -h 127.0.0.1 -p 7000 set key_217 0
+OK
+
+(in the other tab I see...)
+
+94774 R (0 err) | 94774 W (0 err) |
+98821 R (0 err) | 98821 W (0 err) |
+102886 R (0 err) | 102886 W (0 err) | 114 lost |
+107046 R (0 err) | 107046 W (0 err) | 114 lost |
+

When I set the counter to 0 the real value was 114, so the program +reports 114 lost writes (INCR commands that are not +remembered by the cluster).

+

This program is much more interesting as a test case, so we’ll use it +to test the Valkey Cluster failover.

+

Test the failover

+

To trigger the failover, the simplest thing we can do (that is also +the semantically simplest failure that can occur in a distributed +system) is to crash a single process, in our case a single primary.

+

Note: During this test, you should take a tab open +with the consistency test application running.

+

We can identify a primary and crash it with the following +command:

+
$ valkey-cli -p 7000 cluster nodes | grep master
+3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385482984082 0 connected 5960-10921
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 master - 0 1385482983582 0 connected 11423-16383
+97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
+

Ok, so 7000, 7001, and 7002 are primaries. Let’s crash node 7002 with +the DEBUG SEGFAULT command:

+
$ valkey-cli -p 7002 debug segfault
+Error: Server closed the connection
+

Now we can look at the output of the consistency test to see what it +reported.

+
18849 R (0 err) | 18849 W (0 err) |
+23151 R (0 err) | 23151 W (0 err) |
+27302 R (0 err) | 27302 W (0 err) |
+
+... many error warnings here ...
+
+29659 R (578 err) | 29660 W (577 err) |
+33749 R (578 err) | 33750 W (577 err) |
+37918 R (578 err) | 37919 W (577 err) |
+42077 R (578 err) | 42078 W (577 err) |
+

As you can see during the failover the system was not able to accept +578 reads and 577 writes, however no inconsistency was created in the +database. This may sound unexpected as in the first part of this +tutorial we stated that Valkey Cluster can lose writes during the +failover because it uses asynchronous replication. What we did not say +is that this is not very likely to happen because Valkey sends the reply +to the client, and the commands to replicate to the replicas, about at +the same time, so there is a very small window to lose data. However the +fact that it is hard to trigger does not mean that it is impossible, so +this does not change the consistency guarantees provided by Valkey +cluster.

+

We can now check what is the cluster setup after the failover (note +that in the meantime I restarted the crashed instance so that it rejoins +the cluster as a replica):

+
$ valkey-cli -p 7000 cluster nodes
+3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385503418521 0 connected
+a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385503419023 0 connected
+97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
+3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385503419023 3 connected 11423-16383
+3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385503417005 0 connected 5960-10921
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385503418016 3 connected
+

Now the primaries are running on ports 7000, 7001 and 7005. What was +previously a primary, that is the Valkey instance running on port 7002, +is now a replica of 7005.

+

The output of the CLUSTER NODES command may look +intimidating, but it is actually pretty simple, and is composed of the +following tokens:

+
    +
  • Node ID
  • +
  • ip:port
  • +
  • flags: master, replica, myself, fail, …
  • +
  • if it is a replica, the Node ID of the master
  • +
  • Time of the last pending PING still waiting for a reply.
  • +
  • Time of the last PONG received.
  • +
  • Configuration epoch for this node (see the Cluster +specification).
  • +
  • Status of the link to this node.
  • +
  • Slots served…
  • +
+

Manual failover

+

Sometimes it is useful to force a failover without actually causing +any problem on a primary. For example, to upgrade the Valkey process of +one of the primary nodes it is a good idea to failover it to turn it +into a replica with minimal impact on availability.

+

Manual failovers are supported by Valkey Cluster using the +CLUSTER FAILOVER command, that must be executed in one of +the replicas of the primary you want to failover.

+

Manual failovers are special and are safer compared to failovers +resulting from actual primary failures. They occur in a way that avoids +data loss in the process, by switching clients from the original primary +to the new primary only when the system is sure that the new primary +processed all the replication stream from the old one.

+

This is what you see in the replica log when you perform a manual +failover:

+
# Manual failover user request accepted.
+# Received replication offset for paused primary manual failover: 347540
+# All primary replication stream processed, manual failover can start.
+# Start of election delayed for 0 milliseconds (rank #0, offset 347540).
+# Starting a failover election for epoch 7545.
+# Failover election won: I'm the new primary.
+

Clients sending write commands to the primary are blocked during the +failover. When the primary sends its replication offset to the replica, +the replica waits to reach the offset on its side. When the replication +offset is reached, the failover starts, and the old primary is informed +about the configuration switch. When the switch is complete, the clients +are unblocked on the old primary and they are redirected to the new +primary.

+

Note: To promote a replica to primary, it must first +be known as a replica by a majority of the primaries in the cluster. +Otherwise, it cannot win the failover election. If the replica has just +been added to the cluster (see Add a new node as a replica), +you may need to wait a while before sending the +CLUSTER FAILOVER command, to make sure the primaries in +cluster are aware of the new replica.

+

Add a new node

+

Adding a new node is basically the process of adding an empty node +and then moving some data into it, in case it is a new primary, or +telling it to setup as a replica of a known node, in case it is a +replica.

+

We’ll show both, starting with the addition of a new primary +instance.

+

In both cases the first step to perform is adding an empty +node.

+

This is as simple as to start a new node in port 7006 (we already +used from 7000 to 7005 for our existing 6 nodes) with the same +configuration used for the other nodes, except for the port number, so +what you should do in order to conform with the setup we used for the +previous nodes:

+
    +
  • Create a new tab in your terminal application.
  • +
  • Enter the cluster-test directory.
  • +
  • Create a directory named 7006.
  • +
  • Create a valkey.conf file inside, similar to the one used for the +other nodes but using 7006 as port number.
  • +
  • Finally start the server with +../valkey-server ./valkey.conf
  • +
+

At this point the server should be running.

+

Now we can use valkey-cli as usual in order to add +the node to the existing cluster.

+
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000
+

As you can see I used the add-node command +specifying the address of the new node as first argument, and the +address of a random existing node in the cluster as second argument.

+

In practical terms valkey-cli here did very little to help us, it +just sent a CLUSTER MEET message to the node, something +that is also possible to accomplish manually. However valkey-cli also +checks the state of the cluster before to operate, so it is a good idea +to perform cluster operations always via valkey-cli even when you know +how the internals work.

+

Now we can connect to the new node to see if it really joined the +cluster:

+
valkey 127.0.0.1:7006> cluster nodes
+3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385543178575 0 connected 5960-10921
+3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385543179583 0 connected
+f093c80dde814da99c5cf72a7dd01590792b783b :0 myself,master - 0 0 0 connected
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543178072 3 connected
+a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385543178575 0 connected
+97a3a64667477371c4479320d683e4c8db5858b1 127.0.0.1:7000 master - 0 1385543179080 0 connected 0-5959 10922-11422
+3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385543177568 3 connected 11423-16383
+

Note that since this node is already connected to the cluster it is +already able to redirect client queries correctly and is generally +speaking part of the cluster. However it has two peculiarities compared +to the other primaries:

+
    +
  • It holds no data as it has no assigned hash slots.
  • +
  • Because it is a primary without assigned slots, it does not +participate in the election process when a replica wants to become a +primary.
  • +
+

Now it is possible to assign hash slots to this node using the +resharding feature of valkey-cli. It is basically useless +to show this as we already did in a previous section, there is no +difference, it is just a resharding having as a target the empty +node.

+
Add a new node as a replica
+

Adding a new replica can be performed in two ways. The obvious one is +to use valkey-cli again, but with the –cluster-replica option, like +this:

+
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica
+

Note that the command line here is exactly like the one we used to +add a new primary, so we are not specifying to which primary we want to +add the replica. In this case, what happens is that valkey-cli will add +the new node as replica of a random primary among the primaries with +fewer replicas.

+

However you can specify exactly what primary you want to target with +your new replica with the following command line:

+
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
+

This way we assign the new replica to a specific primary.

+

A more manual way to add a replica to a specific primary is to add +the new node as an empty primary, and then turn it into a replica using +the CLUSTER REPLICATE command. This also works if the node +was added as a replica but you want to move it as a replica of a +different primary.

+

For example in order to add a replica for the node 127.0.0.1:7005 +that is currently serving hash slots in the range 11423-16383, that has +a Node ID 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e, all I need to do is +to connect with the new node (already added as empty primary) and send +the command:

+
valkey 127.0.0.1:7006> cluster replicate 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
+

That’s it. Now we have a new replica for this set of hash slots, and +all the other nodes in the cluster already know (after a few seconds +needed to update their config). We can verify with the following +command:

+
$ valkey-cli -p 7000 cluster nodes | grep slave | grep 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
+f093c80dde814da99c5cf72a7dd01590792b783b 127.0.0.1:7006 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617702 3 connected
+2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617198 3 connected
+

The node 3c3a0c… now has two replicas, running on ports 7002 (the +existing one) and 7006 (the new one).

+

Remove a node

+

To remove a replica node just use the del-node command +of valkey-cli:

+
valkey-cli --cluster del-node 127.0.0.1:7000 `<node-id>`
+

The first argument is just a random node in the cluster, the second +argument is the ID of the node you want to remove.

+

You can remove a primary node in the same way as well, +however in order to remove a primary node it must be +empty. If the primary is not empty you need to reshard data +away from it to all the other primary nodes before.

+

An alternative to remove a primary node is to perform a manual +failover of it over one of its replicas and remove the node after it +turned into a replica of the new primary. Obviously this does not help +when you want to reduce the actual number of primaries in your cluster, +in that case, a resharding is needed.

+

There is a special scenario where you want to remove a failed node. +You should not use the del-node command because it tries to +connect to all nodes and you will encounter a “connection refused” +error. Instead, you can use the call command:

+
valkey-cli --cluster call 127.0.0.1:7000 cluster forget `<node-id>`
+

This command will execute CLUSTER FORGET command on +every node.

+

Replica migration

+

In Valkey Cluster, you can reconfigure a replica to replicate with a +different primary at any time just using this command:

+
CLUSTER REPLICATE <master-node-id>
+

However there is a special scenario where you want replicas to move +from one primary to another one automatically, without the help of the +system administrator. The automatic reconfiguration of replicas is +called replicas migration and is able to improve the +reliability of a Valkey Cluster.

+

Note: You can read the details of replicas migration +in the Valkey Cluster Specification, here +we’ll only provide some information about the general idea and what you +should do in order to benefit from it.

+

The reason why you may want to let your cluster replicas to move from +one primary to another under certain condition, is that usually the +Valkey Cluster is as resistant to failures as the number of replicas +attached to a given primary.

+

For example a cluster where every primary has a single replica can’t +continue operations if the primary and its replica fail at the same +time, simply because there is no other instance to have a copy of the +hash slots the primary was serving. However while net-splits are likely +to isolate a number of nodes at the same time, many other kind of +failures, like hardware or software failures local to a single node, are +a very notable class of failures that are unlikely to happen at the same +time, so it is possible that in your cluster where every primary has a +replica, the replica is killed at 4am, and the primary is killed at 6am. +This still will result in a cluster that can no longer operate.

+

To improve reliability of the system we have the option to add +additional replicas to every primary, but this is expensive. Replica +migration allows to add more replicas to just a few primaries. So you +have 10 primaries with 1 replica each, for a total of 20 instances. +However you add, for example, 3 instances more as replicas of some of +your primaries, so certain primaries will have more than a single +replica.

+

With replicas migration what happens is that if a primary is left +without replicas, a replica from a primary that has multiple replicas +will migrate to the orphaned primary. So after your replica +goes down at 4am as in the example we made above, another replica will +take its place, and when the primary will fail as well at 5am, there is +still a replica that can be elected so that the cluster can continue to +operate.

+

So what you should know about replicas migration in short?

+
    +
  • The cluster will try to migrate a replica from the primary that has +the greatest number of replicas in a given moment.
  • +
  • To benefit from replica migration you have just to add a few more +replicas to a single primary in your cluster, it does not matter what +primary.
  • +
  • There is a configuration parameter that controls the replica +migration feature that is called cluster-migration-barrier: +you can read more about it in the example valkey.conf file +provided with Valkey Cluster.
  • +
+

Upgrade nodes in a Valkey +Cluster

+

Upgrading replica nodes is easy since you just need to stop the node +and restart it with an updated version of Valkey. If there are clients +scaling reads using replica nodes, they should be able to reconnect to a +different replica if a given one is not available.

+

Upgrading primaries is a bit more complex. The suggested procedure is +to trigger a manual failover to turn the old primary into a replica and +then upgrading it.

+

A complete rolling upgrade of all nodes in a cluster can be performed +by repeating the following procedure for each shard (a primary and its +replicas):

+
    +
  1. Add one or more upgraded nodes as new replicas to the primary. +This step is optional but it ensures that the number of replicas is not +compromised during the rolling upgrade. To add a new node, use CLUSTER MEET and +CLUSTER REPLICATE +or use valkey-cli as described under Add a new node as a replica.

    +

    An alternative is to upgrade one replica at a time and have fewer +replicas online during the upgrade.

  2. +
  3. Upgrade the old replicas you want to keep by restarting them with +the updated version of Valkey. If you’re replacing all the old nodes +with new nodes, you can skip this step.

  4. +
  5. Select one of the upgraded replicas to be the new primary. Wait +until this replica has caught up the replication offset with the +primary. You can use INFO REPLICATION and check +for the line master_link_status:up to be present. This +indicates that the initial sync with the primary is complete.

    +

    After the initial full sync, the replica might still lag behind in +replication. Send INFO REPLICATION to the primary and the +replica and compare the field master_repl_offset returned +by both nodes. If the offsets match, it means that all writes have been +replicated. However, if the primary receives a constant stream of +writes, it’s possible that the offsets will never be equal. In this +step, you can accept a small difference. It’s usually enough to wait for +some seconds to minimize the difference.

  6. +
  7. Check that the new replica is known by all nodes in the cluster, +or at least by the primaries in the cluster. You can send CLUSTER NODES to +each of the nodes in the cluster and check that they all are aware of +the new node. Wait for some time and repeat the check if +necessary.

  8. +
  9. Trigger a manual failover by sending CLUSTER FAILOVER +to the replica node selected to become the new primary. See the Manual failover section in this document for +more information.

  10. +
  11. Wait for the failover to complete. To check, you can use ROLE, INFO REPLICATION (which +indicates role:master after successful failover) or CLUSTER NODES to +verify that the state of the cluster has changed shortly after the +command was sent.

  12. +
  13. Take the old primary (now a replica) out of service, or upgrade +it and add it again as a replica. Remove additional replicas kept for +redundancy during the upgrade, if any.

  14. +
+

Repeat this sequence for each shard (each primary and its replicas) +until all nodes in the cluster have been upgraded.

+

Migrate to Valkey Cluster

+

Users willing to migrate to Valkey Cluster may have just a single +primary, or may already using a preexisting sharding setup, where keys +are split among N nodes, using some in-house algorithm or a sharding +algorithm implemented by their client library or Valkey proxy.

+

In both cases it is possible to migrate to Valkey Cluster easily, +however what is the most important detail is if multiple-keys operations +are used by the application, and how. There are three different +cases:

+
    +
  1. Multiple keys operations, or transactions, or Lua scripts involving +multiple keys, are not used. Keys are accessed independently (even if +accessed via transactions or Lua scripts grouping multiple commands, +about the same key, together).
  2. +
  3. Multiple keys operations, or transactions, or Lua scripts involving +multiple keys are used but only with keys having the same hash +tag, which means that the keys used together all have a +{...} sub-string that happens to be identical. For example +the following multiple keys operation is defined in the context of the +same hash tag: SUNION {user:1000}.foo {user:1000}.bar.
  4. +
  5. Multiple keys operations, or transactions, or Lua scripts involving +multiple keys are used with key names not having an explicit, or the +same, hash tag.
  6. +
+

The third case is not handled by Valkey Cluster: the application +requires to be modified in order to not use multi keys operations or +only use them in the context of the same hash tag.

+

Case 1 and 2 are covered, so we’ll focus on those two cases, that are +handled in the same way, so no distinction will be made in the +documentation.

+

Assuming you have your preexisting data set split into N primaries, +where N=1 if you have no preexisting sharding, the following steps are +needed in order to migrate your data set to Valkey Cluster:

+
    +
  1. Stop your clients. No automatic live-migration to Valkey Cluster is +currently possible. You may be able to do it orchestrating a live +migration in the context of your application / environment.
  2. +
  3. Generate an append only file for all of your N primaries using the +BGREWRITEAOF command, and waiting for the AOF file to be +completely generated.
  4. +
  5. Save your AOF files from aof-1 to aof-N somewhere. At this point you +can stop your old instances if you wish (this is useful since in +non-virtualized deployments you often need to reuse the same +computers).
  6. +
  7. Create a Valkey Cluster composed of N primaries and zero replicas. +You’ll add replicas later. Make sure all your nodes are using the append +only file for persistence.
  8. +
  9. Stop all the cluster nodes, substitute their append only file with +your pre-existing append only files, aof-1 for the first node, aof-2 for +the second node, up to aof-N.
  10. +
  11. Restart your Valkey Cluster nodes with the new AOF files. They’ll +complain that there are keys that should not be there according to their +configuration.
  12. +
  13. Use valkey-cli --cluster fix command in order to fix +the cluster so that keys will be migrated according to the hash slots +each node is authoritative or not.
  14. +
  15. Use valkey-cli --cluster check at the end to make sure +your cluster is ok.
  16. +
  17. Restart your clients modified to use a Valkey Cluster aware client +library.
  18. +
+

There is an alternative way to import data from external instances to +a Valkey Cluster, which is to use the +valkey-cli --cluster import command.

+

The command moves all the keys of a running instance (deleting the +keys from the source instance) to the specified pre-existing Valkey +Cluster.

+

Note: If not for backward compatibility, the Valkey +project no longer uses the words “master” and “slave”. Unfortunately in +this command these words are part of the protocol, so we’ll be able to +remove such occurrences only when this API will be naturally +deprecated.

+

Learn more

+ + + diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index ab0ee0400..19d5d78ab 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -379,24 +379,23 @@ Here's the example code: import { GlideClusterClient } from "@valkey/valkey-glide"; async function runExample() { - // Define cluster addresses - you only need one reachable node const addresses = [ - { host: "127.0.0.1", port: 7000 }, - { host: "127.0.0.1", port: 7001 } + { + host: "localhost", + port: 6379, + }, ]; + // Check `GlideClientConfiguration/GlideClusterClientConfiguration` for additional options. + const client = await GlideClusterClient.createClient({ + addresses: addresses, + // if the cluster nodes use TLS, you'll need to enable it. Otherwise the connection attempt will time out silently. + // useTLS: true, + // It is recommended to set a timeout for your specific use case + requestTimeout: 500, // 500ms timeout + clientName: "test_cluster_client", + }); - let client; - try { - // Create cluster client with configuration - client = await GlideClusterClient.createClient({ - addresses: addresses, - clientConfiguration: { - requestTimeout: 500, // 500ms timeout - clientName: "valkey_cluster_example" - } - }); - console.log("Connected to Valkey cluster"); // Get the last counter value, or start from 0 @@ -436,9 +435,7 @@ async function runExample() { } catch (error) { console.log(`Connection error: ${error.message}`); } finally { - if (client) { - client.close(); - } + client.close(); } } diff --git a/wordlist b/wordlist index ad9294643..a5fc4a0c0 100644 --- a/wordlist +++ b/wordlist @@ -280,7 +280,7 @@ first-args firstkey FlameGraph fmt -foo[0-9] +foo[0-9]+ formatter fp_error france_location From 8078f287a582cea7fc7dabf96c3bdcba3a6843c6 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 22:50:23 +0000 Subject: [PATCH 09/25] Remove HTML test files and update .gitignore Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- .gitignore | 1 + _test/cluster-tutorial-final.html | 1283 ---------------------------- _test/cluster-tutorial.html | 1284 ----------------------------- 3 files changed, 1 insertion(+), 2567 deletions(-) delete mode 100644 _test/cluster-tutorial-final.html delete mode 100644 _test/cluster-tutorial.html diff --git a/.gitignore b/.gitignore index a36a5b23d..60e28e75d 100644 --- a/.gitignore +++ b/.gitignore @@ -3,3 +3,4 @@ tmp .DS_Store _build __pycache__ +_test/*.html diff --git a/_test/cluster-tutorial-final.html b/_test/cluster-tutorial-final.html deleted file mode 100644 index 98e95dad7..000000000 --- a/_test/cluster-tutorial-final.html +++ /dev/null @@ -1,1283 +0,0 @@ - - - - - - - - Cluster tutorial - - - -
-

Cluster tutorial

-
-

Valkey scales horizontally with a deployment topology called Valkey -Cluster. This topic will teach you how to set up, test, and operate -Valkey Cluster in production. You will learn about the availability and -consistency characteristics of Valkey Cluster from the end user’s point -of view.

-

If you plan to run a production Valkey Cluster deployment or want to -understand better how Valkey Cluster works internally, consult the Valkey Cluster specification.

-

Valkey Cluster 101

-

Valkey Cluster provides a way to run a Valkey installation where data -is automatically sharded across multiple Valkey nodes. Valkey Cluster -also provides some degree of availability during partitions—in practical -terms, the ability to continue operations when some nodes fail or are -unable to communicate. However, the cluster will become unavailable in -the event of larger failures (for example, when the majority of -primaries are unavailable).

-

So, with Valkey Cluster, you get the ability to:

-
    -
  • Automatically split your dataset among multiple nodes.
  • -
  • Continue operations when a subset of the nodes are experiencing -failures or are unable to communicate with the rest of the cluster.
  • -
-

Valkey Cluster TCP ports

-

Every Valkey Cluster node requires two open TCP connections: a Valkey -TCP port used to serve clients, e.g., 6379, and second port known as the -cluster bus port. By default, the cluster bus port is set by -adding 10000 to the data port (e.g., 16379); however, you can override -this in the cluster-port configuration.

-

Cluster bus is a node-to-node communication channel that uses a -binary protocol, which is more suited to exchanging information between -nodes due to little bandwidth and processing time. Nodes use the cluster -bus for failure detection, configuration updates, failover -authorization, and so forth. Clients should never try to communicate -with the cluster bus port, but rather use the Valkey command port. -However, make sure you open both ports in your firewall, otherwise -Valkey cluster nodes won’t be able to communicate.

-

For a Valkey Cluster to work properly you need, for each node:

-
    -
  1. The client communication port (usually 6379) used to communicate -with clients and be open to all the clients that need to reach the -cluster, plus all the other cluster nodes that use the client port for -key migrations.
  2. -
  3. The cluster bus port must be reachable from all the other cluster -nodes.
  4. -
-

If you don’t open both TCP ports, your cluster will not work as -expected.

-

Valkey Cluster and Docker

-

Currently, Valkey Cluster does not support NATted environments and in -general environments where IP addresses or TCP ports are remapped.

-

Docker uses a technique called port mapping: programs -running inside Docker containers may be exposed with a different port -compared to the one the program believes to be using. This is useful for -running multiple containers using the same ports, at the same time, in -the same server.

-

To make Docker compatible with Valkey Cluster, you need to use -Docker’s host networking mode. Please see the ---net=host option in the Docker -documentation for more information.

-

Valkey Cluster data sharding

-

Valkey Cluster does not use consistent hashing, but a different form -of sharding where every key is conceptually part of what we call a -hash slot.

-

There are 16384 hash slots in Valkey Cluster, and to compute the hash -slot for a given key, we simply take the CRC16 of the key modulo -16384.

-

Every node in a Valkey Cluster is responsible for a subset of the -hash slots, so, for example, you may have a cluster with 3 nodes, -where:

-
    -
  • Node A contains hash slots from 0 to 5500.
  • -
  • Node B contains hash slots from 5501 to 11000.
  • -
  • Node C contains hash slots from 11001 to 16383.
  • -
-

This makes it easy to add and remove cluster nodes. For example, if I -want to add a new node D, I need to move some hash slots from nodes A, -B, C to D. Similarly, if I want to remove node A from the cluster, I can -just move the hash slots served by A to B and C. Once node A is empty, I -can remove it from the cluster completely.

-

Moving hash slots from a node to another does not require stopping -any operations; therefore, adding and removing nodes, or changing the -percentage of hash slots held by a node, requires no downtime.

-

Valkey Cluster supports multiple key operations as long as all of the -keys involved in a single command execution (or whole transaction, or -Lua script execution) belong to the same hash slot. The user can force -multiple keys to be part of the same hash slot by using a feature called -hash tags.

-

Hash tags are documented in the Valkey Cluster specification, but the -gist is that if there is a substring between {} brackets in a key, only -what is inside the string is hashed. For example, the keys -user:{123}:profile and user:{123}:account are -guaranteed to be in the same hash slot because they share the same hash -tag. As a result, you can operate on these two keys in the same -multi-key operation.

-

Valkey Cluster -primary-replica model

-

To remain available when a subset of primary nodes are failing or are -not able to communicate with the majority of nodes, Valkey Cluster uses -a primary-replica model where every hash slot has from 1 (the primary -itself) to N replicas (N-1 additional replica nodes).

-

In our example cluster with nodes A, B, C, if node B fails the -cluster is not able to continue, since we no longer have a way to serve -hash slots in the range 5501-11000.

-

However, when the cluster is created (or at a later time), we add a -replica node to every primary, so that the final cluster is composed of -A, B, C that are primary nodes, and A1, B1, C1 that are replica nodes. -This way, the system can continue if node B fails.

-

Node B1 replicates B, and B fails, the cluster will promote node B1 -as the new primary and will continue to operate correctly.

-

However, note that if nodes B and B1 fail at the same time, Valkey -Cluster will not be able to continue to operate.

-

Valkey Cluster -consistency guarantees

-

Valkey Cluster does not guarantee strong -consistency. In practical terms this means that under certain -conditions it is possible that Valkey Cluster will lose writes that were -acknowledged by the system to the client.

-

The first reason why Valkey Cluster can lose writes is because it -uses asynchronous replication. This means that during writes the -following happens:

-
    -
  • Your client writes to the primary B.
  • -
  • The primary B replies OK to your client.
  • -
  • The primary B propagates the write to its replicas B1, B2 and -B3.
  • -
-

As you can see, B does not wait for an acknowledgement from B1, B2, -B3 before replying to the client, since this would be a prohibitive -latency penalty for Valkey, so if your client writes something, B -acknowledges the write, but crashes before being able to send the write -to its replicas, one of the replicas (that did not receive the write) -can be promoted to primary, losing the write forever.

-

This is very similar to what happens with most databases that are -configured to flush data to disk every second, so it is a scenario you -are already able to reason about because of past experiences with -traditional database systems not involving distributed systems. -Similarly you can improve consistency by forcing the database to flush -data to disk before replying to the client, but this usually results in -prohibitively low performance. That would be the equivalent of -synchronous replication in the case of Valkey Cluster.

-

Basically, there is a trade-off to be made between performance and -consistency.

-

Valkey Cluster has support for synchronous writes when absolutely -needed, implemented via the WAIT command. This makes losing -writes a lot less likely. However, note that Valkey Cluster does not -implement strong consistency even when synchronous replication is used: -it is always possible, under more complex failure scenarios, that a -replica that was not able to receive the write will be elected as -primary.

-

There is another notable scenario where Valkey Cluster will lose -writes, that happens during a network partition where a client is -isolated with a minority of instances including at least a primary.

-

Take as an example our 6 nodes cluster composed of A, B, C, A1, B1, -C1, with 3 primaries and 3 replicas. There is also a client, that we -will call Z1.

-

After a partition occurs, it is possible that in one side of the -partition we have A, C, A1, B1, C1, and in the other side we have B and -Z1.

-

Z1 is still able to write to B, which will accept its writes. If the -partition heals in a very short time, the cluster will continue -normally. However, if the partition lasts enough time for B1 to be -promoted to primary on the majority side of the partition, the writes -that Z1 has sent to B in the meantime will be lost.

-

Note: There is a maximum window to -the amount of writes Z1 will be able to send to B: if enough time has -elapsed for the majority side of the partition to elect a replica as -primary, every primary node in the minority side will have stopped -accepting writes.

-

This amount of time is a very important configuration directive of -Valkey Cluster, and is called the node timeout.

-

After node timeout has elapsed, a primary node is considered to be -failing, and can be replaced by one of its replicas. Similarly, after -node timeout has elapsed without a primary node to be able to sense the -majority of the other primary nodes, it enters an error state and stops -accepting writes.

-

Valkey Cluster -configuration parameters

-

We are about to create an example cluster deployment. Before we -continue, let’s introduce the configuration parameters that Valkey -Cluster introduces in the valkey.conf file.

-
    -
  • cluster-enabled <yes/no>: If -yes, enables Valkey Cluster support in a specific Valkey instance. -Otherwise the instance starts as a standalone instance as usual.
  • -
  • cluster-config-file <filename>: -Note that despite the name of this option, this is not a user editable -configuration file, but the file where a Valkey Cluster node -automatically persists the cluster configuration (the state, basically) -every time there is a change, in order to be able to re-read it at -startup. The file lists things like the other nodes in the cluster, -their state, persistent variables, and so forth. Often this file is -rewritten and flushed on disk as a result of some message -reception.
  • -
  • cluster-node-timeout -<milliseconds>: The maximum amount of time a -Valkey Cluster node can be unavailable, without it being considered as -failing. If a primary node is not reachable for more than the specified -amount of time, it will be failed over by its replicas. This parameter -controls other important things in Valkey Cluster. Notably, every node -that can’t reach the majority of primary nodes for the specified amount -of time, will stop accepting queries.
  • -
  • cluster-replica-validity-factor -<factor>: If set to zero, a replica will -always consider itself valid, and will therefore always try to failover -a primary, regardless of the amount of time the link between the primary -and the replica remained disconnected. If the value is positive, a -maximum disconnection time is calculated as the node timeout -value multiplied by the factor provided with this option, and if the -node is a replica, it will not try to start a failover if the primary -link was disconnected for more than the specified amount of time. For -example, if the node timeout is set to 5 seconds and the validity factor -is set to 10, a replica disconnected from the primary for more than 50 -seconds will not try to failover its primary. Note that any value -different than zero may result in Valkey Cluster being unavailable after -a primary failure if there is no replica that is able to failover it. In -that case the cluster will return to being available only when the -original primary rejoins the cluster.
  • -
  • cluster-migration-barrier -<count>: Minimum number of replicas a -primary will remain connected with, for another replica to migrate to a -primary which is no longer covered by any replica. See the appropriate -section about replica migration in this tutorial for more -information.
  • -
  • cluster-require-full-coverage -<yes/no>: If this is set to yes, as it is by -default, the cluster stops accepting writes if some percentage of the -key space is not covered by any node. If the option is set to no, the -cluster will still serve queries even if only requests about a subset of -keys can be processed.
  • -
  • cluster-allow-reads-when-down -<yes/no>: If this is set to no, as it is by -default, a node in a Valkey Cluster will stop serving all traffic when -the cluster is marked as failed, either when a node can’t reach a quorum -of primaries or when full coverage is not met. This prevents reading -potentially inconsistent data from a node that is unaware of changes in -the cluster. This option can be set to yes to allow reads from a node -during the fail state, which is useful for applications that want to -prioritize read availability but still want to prevent inconsistent -writes. It can also be used for when using Valkey Cluster with only one -or two shards, as it allows the nodes to continue serving writes when a -primary fails but automatic failover is impossible.
  • -
-

Create and use a Valkey -Cluster

-

To create and use a Valkey Cluster, follow these steps:

- -

But, first, familiarize yourself with the requirements for creating a -cluster.

-

Requirements to create -a Valkey Cluster

-

To create a cluster, the first thing you need is to have a few empty -Valkey instances running in cluster mode.

-

At minimum, set the following directives in the -valkey.conf file:

-
port 7000
-cluster-enabled yes
-cluster-config-file nodes.conf
-cluster-node-timeout 5000
-appendonly yes
-

To enable cluster mode, set the cluster-enabled -directive to yes. Every instance also contains the path of -a file where the configuration for this node is stored, which by default -is nodes.conf. This file is never touched by humans; it is -simply generated at startup by the Valkey Cluster instances, and updated -every time it is needed.

-

Note that the minimal cluster that works as expected -must contain at least three primary nodes. For deployment, we strongly -recommend a six-node cluster, with three primaries and three -replicas.

-

You can test this locally by creating the following directories named -after the port number of the instance you’ll run inside any given -directory.

-

For example:

-
mkdir cluster-test
-cd cluster-test
-mkdir 7000 7001 7002 7003 7004 7005
-

Create a valkey.conf file inside each of the -directories, from 7000 to 7005. As a template for your configuration -file just use the small example above, but make sure to replace the port -number 7000 with the right port number according to the -directory name.

-

You can start each instance as follows, each running in a separate -terminal tab:

-
cd 7000
-valkey-server ./valkey.conf
-

You’ll see from the logs that every node assigns itself a new ID:

-
[82462] 26 Nov 11:56:55.329 * No cluster configuration found, I'm 97a3a64667477371c4479320d683e4c8db5858b1
-

This ID will be used forever by this specific instance in order for -the instance to have a unique name in the context of the cluster. Every -node remembers every other node using this IDs, and not by IP or port. -IP addresses and ports may change, but the unique node identifier will -never change for all the life of the node. We call this identifier -simply Node ID.

-

Create a Valkey Cluster

-

Now that we have a number of instances running, you need to create -your cluster by writing some meaningful configuration to the nodes.

-

You can configure and execute individual instances manually or use -the create-cluster script. Let’s go over how you do it manually.

-

To create the cluster, run:

-
valkey-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \
-127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
---cluster-replicas 1
-

The command used here is create, since we want to -create a new cluster. The option --cluster-replicas 1 means -that we want a replica for every primary created.

-

The other arguments are the list of addresses of the instances I want -to use to create the new cluster.

-

valkey-cli will propose a configuration. Accept the -proposed configuration by typing yes. The cluster will -be configured and joined, which means that instances will be -bootstrapped into talking with each other. Finally, if everything has -gone well, you’ll see a message like this:

-
[OK] All 16384 slots covered
-

This means that there is at least one primary instance serving each -of the 16384 available slots.

-

If you don’t want to create a Valkey Cluster by configuring and -executing individual instances manually as explained above, there is a -much simpler system (but you’ll not learn the same amount of operational -details).

-

Find the utils/create-cluster directory in the Valkey -distribution. There is a script called create-cluster -inside (same name as the directory it is contained into), it’s a simple -bash script. In order to start a 6 nodes cluster with 3 primaries and 3 -replicas just type the following commands:

-
    -
  1. create-cluster start
  2. -
  3. create-cluster create
  4. -
-

Reply to yes in step 2 when the valkey-cli -utility wants you to accept the cluster layout.

-

You can now interact with the cluster, the first node will start at -port 30001 by default. When you are done, stop the cluster with:

-
    -
  1. create-cluster stop
  2. -
-

Please read the README inside this directory for more -information on how to run the script.

-

Interact with the cluster

-

To connect to Valkey Cluster, you’ll need a cluster-aware Valkey -client. See the documentation for your client of -choice to determine its cluster support.

-

You can also test your Valkey Cluster using the -valkey-cli command line utility:

-
$ valkey-cli -c -p 7000
-127.0.0.1:7000> set foo bar
--> Redirected to slot [12182] located at 127.0.0.1:7002
-OK
-127.0.0.1:7002> set hello world
--> Redirected to slot [866] located at 127.0.0.1:7000
-OK
-127.0.0.1:7000> get foo
--> Redirected to slot [12182] located at 127.0.0.1:7002
-"bar"
-127.0.0.1:7002> get hello
--> Redirected to slot [866] located at 127.0.0.1:7000
-"world"
-

Note: If you created the cluster using the script, -your nodes may listen on different ports, starting from 30001 by -default.

-

The valkey-cli cluster support is very basic, so it -always uses the fact that Valkey Cluster nodes are able to redirect a -client to the right node. A serious client is able to do better than -that, and cache the map between hash slots and nodes addresses, to -directly use the right connection to the right node. The map is -refreshed only when something changed in the cluster configuration, for -example after a failover or after the system administrator changed the -cluster layout by adding or removing nodes.

-

Write an example app -with Valkey GLIDE

-

Before going forward showing how to operate the Valkey Cluster, doing -things like a failover, or a resharding, we need to create some example -application or at least to be able to understand the semantics of a -simple Valkey Cluster client interaction.

-

In this way we can run an example and at the same time try to make -nodes failing, or start a resharding, to see how Valkey Cluster behaves -under real world conditions. It is not very helpful to see what happens -while nobody is writing to the cluster.

-

This section explains some basic usage of Valkey -GLIDE for Node.js, the official Valkey client library, showing a -simple example application.

-

The following example demonstrates how to connect to a Valkey cluster -and perform basic operations. First, install the Valkey GLIDE -client:

-
npm install @valkey/valkey-glide
-

Here’s the example code:

-
import { GlideClusterClient } from "@valkey/valkey-glide";
-
-async function runExample() {
-    const addresses = [
-        {
-            host: "localhost",
-            port: 6379,
-        },
-    ];
-    // Check `GlideClientConfiguration/GlideClusterClientConfiguration` for additional options.
-    const client = await GlideClusterClient.createClient({
-        addresses: addresses,
-        // if the cluster nodes use TLS, you'll need to enable it. Otherwise the connection attempt will time out silently.
-        // useTLS: true,
-        // It is recommended to set a timeout for your specific use case
-        requestTimeout: 500, // 500ms timeout
-        clientName: "test_cluster_client",
-    });
-
-    try {
-        console.log("Connected to Valkey cluster");
-
-        // Get the last counter value, or start from 0
-        let last = await client.get("__last__");
-        last = last ? parseInt(last) : 0;
-
-        console.log(`Starting from counter: ${last}`);
-
-        // Write keys in batches using mset for better performance
-        const batchSize = 100;
-        for (let start = last + 1; start <= 1000000000; start += batchSize) {
-            try {
-                const keyValuePairs = [];
-                const end = Math.min(start + batchSize - 1, 1000000000);
-                
-                // Prepare batch of key-value pairs as array
-                for (let x = start; x <= end; x++) {
-                    keyValuePairs.push(`foo${x}`, x.toString());
-                }
-                
-                // Execute batch mset with array format
-                await client.mset(keyValuePairs);
-                
-                // Update counter and display progress
-                await client.set("__last__", end.toString());
-                console.log(`Batch completed: ${start} to ${end}`);
-                
-                // Verify a sample key from the batch
-                const sampleKey = `foo${start}`;
-                const value = await client.get(sampleKey);
-                console.log(`Sample verification - ${sampleKey}: ${value}`);
-                
-            } catch (error) {
-                console.log(`Error in batch starting at ${start}: ${error.message}`);
-            }
-        }
-    } catch (error) {
-        console.log(`Connection error: ${error.message}`);
-    } finally {
-        client.close();
-    }
-}
-
-runExample().catch(console.error);
-

The application does a very simple thing, it sets keys in the form -foo<number> to number, using batched -MSET operations for better performance. The MSET command accepts an -array of alternating keys and values. So if you run the program the -result is batches of MSET commands:

-
    -
  • MSET foo1 1 foo2 2 foo3 3 … foo100 100 (batch of 100 keys)
  • -
  • MSET foo101 101 foo102 102 … foo200 200 (next batch)
  • -
  • And so forth…
  • -
-

The program includes comprehensive error handling to display errors -instead of crashing, so all cluster operations are wrapped in try-catch -blocks.

-

The client creation section is the first key part of -the program. It creates the Valkey cluster client using a list of -cluster addresses and configuration options including a request -timeout and client name.

-

The addresses don’t need to be all the nodes of the cluster. The -important thing is that at least one node is reachable. Valkey GLIDE -automatically discovers the complete cluster topology once it connects -to any node.

-

Now that we have the cluster client instance, we can use it like any -other Valkey client to perform operations across the cluster.

-

The counter initialization section reads a counter -so that when we restart the example we don’t start again with -foo0, but continue from where we left off. The counter is -stored in Valkey itself using the key __last__.

-

The main processing loop sets keys in batches using -MSET operations for better performance, processing 100 keys at a time -and displaying progress or any errors that occur. you’ll get the usually -10k ops/second in the best of the conditions).

-

you’ll get optimal performance).

-

Starting the application produces the following output:

-
node example.js
-Connected to Valkey cluster
-Starting from counter: 0
-Batch completed: 1 to 100
-Sample verification - foo1: 1
-Batch completed: 101 to 200
-Sample verification - foo101: 101
-Batch completed: 201 to 300
-Sample verification - foo201: 201
-^C (I stopped the program here)
-

This is not a very interesting program and we’ll use a better one in -a moment but we can already see what happens during a resharding when -the program is running.

-

Reshard the cluster

-

Now we are ready to try a cluster resharding. To do this, please keep -the example.js program running, so that you can see if there is some -impact on the program running.

-

Resharding basically means to move hash slots from a set of nodes to -another set of nodes. Like cluster creation, it is accomplished using -the valkey-cli utility.

-

To start a resharding, just type:

-
valkey-cli --cluster reshard 127.0.0.1:7000
-

You only need to specify a single node, valkey-cli will find the -other nodes automatically.

-

Currently valkey-cli is only able to reshard with the administrator -support, you can’t just say move 5% of slots from this node to the other -one (but this is pretty trivial to implement). So it starts with -questions. The first is how much of a resharding do you want to do:

-
How many slots do you want to move (from 1 to 16384)?
-

We can try to reshard 1000 hash slots, that should already contain a -non trivial amount of keys if the example is still running without the -sleep call.

-

Then valkey-cli needs to know what is the target of the resharding, -that is, the node that will receive the hash slots. I’ll use the first -primary node, that is, 127.0.0.1:7000, but I need to specify the Node ID -of the instance. This was already printed in a list by valkey-cli, but I -can always find the ID of a node with the following command if I -need:

-
$ valkey-cli -p 7000 cluster nodes | grep myself
-97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5460
-

Ok so my target node is 97a3a64667477371c4479320d683e4c8db5858b1.

-

Now you’ll get asked from what nodes you want to take those keys. -I’ll just type all in order to take a bit of hash slots -from all the other primary nodes.

-

After the final confirmation you’ll see a message for every slot that -valkey-cli is going to move from a node to another, and a dot will be -printed for every actual key moved from one side to the other.

-

While the resharding is in progress you should be able to see your -example program running unaffected. You can stop and restart it multiple -times during the resharding if you want.

-

At the end of the resharding, you can test the health of the cluster -with the following command:

-
valkey-cli --cluster check 127.0.0.1:7000
-

All the slots will be covered as usual, but this time the primary at -127.0.0.1:7000 will have more hash slots, something around 6461.

-

Resharding can be performed automatically without the need to -manually enter the parameters in an interactive way. This is possible -using a command line like the following:

-
valkey-cli --cluster reshard <host>:<port> --cluster-from <node-id> --cluster-to <node-id> --cluster-slots <number of slots> --cluster-yes
-

This allows to build some automatism if you are likely to reshard -often, however currently there is no way for valkey-cli to -automatically rebalance the cluster checking the distribution of keys -across the cluster nodes and intelligently moving slots as needed. This -feature will be added in the future.

-

The --cluster-yes option instructs the cluster manager -to automatically answer “yes” to the command’s prompts, allowing it to -run in a non-interactive mode. Note that this option can also be -activated by setting the REDISCLI_CLUSTER_YES environment -variable.

-

A more interesting -example application

-

The example application we wrote early is not very good. It writes to -the cluster in a simple way without even checking if what was written is -the right thing.

-

From our point of view the cluster receiving the writes could just -always write the key foo to 42 to every -operation, and we would not notice at all.

-

Now we can write a more interesting application for testing cluster -behavior. A simple consistency checking application that uses a set of -counters, by default 1000, and sends INCR commands to -increment the counters.

-

However instead of just writing, the application does two additional -things:

-
    -
  • When a counter is updated using INCR, the application -remembers the write.
  • -
  • It also reads a random counter before every write, and check if the -value is what we expected it to be, comparing it with the value it has -in memory.
  • -
-

What this means is that this application is a simple -consistency checker, and is able to tell you if the -cluster lost some write, or if it accepted a write that we did not -receive acknowledgment for. In the first case we’ll see a counter having -a value that is smaller than the one we remember, while in the second -case the value will be greater.

-

Running a consistency testing application produces a line of output -every second:

-
node consistency-test.js
-925 R (0 err) | 925 W (0 err) |
-5030 R (0 err) | 5030 W (0 err) |
-9261 R (0 err) | 9261 W (0 err) |
-13517 R (0 err) | 13517 W (0 err) |
-17780 R (0 err) | 17780 W (0 err) |
-22025 R (0 err) | 22025 W (0 err) |
-25818 R (0 err) | 25818 W (0 err) |
-

The line shows the number of Reads and -Writes performed, and the number of errors (query not -accepted because of errors since the system was not available).

-

If some inconsistency is found, new lines are added to the output. -This is what happens, for example, if I reset a counter manually while -the program is running:

-
$ valkey-cli -h 127.0.0.1 -p 7000 set key_217 0
-OK
-
-(in the other tab I see...)
-
-94774 R (0 err) | 94774 W (0 err) |
-98821 R (0 err) | 98821 W (0 err) |
-102886 R (0 err) | 102886 W (0 err) | 114 lost |
-107046 R (0 err) | 107046 W (0 err) | 114 lost |
-

When I set the counter to 0 the real value was 114, so the program -reports 114 lost writes (INCR commands that are not -remembered by the cluster).

-

This program is much more interesting as a test case, so we’ll use it -to test the Valkey Cluster failover.

-

Test the failover

-

To trigger the failover, the simplest thing we can do (that is also -the semantically simplest failure that can occur in a distributed -system) is to crash a single process, in our case a single primary.

-

Note: During this test, you should take a tab open -with the consistency test application running.

-

We can identify a primary and crash it with the following -command:

-
$ valkey-cli -p 7000 cluster nodes | grep master
-3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385482984082 0 connected 5960-10921
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 master - 0 1385482983582 0 connected 11423-16383
-97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
-

Ok, so 7000, 7001, and 7002 are primaries. Let’s crash node 7002 with -the DEBUG SEGFAULT command:

-
$ valkey-cli -p 7002 debug segfault
-Error: Server closed the connection
-

Now we can look at the output of the consistency test to see what it -reported.

-
18849 R (0 err) | 18849 W (0 err) |
-23151 R (0 err) | 23151 W (0 err) |
-27302 R (0 err) | 27302 W (0 err) |
-
-... many error warnings here ...
-
-29659 R (578 err) | 29660 W (577 err) |
-33749 R (578 err) | 33750 W (577 err) |
-37918 R (578 err) | 37919 W (577 err) |
-42077 R (578 err) | 42078 W (577 err) |
-

As you can see during the failover the system was not able to accept -578 reads and 577 writes, however no inconsistency was created in the -database. This may sound unexpected as in the first part of this -tutorial we stated that Valkey Cluster can lose writes during the -failover because it uses asynchronous replication. What we did not say -is that this is not very likely to happen because Valkey sends the reply -to the client, and the commands to replicate to the replicas, about at -the same time, so there is a very small window to lose data. However the -fact that it is hard to trigger does not mean that it is impossible, so -this does not change the consistency guarantees provided by Valkey -cluster.

-

We can now check what is the cluster setup after the failover (note -that in the meantime I restarted the crashed instance so that it rejoins -the cluster as a replica):

-
$ valkey-cli -p 7000 cluster nodes
-3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385503418521 0 connected
-a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385503419023 0 connected
-97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
-3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385503419023 3 connected 11423-16383
-3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385503417005 0 connected 5960-10921
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385503418016 3 connected
-

Now the primaries are running on ports 7000, 7001 and 7005. What was -previously a primary, that is the Valkey instance running on port 7002, -is now a replica of 7005.

-

The output of the CLUSTER NODES command may look -intimidating, but it is actually pretty simple, and is composed of the -following tokens:

-
    -
  • Node ID
  • -
  • ip:port
  • -
  • flags: master, replica, myself, fail, …
  • -
  • if it is a replica, the Node ID of the master
  • -
  • Time of the last pending PING still waiting for a reply.
  • -
  • Time of the last PONG received.
  • -
  • Configuration epoch for this node (see the Cluster -specification).
  • -
  • Status of the link to this node.
  • -
  • Slots served…
  • -
-

Manual failover

-

Sometimes it is useful to force a failover without actually causing -any problem on a primary. For example, to upgrade the Valkey process of -one of the primary nodes it is a good idea to failover it to turn it -into a replica with minimal impact on availability.

-

Manual failovers are supported by Valkey Cluster using the -CLUSTER FAILOVER command, that must be executed in one of -the replicas of the primary you want to failover.

-

Manual failovers are special and are safer compared to failovers -resulting from actual primary failures. They occur in a way that avoids -data loss in the process, by switching clients from the original primary -to the new primary only when the system is sure that the new primary -processed all the replication stream from the old one.

-

This is what you see in the replica log when you perform a manual -failover:

-
# Manual failover user request accepted.
-# Received replication offset for paused primary manual failover: 347540
-# All primary replication stream processed, manual failover can start.
-# Start of election delayed for 0 milliseconds (rank #0, offset 347540).
-# Starting a failover election for epoch 7545.
-# Failover election won: I'm the new primary.
-

Clients sending write commands to the primary are blocked during the -failover. When the primary sends its replication offset to the replica, -the replica waits to reach the offset on its side. When the replication -offset is reached, the failover starts, and the old primary is informed -about the configuration switch. When the switch is complete, the clients -are unblocked on the old primary and they are redirected to the new -primary.

-

Note: To promote a replica to primary, it must first -be known as a replica by a majority of the primaries in the cluster. -Otherwise, it cannot win the failover election. If the replica has just -been added to the cluster (see Add a new node as a replica), -you may need to wait a while before sending the -CLUSTER FAILOVER command, to make sure the primaries in -cluster are aware of the new replica.

-

Add a new node

-

Adding a new node is basically the process of adding an empty node -and then moving some data into it, in case it is a new primary, or -telling it to setup as a replica of a known node, in case it is a -replica.

-

We’ll show both, starting with the addition of a new primary -instance.

-

In both cases the first step to perform is adding an empty -node.

-

This is as simple as to start a new node in port 7006 (we already -used from 7000 to 7005 for our existing 6 nodes) with the same -configuration used for the other nodes, except for the port number, so -what you should do in order to conform with the setup we used for the -previous nodes:

-
    -
  • Create a new tab in your terminal application.
  • -
  • Enter the cluster-test directory.
  • -
  • Create a directory named 7006.
  • -
  • Create a valkey.conf file inside, similar to the one used for the -other nodes but using 7006 as port number.
  • -
  • Finally start the server with -../valkey-server ./valkey.conf
  • -
-

At this point the server should be running.

-

Now we can use valkey-cli as usual in order to add -the node to the existing cluster.

-
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000
-

As you can see I used the add-node command -specifying the address of the new node as first argument, and the -address of a random existing node in the cluster as second argument.

-

In practical terms valkey-cli here did very little to help us, it -just sent a CLUSTER MEET message to the node, something -that is also possible to accomplish manually. However valkey-cli also -checks the state of the cluster before to operate, so it is a good idea -to perform cluster operations always via valkey-cli even when you know -how the internals work.

-

Now we can connect to the new node to see if it really joined the -cluster:

-
valkey 127.0.0.1:7006> cluster nodes
-3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385543178575 0 connected 5960-10921
-3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385543179583 0 connected
-f093c80dde814da99c5cf72a7dd01590792b783b :0 myself,master - 0 0 0 connected
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543178072 3 connected
-a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385543178575 0 connected
-97a3a64667477371c4479320d683e4c8db5858b1 127.0.0.1:7000 master - 0 1385543179080 0 connected 0-5959 10922-11422
-3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385543177568 3 connected 11423-16383
-

Note that since this node is already connected to the cluster it is -already able to redirect client queries correctly and is generally -speaking part of the cluster. However it has two peculiarities compared -to the other primaries:

-
    -
  • It holds no data as it has no assigned hash slots.
  • -
  • Because it is a primary without assigned slots, it does not -participate in the election process when a replica wants to become a -primary.
  • -
-

Now it is possible to assign hash slots to this node using the -resharding feature of valkey-cli. It is basically useless -to show this as we already did in a previous section, there is no -difference, it is just a resharding having as a target the empty -node.

-
Add a new node as a replica
-

Adding a new replica can be performed in two ways. The obvious one is -to use valkey-cli again, but with the –cluster-replica option, like -this:

-
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica
-

Note that the command line here is exactly like the one we used to -add a new primary, so we are not specifying to which primary we want to -add the replica. In this case, what happens is that valkey-cli will add -the new node as replica of a random primary among the primaries with -fewer replicas.

-

However you can specify exactly what primary you want to target with -your new replica with the following command line:

-
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
-

This way we assign the new replica to a specific primary.

-

A more manual way to add a replica to a specific primary is to add -the new node as an empty primary, and then turn it into a replica using -the CLUSTER REPLICATE command. This also works if the node -was added as a replica but you want to move it as a replica of a -different primary.

-

For example in order to add a replica for the node 127.0.0.1:7005 -that is currently serving hash slots in the range 11423-16383, that has -a Node ID 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e, all I need to do is -to connect with the new node (already added as empty primary) and send -the command:

-
valkey 127.0.0.1:7006> cluster replicate 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
-

That’s it. Now we have a new replica for this set of hash slots, and -all the other nodes in the cluster already know (after a few seconds -needed to update their config). We can verify with the following -command:

-
$ valkey-cli -p 7000 cluster nodes | grep slave | grep 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
-f093c80dde814da99c5cf72a7dd01590792b783b 127.0.0.1:7006 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617702 3 connected
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617198 3 connected
-

The node 3c3a0c… now has two replicas, running on ports 7002 (the -existing one) and 7006 (the new one).

-

Remove a node

-

To remove a replica node just use the del-node command -of valkey-cli:

-
valkey-cli --cluster del-node 127.0.0.1:7000 `<node-id>`
-

The first argument is just a random node in the cluster, the second -argument is the ID of the node you want to remove.

-

You can remove a primary node in the same way as well, -however in order to remove a primary node it must be -empty. If the primary is not empty you need to reshard data -away from it to all the other primary nodes before.

-

An alternative to remove a primary node is to perform a manual -failover of it over one of its replicas and remove the node after it -turned into a replica of the new primary. Obviously this does not help -when you want to reduce the actual number of primaries in your cluster, -in that case, a resharding is needed.

-

There is a special scenario where you want to remove a failed node. -You should not use the del-node command because it tries to -connect to all nodes and you will encounter a “connection refused” -error. Instead, you can use the call command:

-
valkey-cli --cluster call 127.0.0.1:7000 cluster forget `<node-id>`
-

This command will execute CLUSTER FORGET command on -every node.

-

Replica migration

-

In Valkey Cluster, you can reconfigure a replica to replicate with a -different primary at any time just using this command:

-
CLUSTER REPLICATE <master-node-id>
-

However there is a special scenario where you want replicas to move -from one primary to another one automatically, without the help of the -system administrator. The automatic reconfiguration of replicas is -called replicas migration and is able to improve the -reliability of a Valkey Cluster.

-

Note: You can read the details of replicas migration -in the Valkey Cluster Specification, here -we’ll only provide some information about the general idea and what you -should do in order to benefit from it.

-

The reason why you may want to let your cluster replicas to move from -one primary to another under certain condition, is that usually the -Valkey Cluster is as resistant to failures as the number of replicas -attached to a given primary.

-

For example a cluster where every primary has a single replica can’t -continue operations if the primary and its replica fail at the same -time, simply because there is no other instance to have a copy of the -hash slots the primary was serving. However while net-splits are likely -to isolate a number of nodes at the same time, many other kind of -failures, like hardware or software failures local to a single node, are -a very notable class of failures that are unlikely to happen at the same -time, so it is possible that in your cluster where every primary has a -replica, the replica is killed at 4am, and the primary is killed at 6am. -This still will result in a cluster that can no longer operate.

-

To improve reliability of the system we have the option to add -additional replicas to every primary, but this is expensive. Replica -migration allows to add more replicas to just a few primaries. So you -have 10 primaries with 1 replica each, for a total of 20 instances. -However you add, for example, 3 instances more as replicas of some of -your primaries, so certain primaries will have more than a single -replica.

-

With replicas migration what happens is that if a primary is left -without replicas, a replica from a primary that has multiple replicas -will migrate to the orphaned primary. So after your replica -goes down at 4am as in the example we made above, another replica will -take its place, and when the primary will fail as well at 5am, there is -still a replica that can be elected so that the cluster can continue to -operate.

-

So what you should know about replicas migration in short?

-
    -
  • The cluster will try to migrate a replica from the primary that has -the greatest number of replicas in a given moment.
  • -
  • To benefit from replica migration you have just to add a few more -replicas to a single primary in your cluster, it does not matter what -primary.
  • -
  • There is a configuration parameter that controls the replica -migration feature that is called cluster-migration-barrier: -you can read more about it in the example valkey.conf file -provided with Valkey Cluster.
  • -
-

Upgrade nodes in a Valkey -Cluster

-

Upgrading replica nodes is easy since you just need to stop the node -and restart it with an updated version of Valkey. If there are clients -scaling reads using replica nodes, they should be able to reconnect to a -different replica if a given one is not available.

-

Upgrading primaries is a bit more complex. The suggested procedure is -to trigger a manual failover to turn the old primary into a replica and -then upgrading it.

-

A complete rolling upgrade of all nodes in a cluster can be performed -by repeating the following procedure for each shard (a primary and its -replicas):

-
    -
  1. Add one or more upgraded nodes as new replicas to the primary. -This step is optional but it ensures that the number of replicas is not -compromised during the rolling upgrade. To add a new node, use CLUSTER MEET and -CLUSTER REPLICATE -or use valkey-cli as described under Add a new node as a replica.

    -

    An alternative is to upgrade one replica at a time and have fewer -replicas online during the upgrade.

  2. -
  3. Upgrade the old replicas you want to keep by restarting them with -the updated version of Valkey. If you’re replacing all the old nodes -with new nodes, you can skip this step.

  4. -
  5. Select one of the upgraded replicas to be the new primary. Wait -until this replica has caught up the replication offset with the -primary. You can use INFO REPLICATION and check -for the line master_link_status:up to be present. This -indicates that the initial sync with the primary is complete.

    -

    After the initial full sync, the replica might still lag behind in -replication. Send INFO REPLICATION to the primary and the -replica and compare the field master_repl_offset returned -by both nodes. If the offsets match, it means that all writes have been -replicated. However, if the primary receives a constant stream of -writes, it’s possible that the offsets will never be equal. In this -step, you can accept a small difference. It’s usually enough to wait for -some seconds to minimize the difference.

  6. -
  7. Check that the new replica is known by all nodes in the cluster, -or at least by the primaries in the cluster. You can send CLUSTER NODES to -each of the nodes in the cluster and check that they all are aware of -the new node. Wait for some time and repeat the check if -necessary.

  8. -
  9. Trigger a manual failover by sending CLUSTER FAILOVER -to the replica node selected to become the new primary. See the Manual failover section in this document for -more information.

  10. -
  11. Wait for the failover to complete. To check, you can use ROLE, INFO REPLICATION (which -indicates role:master after successful failover) or CLUSTER NODES to -verify that the state of the cluster has changed shortly after the -command was sent.

  12. -
  13. Take the old primary (now a replica) out of service, or upgrade -it and add it again as a replica. Remove additional replicas kept for -redundancy during the upgrade, if any.

  14. -
-

Repeat this sequence for each shard (each primary and its replicas) -until all nodes in the cluster have been upgraded.

-

Migrate to Valkey Cluster

-

Users willing to migrate to Valkey Cluster may have just a single -primary, or may already using a preexisting sharding setup, where keys -are split among N nodes, using some in-house algorithm or a sharding -algorithm implemented by their client library or Valkey proxy.

-

In both cases it is possible to migrate to Valkey Cluster easily, -however what is the most important detail is if multiple-keys operations -are used by the application, and how. There are three different -cases:

-
    -
  1. Multiple keys operations, or transactions, or Lua scripts involving -multiple keys, are not used. Keys are accessed independently (even if -accessed via transactions or Lua scripts grouping multiple commands, -about the same key, together).
  2. -
  3. Multiple keys operations, or transactions, or Lua scripts involving -multiple keys are used but only with keys having the same hash -tag, which means that the keys used together all have a -{...} sub-string that happens to be identical. For example -the following multiple keys operation is defined in the context of the -same hash tag: SUNION {user:1000}.foo {user:1000}.bar.
  4. -
  5. Multiple keys operations, or transactions, or Lua scripts involving -multiple keys are used with key names not having an explicit, or the -same, hash tag.
  6. -
-

The third case is not handled by Valkey Cluster: the application -requires to be modified in order to not use multi keys operations or -only use them in the context of the same hash tag.

-

Case 1 and 2 are covered, so we’ll focus on those two cases, that are -handled in the same way, so no distinction will be made in the -documentation.

-

Assuming you have your preexisting data set split into N primaries, -where N=1 if you have no preexisting sharding, the following steps are -needed in order to migrate your data set to Valkey Cluster:

-
    -
  1. Stop your clients. No automatic live-migration to Valkey Cluster is -currently possible. You may be able to do it orchestrating a live -migration in the context of your application / environment.
  2. -
  3. Generate an append only file for all of your N primaries using the -BGREWRITEAOF command, and waiting for the AOF file to be -completely generated.
  4. -
  5. Save your AOF files from aof-1 to aof-N somewhere. At this point you -can stop your old instances if you wish (this is useful since in -non-virtualized deployments you often need to reuse the same -computers).
  6. -
  7. Create a Valkey Cluster composed of N primaries and zero replicas. -You’ll add replicas later. Make sure all your nodes are using the append -only file for persistence.
  8. -
  9. Stop all the cluster nodes, substitute their append only file with -your pre-existing append only files, aof-1 for the first node, aof-2 for -the second node, up to aof-N.
  10. -
  11. Restart your Valkey Cluster nodes with the new AOF files. They’ll -complain that there are keys that should not be there according to their -configuration.
  12. -
  13. Use valkey-cli --cluster fix command in order to fix -the cluster so that keys will be migrated according to the hash slots -each node is authoritative or not.
  14. -
  15. Use valkey-cli --cluster check at the end to make sure -your cluster is ok.
  16. -
  17. Restart your clients modified to use a Valkey Cluster aware client -library.
  18. -
-

There is an alternative way to import data from external instances to -a Valkey Cluster, which is to use the -valkey-cli --cluster import command.

-

The command moves all the keys of a running instance (deleting the -keys from the source instance) to the specified pre-existing Valkey -Cluster.

-

Note: If not for backward compatibility, the Valkey -project no longer uses the words “master” and “slave”. Unfortunately in -this command these words are part of the protocol, so we’ll be able to -remove such occurrences only when this API will be naturally -deprecated.

-

Learn more

- - - diff --git a/_test/cluster-tutorial.html b/_test/cluster-tutorial.html deleted file mode 100644 index 4b48a41a4..000000000 --- a/_test/cluster-tutorial.html +++ /dev/null @@ -1,1284 +0,0 @@ - - - - - - - - Cluster tutorial - - - -
-

Cluster tutorial

-
-

Valkey scales horizontally with a deployment topology called Valkey -Cluster. This topic will teach you how to set up, test, and operate -Valkey Cluster in production. You will learn about the availability and -consistency characteristics of Valkey Cluster from the end user’s point -of view.

-

If you plan to run a production Valkey Cluster deployment or want to -understand better how Valkey Cluster works internally, consult the Valkey Cluster specification.

-

Valkey Cluster 101

-

Valkey Cluster provides a way to run a Valkey installation where data -is automatically sharded across multiple Valkey nodes. Valkey Cluster -also provides some degree of availability during partitions—in practical -terms, the ability to continue operations when some nodes fail or are -unable to communicate. However, the cluster will become unavailable in -the event of larger failures (for example, when the majority of -primaries are unavailable).

-

So, with Valkey Cluster, you get the ability to:

-
    -
  • Automatically split your dataset among multiple nodes.
  • -
  • Continue operations when a subset of the nodes are experiencing -failures or are unable to communicate with the rest of the cluster.
  • -
-

Valkey Cluster TCP ports

-

Every Valkey Cluster node requires two open TCP connections: a Valkey -TCP port used to serve clients, e.g., 6379, and second port known as the -cluster bus port. By default, the cluster bus port is set by -adding 10000 to the data port (e.g., 16379); however, you can override -this in the cluster-port configuration.

-

Cluster bus is a node-to-node communication channel that uses a -binary protocol, which is more suited to exchanging information between -nodes due to little bandwidth and processing time. Nodes use the cluster -bus for failure detection, configuration updates, failover -authorization, and so forth. Clients should never try to communicate -with the cluster bus port, but rather use the Valkey command port. -However, make sure you open both ports in your firewall, otherwise -Valkey cluster nodes won’t be able to communicate.

-

For a Valkey Cluster to work properly you need, for each node:

-
    -
  1. The client communication port (usually 6379) used to communicate -with clients and be open to all the clients that need to reach the -cluster, plus all the other cluster nodes that use the client port for -key migrations.
  2. -
  3. The cluster bus port must be reachable from all the other cluster -nodes.
  4. -
-

If you don’t open both TCP ports, your cluster will not work as -expected.

-

Valkey Cluster and Docker

-

Currently, Valkey Cluster does not support NATted environments and in -general environments where IP addresses or TCP ports are remapped.

-

Docker uses a technique called port mapping: programs -running inside Docker containers may be exposed with a different port -compared to the one the program believes to be using. This is useful for -running multiple containers using the same ports, at the same time, in -the same server.

-

To make Docker compatible with Valkey Cluster, you need to use -Docker’s host networking mode. Please see the ---net=host option in the Docker -documentation for more information.

-

Valkey Cluster data sharding

-

Valkey Cluster does not use consistent hashing, but a different form -of sharding where every key is conceptually part of what we call a -hash slot.

-

There are 16384 hash slots in Valkey Cluster, and to compute the hash -slot for a given key, we simply take the CRC16 of the key modulo -16384.

-

Every node in a Valkey Cluster is responsible for a subset of the -hash slots, so, for example, you may have a cluster with 3 nodes, -where:

-
    -
  • Node A contains hash slots from 0 to 5500.
  • -
  • Node B contains hash slots from 5501 to 11000.
  • -
  • Node C contains hash slots from 11001 to 16383.
  • -
-

This makes it easy to add and remove cluster nodes. For example, if I -want to add a new node D, I need to move some hash slots from nodes A, -B, C to D. Similarly, if I want to remove node A from the cluster, I can -just move the hash slots served by A to B and C. Once node A is empty, I -can remove it from the cluster completely.

-

Moving hash slots from a node to another does not require stopping -any operations; therefore, adding and removing nodes, or changing the -percentage of hash slots held by a node, requires no downtime.

-

Valkey Cluster supports multiple key operations as long as all of the -keys involved in a single command execution (or whole transaction, or -Lua script execution) belong to the same hash slot. The user can force -multiple keys to be part of the same hash slot by using a feature called -hash tags.

-

Hash tags are documented in the Valkey Cluster specification, but the -gist is that if there is a substring between {} brackets in a key, only -what is inside the string is hashed. For example, the keys -user:{123}:profile and user:{123}:account are -guaranteed to be in the same hash slot because they share the same hash -tag. As a result, you can operate on these two keys in the same -multi-key operation.

-

Valkey Cluster -primary-replica model

-

To remain available when a subset of primary nodes are failing or are -not able to communicate with the majority of nodes, Valkey Cluster uses -a primary-replica model where every hash slot has from 1 (the primary -itself) to N replicas (N-1 additional replica nodes).

-

In our example cluster with nodes A, B, C, if node B fails the -cluster is not able to continue, since we no longer have a way to serve -hash slots in the range 5501-11000.

-

However, when the cluster is created (or at a later time), we add a -replica node to every primary, so that the final cluster is composed of -A, B, C that are primary nodes, and A1, B1, C1 that are replica nodes. -This way, the system can continue if node B fails.

-

Node B1 replicates B, and B fails, the cluster will promote node B1 -as the new primary and will continue to operate correctly.

-

However, note that if nodes B and B1 fail at the same time, Valkey -Cluster will not be able to continue to operate.

-

Valkey Cluster -consistency guarantees

-

Valkey Cluster does not guarantee strong -consistency. In practical terms this means that under certain -conditions it is possible that Valkey Cluster will lose writes that were -acknowledged by the system to the client.

-

The first reason why Valkey Cluster can lose writes is because it -uses asynchronous replication. This means that during writes the -following happens:

-
    -
  • Your client writes to the primary B.
  • -
  • The primary B replies OK to your client.
  • -
  • The primary B propagates the write to its replicas B1, B2 and -B3.
  • -
-

As you can see, B does not wait for an acknowledgement from B1, B2, -B3 before replying to the client, since this would be a prohibitive -latency penalty for Valkey, so if your client writes something, B -acknowledges the write, but crashes before being able to send the write -to its replicas, one of the replicas (that did not receive the write) -can be promoted to primary, losing the write forever.

-

This is very similar to what happens with most databases that are -configured to flush data to disk every second, so it is a scenario you -are already able to reason about because of past experiences with -traditional database systems not involving distributed systems. -Similarly you can improve consistency by forcing the database to flush -data to disk before replying to the client, but this usually results in -prohibitively low performance. That would be the equivalent of -synchronous replication in the case of Valkey Cluster.

-

Basically, there is a trade-off to be made between performance and -consistency.

-

Valkey Cluster has support for synchronous writes when absolutely -needed, implemented via the WAIT command. This makes losing -writes a lot less likely. However, note that Valkey Cluster does not -implement strong consistency even when synchronous replication is used: -it is always possible, under more complex failure scenarios, that a -replica that was not able to receive the write will be elected as -primary.

-

There is another notable scenario where Valkey Cluster will lose -writes, that happens during a network partition where a client is -isolated with a minority of instances including at least a primary.

-

Take as an example our 6 nodes cluster composed of A, B, C, A1, B1, -C1, with 3 primaries and 3 replicas. There is also a client, that we -will call Z1.

-

After a partition occurs, it is possible that in one side of the -partition we have A, C, A1, B1, C1, and in the other side we have B and -Z1.

-

Z1 is still able to write to B, which will accept its writes. If the -partition heals in a very short time, the cluster will continue -normally. However, if the partition lasts enough time for B1 to be -promoted to primary on the majority side of the partition, the writes -that Z1 has sent to B in the meantime will be lost.

-

Note: There is a maximum window to -the amount of writes Z1 will be able to send to B: if enough time has -elapsed for the majority side of the partition to elect a replica as -primary, every primary node in the minority side will have stopped -accepting writes.

-

This amount of time is a very important configuration directive of -Valkey Cluster, and is called the node timeout.

-

After node timeout has elapsed, a primary node is considered to be -failing, and can be replaced by one of its replicas. Similarly, after -node timeout has elapsed without a primary node to be able to sense the -majority of the other primary nodes, it enters an error state and stops -accepting writes.

-

Valkey Cluster -configuration parameters

-

We are about to create an example cluster deployment. Before we -continue, let’s introduce the configuration parameters that Valkey -Cluster introduces in the valkey.conf file.

-
    -
  • cluster-enabled <yes/no>: If -yes, enables Valkey Cluster support in a specific Valkey instance. -Otherwise the instance starts as a standalone instance as usual.
  • -
  • cluster-config-file <filename>: -Note that despite the name of this option, this is not a user editable -configuration file, but the file where a Valkey Cluster node -automatically persists the cluster configuration (the state, basically) -every time there is a change, in order to be able to re-read it at -startup. The file lists things like the other nodes in the cluster, -their state, persistent variables, and so forth. Often this file is -rewritten and flushed on disk as a result of some message -reception.
  • -
  • cluster-node-timeout -<milliseconds>: The maximum amount of time a -Valkey Cluster node can be unavailable, without it being considered as -failing. If a primary node is not reachable for more than the specified -amount of time, it will be failed over by its replicas. This parameter -controls other important things in Valkey Cluster. Notably, every node -that can’t reach the majority of primary nodes for the specified amount -of time, will stop accepting queries.
  • -
  • cluster-replica-validity-factor -<factor>: If set to zero, a replica will -always consider itself valid, and will therefore always try to failover -a primary, regardless of the amount of time the link between the primary -and the replica remained disconnected. If the value is positive, a -maximum disconnection time is calculated as the node timeout -value multiplied by the factor provided with this option, and if the -node is a replica, it will not try to start a failover if the primary -link was disconnected for more than the specified amount of time. For -example, if the node timeout is set to 5 seconds and the validity factor -is set to 10, a replica disconnected from the primary for more than 50 -seconds will not try to failover its primary. Note that any value -different than zero may result in Valkey Cluster being unavailable after -a primary failure if there is no replica that is able to failover it. In -that case the cluster will return to being available only when the -original primary rejoins the cluster.
  • -
  • cluster-migration-barrier -<count>: Minimum number of replicas a -primary will remain connected with, for another replica to migrate to a -primary which is no longer covered by any replica. See the appropriate -section about replica migration in this tutorial for more -information.
  • -
  • cluster-require-full-coverage -<yes/no>: If this is set to yes, as it is by -default, the cluster stops accepting writes if some percentage of the -key space is not covered by any node. If the option is set to no, the -cluster will still serve queries even if only requests about a subset of -keys can be processed.
  • -
  • cluster-allow-reads-when-down -<yes/no>: If this is set to no, as it is by -default, a node in a Valkey Cluster will stop serving all traffic when -the cluster is marked as failed, either when a node can’t reach a quorum -of primaries or when full coverage is not met. This prevents reading -potentially inconsistent data from a node that is unaware of changes in -the cluster. This option can be set to yes to allow reads from a node -during the fail state, which is useful for applications that want to -prioritize read availability but still want to prevent inconsistent -writes. It can also be used for when using Valkey Cluster with only one -or two shards, as it allows the nodes to continue serving writes when a -primary fails but automatic failover is impossible.
  • -
-

Create and use a Valkey -Cluster

-

To create and use a Valkey Cluster, follow these steps:

- -

But, first, familiarize yourself with the requirements for creating a -cluster.

-

Requirements to create -a Valkey Cluster

-

To create a cluster, the first thing you need is to have a few empty -Valkey instances running in cluster mode.

-

At minimum, set the following directives in the -valkey.conf file:

-
port 7000
-cluster-enabled yes
-cluster-config-file nodes.conf
-cluster-node-timeout 5000
-appendonly yes
-

To enable cluster mode, set the cluster-enabled -directive to yes. Every instance also contains the path of -a file where the configuration for this node is stored, which by default -is nodes.conf. This file is never touched by humans; it is -simply generated at startup by the Valkey Cluster instances, and updated -every time it is needed.

-

Note that the minimal cluster that works as expected -must contain at least three primary nodes. For deployment, we strongly -recommend a six-node cluster, with three primaries and three -replicas.

-

You can test this locally by creating the following directories named -after the port number of the instance you’ll run inside any given -directory.

-

For example:

-
mkdir cluster-test
-cd cluster-test
-mkdir 7000 7001 7002 7003 7004 7005
-

Create a valkey.conf file inside each of the -directories, from 7000 to 7005. As a template for your configuration -file just use the small example above, but make sure to replace the port -number 7000 with the right port number according to the -directory name.

-

You can start each instance as follows, each running in a separate -terminal tab:

-
cd 7000
-valkey-server ./valkey.conf
-

You’ll see from the logs that every node assigns itself a new ID:

-
[82462] 26 Nov 11:56:55.329 * No cluster configuration found, I'm 97a3a64667477371c4479320d683e4c8db5858b1
-

This ID will be used forever by this specific instance in order for -the instance to have a unique name in the context of the cluster. Every -node remembers every other node using this IDs, and not by IP or port. -IP addresses and ports may change, but the unique node identifier will -never change for all the life of the node. We call this identifier -simply Node ID.

-

Create a Valkey Cluster

-

Now that we have a number of instances running, you need to create -your cluster by writing some meaningful configuration to the nodes.

-

You can configure and execute individual instances manually or use -the create-cluster script. Let’s go over how you do it manually.

-

To create the cluster, run:

-
valkey-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \
-127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
---cluster-replicas 1
-

The command used here is create, since we want to -create a new cluster. The option --cluster-replicas 1 means -that we want a replica for every primary created.

-

The other arguments are the list of addresses of the instances I want -to use to create the new cluster.

-

valkey-cli will propose a configuration. Accept the -proposed configuration by typing yes. The cluster will -be configured and joined, which means that instances will be -bootstrapped into talking with each other. Finally, if everything has -gone well, you’ll see a message like this:

-
[OK] All 16384 slots covered
-

This means that there is at least one primary instance serving each -of the 16384 available slots.

-

If you don’t want to create a Valkey Cluster by configuring and -executing individual instances manually as explained above, there is a -much simpler system (but you’ll not learn the same amount of operational -details).

-

Find the utils/create-cluster directory in the Valkey -distribution. There is a script called create-cluster -inside (same name as the directory it is contained into), it’s a simple -bash script. In order to start a 6 nodes cluster with 3 primaries and 3 -replicas just type the following commands:

-
    -
  1. create-cluster start
  2. -
  3. create-cluster create
  4. -
-

Reply to yes in step 2 when the valkey-cli -utility wants you to accept the cluster layout.

-

You can now interact with the cluster, the first node will start at -port 30001 by default. When you are done, stop the cluster with:

-
    -
  1. create-cluster stop
  2. -
-

Please read the README inside this directory for more -information on how to run the script.

-

Interact with the cluster

-

To connect to Valkey Cluster, you’ll need a cluster-aware Valkey -client. See the documentation for your client of -choice to determine its cluster support.

-

You can also test your Valkey Cluster using the -valkey-cli command line utility:

-
$ valkey-cli -c -p 7000
-127.0.0.1:7000> set foo bar
--> Redirected to slot [12182] located at 127.0.0.1:7002
-OK
-127.0.0.1:7002> set hello world
--> Redirected to slot [866] located at 127.0.0.1:7000
-OK
-127.0.0.1:7000> get foo
--> Redirected to slot [12182] located at 127.0.0.1:7002
-"bar"
-127.0.0.1:7002> get hello
--> Redirected to slot [866] located at 127.0.0.1:7000
-"world"
-

Note: If you created the cluster using the script, -your nodes may listen on different ports, starting from 30001 by -default.

-

The valkey-cli cluster support is very basic, so it -always uses the fact that Valkey Cluster nodes are able to redirect a -client to the right node. A serious client is able to do better than -that, and cache the map between hash slots and nodes addresses, to -directly use the right connection to the right node. The map is -refreshed only when something changed in the cluster configuration, for -example after a failover or after the system administrator changed the -cluster layout by adding or removing nodes.

-

Write an example app -with Valkey GLIDE

-

Before going forward showing how to operate the Valkey Cluster, doing -things like a failover, or a resharding, we need to create some example -application or at least to be able to understand the semantics of a -simple Valkey Cluster client interaction.

-

In this way we can run an example and at the same time try to make -nodes failing, or start a resharding, to see how Valkey Cluster behaves -under real world conditions. It is not very helpful to see what happens -while nobody is writing to the cluster.

-

This section explains some basic usage of Valkey -GLIDE for Node.js, the official Valkey client library, showing a -simple example application.

-

The following example demonstrates how to connect to a Valkey cluster -and perform basic operations. First, install the Valkey GLIDE -client:

-
npm install @valkey/valkey-glide
-

Here’s the example code:

-
import { GlideClusterClient } from "@valkey/valkey-glide";
-
-async function runExample() {
-    const addresses = [
-        {
-            host: "localhost",
-            port: 6379,
-        },
-    ];
-    // Check `GlideClientConfiguration/GlideClusterClientConfiguration` for additional options.
-    const client = await GlideClusterClient.createClient({
-        addresses: addresses,
-        // if the cluster nodes use TLS, you'll need to enable it. Otherwise the connection attempt will time out silently.
-        // useTLS: true,
-        // It is recommended to set a timeout for your specific use case
-        requestTimeout: 500, // 500ms timeout
-        clientName: "test_cluster_client",
-    });
-
-    try {
-
-        console.log("Connected to Valkey cluster");
-
-        // Get the last counter value, or start from 0
-        let last = await client.get("__last__");
-        last = last ? parseInt(last) : 0;
-
-        console.log(`Starting from counter: ${last}`);
-
-        // Write keys in batches using mset for better performance
-        const batchSize = 100;
-        for (let start = last + 1; start <= 1000000000; start += batchSize) {
-            try {
-                const keyValuePairs = [];
-                const end = Math.min(start + batchSize - 1, 1000000000);
-                
-                // Prepare batch of key-value pairs as array
-                for (let x = start; x <= end; x++) {
-                    keyValuePairs.push(`foo${x}`, x.toString());
-                }
-                
-                // Execute batch mset with array format
-                await client.mset(keyValuePairs);
-                
-                // Update counter and display progress
-                await client.set("__last__", end.toString());
-                console.log(`Batch completed: ${start} to ${end}`);
-                
-                // Verify a sample key from the batch
-                const sampleKey = `foo${start}`;
-                const value = await client.get(sampleKey);
-                console.log(`Sample verification - ${sampleKey}: ${value}`);
-                
-            } catch (error) {
-                console.log(`Error in batch starting at ${start}: ${error.message}`);
-            }
-        }
-    } catch (error) {
-        console.log(`Connection error: ${error.message}`);
-    } finally {
-        client.close();
-    }
-}
-
-runExample().catch(console.error);
-

The application does a very simple thing, it sets keys in the form -foo<number> to number, using batched -MSET operations for better performance. The MSET command accepts an -array of alternating keys and values. So if you run the program the -result is batches of MSET commands:

-
    -
  • MSET foo1 1 foo2 2 foo3 3 … foo100 100 (batch of 100 keys)
  • -
  • MSET foo101 101 foo102 102 … foo200 200 (next batch)
  • -
  • And so forth…
  • -
-

The program includes comprehensive error handling to display errors -instead of crashing, so all cluster operations are wrapped in try-catch -blocks.

-

The client creation section is the first key part of -the program. It creates the Valkey cluster client using a list of -cluster addresses and configuration options including a request -timeout and client name.

-

The addresses don’t need to be all the nodes of the cluster. The -important thing is that at least one node is reachable. Valkey GLIDE -automatically discovers the complete cluster topology once it connects -to any node.

-

Now that we have the cluster client instance, we can use it like any -other Valkey client to perform operations across the cluster.

-

The counter initialization section reads a counter -so that when we restart the example we don’t start again with -foo0, but continue from where we left off. The counter is -stored in Valkey itself using the key __last__.

-

The main processing loop sets keys in batches using -MSET operations for better performance, processing 100 keys at a time -and displaying progress or any errors that occur. you’ll get the usually -10k ops/second in the best of the conditions).

-

you’ll get optimal performance).

-

Starting the application produces the following output:

-
node example.js
-Connected to Valkey cluster
-Starting from counter: 0
-Batch completed: 1 to 100
-Sample verification - foo1: 1
-Batch completed: 101 to 200
-Sample verification - foo101: 101
-Batch completed: 201 to 300
-Sample verification - foo201: 201
-^C (I stopped the program here)
-

This is not a very interesting program and we’ll use a better one in -a moment but we can already see what happens during a resharding when -the program is running.

-

Reshard the cluster

-

Now we are ready to try a cluster resharding. To do this, please keep -the example.js program running, so that you can see if there is some -impact on the program running.

-

Resharding basically means to move hash slots from a set of nodes to -another set of nodes. Like cluster creation, it is accomplished using -the valkey-cli utility.

-

To start a resharding, just type:

-
valkey-cli --cluster reshard 127.0.0.1:7000
-

You only need to specify a single node, valkey-cli will find the -other nodes automatically.

-

Currently valkey-cli is only able to reshard with the administrator -support, you can’t just say move 5% of slots from this node to the other -one (but this is pretty trivial to implement). So it starts with -questions. The first is how much of a resharding do you want to do:

-
How many slots do you want to move (from 1 to 16384)?
-

We can try to reshard 1000 hash slots, that should already contain a -non trivial amount of keys if the example is still running without the -sleep call.

-

Then valkey-cli needs to know what is the target of the resharding, -that is, the node that will receive the hash slots. I’ll use the first -primary node, that is, 127.0.0.1:7000, but I need to specify the Node ID -of the instance. This was already printed in a list by valkey-cli, but I -can always find the ID of a node with the following command if I -need:

-
$ valkey-cli -p 7000 cluster nodes | grep myself
-97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5460
-

Ok so my target node is 97a3a64667477371c4479320d683e4c8db5858b1.

-

Now you’ll get asked from what nodes you want to take those keys. -I’ll just type all in order to take a bit of hash slots -from all the other primary nodes.

-

After the final confirmation you’ll see a message for every slot that -valkey-cli is going to move from a node to another, and a dot will be -printed for every actual key moved from one side to the other.

-

While the resharding is in progress you should be able to see your -example program running unaffected. You can stop and restart it multiple -times during the resharding if you want.

-

At the end of the resharding, you can test the health of the cluster -with the following command:

-
valkey-cli --cluster check 127.0.0.1:7000
-

All the slots will be covered as usual, but this time the primary at -127.0.0.1:7000 will have more hash slots, something around 6461.

-

Resharding can be performed automatically without the need to -manually enter the parameters in an interactive way. This is possible -using a command line like the following:

-
valkey-cli --cluster reshard <host>:<port> --cluster-from <node-id> --cluster-to <node-id> --cluster-slots <number of slots> --cluster-yes
-

This allows to build some automatism if you are likely to reshard -often, however currently there is no way for valkey-cli to -automatically rebalance the cluster checking the distribution of keys -across the cluster nodes and intelligently moving slots as needed. This -feature will be added in the future.

-

The --cluster-yes option instructs the cluster manager -to automatically answer “yes” to the command’s prompts, allowing it to -run in a non-interactive mode. Note that this option can also be -activated by setting the REDISCLI_CLUSTER_YES environment -variable.

-

A more interesting -example application

-

The example application we wrote early is not very good. It writes to -the cluster in a simple way without even checking if what was written is -the right thing.

-

From our point of view the cluster receiving the writes could just -always write the key foo to 42 to every -operation, and we would not notice at all.

-

Now we can write a more interesting application for testing cluster -behavior. A simple consistency checking application that uses a set of -counters, by default 1000, and sends INCR commands to -increment the counters.

-

However instead of just writing, the application does two additional -things:

-
    -
  • When a counter is updated using INCR, the application -remembers the write.
  • -
  • It also reads a random counter before every write, and check if the -value is what we expected it to be, comparing it with the value it has -in memory.
  • -
-

What this means is that this application is a simple -consistency checker, and is able to tell you if the -cluster lost some write, or if it accepted a write that we did not -receive acknowledgment for. In the first case we’ll see a counter having -a value that is smaller than the one we remember, while in the second -case the value will be greater.

-

Running a consistency testing application produces a line of output -every second:

-
node consistency-test.js
-925 R (0 err) | 925 W (0 err) |
-5030 R (0 err) | 5030 W (0 err) |
-9261 R (0 err) | 9261 W (0 err) |
-13517 R (0 err) | 13517 W (0 err) |
-17780 R (0 err) | 17780 W (0 err) |
-22025 R (0 err) | 22025 W (0 err) |
-25818 R (0 err) | 25818 W (0 err) |
-

The line shows the number of Reads and -Writes performed, and the number of errors (query not -accepted because of errors since the system was not available).

-

If some inconsistency is found, new lines are added to the output. -This is what happens, for example, if I reset a counter manually while -the program is running:

-
$ valkey-cli -h 127.0.0.1 -p 7000 set key_217 0
-OK
-
-(in the other tab I see...)
-
-94774 R (0 err) | 94774 W (0 err) |
-98821 R (0 err) | 98821 W (0 err) |
-102886 R (0 err) | 102886 W (0 err) | 114 lost |
-107046 R (0 err) | 107046 W (0 err) | 114 lost |
-

When I set the counter to 0 the real value was 114, so the program -reports 114 lost writes (INCR commands that are not -remembered by the cluster).

-

This program is much more interesting as a test case, so we’ll use it -to test the Valkey Cluster failover.

-

Test the failover

-

To trigger the failover, the simplest thing we can do (that is also -the semantically simplest failure that can occur in a distributed -system) is to crash a single process, in our case a single primary.

-

Note: During this test, you should take a tab open -with the consistency test application running.

-

We can identify a primary and crash it with the following -command:

-
$ valkey-cli -p 7000 cluster nodes | grep master
-3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385482984082 0 connected 5960-10921
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 master - 0 1385482983582 0 connected 11423-16383
-97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
-

Ok, so 7000, 7001, and 7002 are primaries. Let’s crash node 7002 with -the DEBUG SEGFAULT command:

-
$ valkey-cli -p 7002 debug segfault
-Error: Server closed the connection
-

Now we can look at the output of the consistency test to see what it -reported.

-
18849 R (0 err) | 18849 W (0 err) |
-23151 R (0 err) | 23151 W (0 err) |
-27302 R (0 err) | 27302 W (0 err) |
-
-... many error warnings here ...
-
-29659 R (578 err) | 29660 W (577 err) |
-33749 R (578 err) | 33750 W (577 err) |
-37918 R (578 err) | 37919 W (577 err) |
-42077 R (578 err) | 42078 W (577 err) |
-

As you can see during the failover the system was not able to accept -578 reads and 577 writes, however no inconsistency was created in the -database. This may sound unexpected as in the first part of this -tutorial we stated that Valkey Cluster can lose writes during the -failover because it uses asynchronous replication. What we did not say -is that this is not very likely to happen because Valkey sends the reply -to the client, and the commands to replicate to the replicas, about at -the same time, so there is a very small window to lose data. However the -fact that it is hard to trigger does not mean that it is impossible, so -this does not change the consistency guarantees provided by Valkey -cluster.

-

We can now check what is the cluster setup after the failover (note -that in the meantime I restarted the crashed instance so that it rejoins -the cluster as a replica):

-
$ valkey-cli -p 7000 cluster nodes
-3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385503418521 0 connected
-a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385503419023 0 connected
-97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
-3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385503419023 3 connected 11423-16383
-3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385503417005 0 connected 5960-10921
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385503418016 3 connected
-

Now the primaries are running on ports 7000, 7001 and 7005. What was -previously a primary, that is the Valkey instance running on port 7002, -is now a replica of 7005.

-

The output of the CLUSTER NODES command may look -intimidating, but it is actually pretty simple, and is composed of the -following tokens:

-
    -
  • Node ID
  • -
  • ip:port
  • -
  • flags: master, replica, myself, fail, …
  • -
  • if it is a replica, the Node ID of the master
  • -
  • Time of the last pending PING still waiting for a reply.
  • -
  • Time of the last PONG received.
  • -
  • Configuration epoch for this node (see the Cluster -specification).
  • -
  • Status of the link to this node.
  • -
  • Slots served…
  • -
-

Manual failover

-

Sometimes it is useful to force a failover without actually causing -any problem on a primary. For example, to upgrade the Valkey process of -one of the primary nodes it is a good idea to failover it to turn it -into a replica with minimal impact on availability.

-

Manual failovers are supported by Valkey Cluster using the -CLUSTER FAILOVER command, that must be executed in one of -the replicas of the primary you want to failover.

-

Manual failovers are special and are safer compared to failovers -resulting from actual primary failures. They occur in a way that avoids -data loss in the process, by switching clients from the original primary -to the new primary only when the system is sure that the new primary -processed all the replication stream from the old one.

-

This is what you see in the replica log when you perform a manual -failover:

-
# Manual failover user request accepted.
-# Received replication offset for paused primary manual failover: 347540
-# All primary replication stream processed, manual failover can start.
-# Start of election delayed for 0 milliseconds (rank #0, offset 347540).
-# Starting a failover election for epoch 7545.
-# Failover election won: I'm the new primary.
-

Clients sending write commands to the primary are blocked during the -failover. When the primary sends its replication offset to the replica, -the replica waits to reach the offset on its side. When the replication -offset is reached, the failover starts, and the old primary is informed -about the configuration switch. When the switch is complete, the clients -are unblocked on the old primary and they are redirected to the new -primary.

-

Note: To promote a replica to primary, it must first -be known as a replica by a majority of the primaries in the cluster. -Otherwise, it cannot win the failover election. If the replica has just -been added to the cluster (see Add a new node as a replica), -you may need to wait a while before sending the -CLUSTER FAILOVER command, to make sure the primaries in -cluster are aware of the new replica.

-

Add a new node

-

Adding a new node is basically the process of adding an empty node -and then moving some data into it, in case it is a new primary, or -telling it to setup as a replica of a known node, in case it is a -replica.

-

We’ll show both, starting with the addition of a new primary -instance.

-

In both cases the first step to perform is adding an empty -node.

-

This is as simple as to start a new node in port 7006 (we already -used from 7000 to 7005 for our existing 6 nodes) with the same -configuration used for the other nodes, except for the port number, so -what you should do in order to conform with the setup we used for the -previous nodes:

-
    -
  • Create a new tab in your terminal application.
  • -
  • Enter the cluster-test directory.
  • -
  • Create a directory named 7006.
  • -
  • Create a valkey.conf file inside, similar to the one used for the -other nodes but using 7006 as port number.
  • -
  • Finally start the server with -../valkey-server ./valkey.conf
  • -
-

At this point the server should be running.

-

Now we can use valkey-cli as usual in order to add -the node to the existing cluster.

-
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000
-

As you can see I used the add-node command -specifying the address of the new node as first argument, and the -address of a random existing node in the cluster as second argument.

-

In practical terms valkey-cli here did very little to help us, it -just sent a CLUSTER MEET message to the node, something -that is also possible to accomplish manually. However valkey-cli also -checks the state of the cluster before to operate, so it is a good idea -to perform cluster operations always via valkey-cli even when you know -how the internals work.

-

Now we can connect to the new node to see if it really joined the -cluster:

-
valkey 127.0.0.1:7006> cluster nodes
-3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385543178575 0 connected 5960-10921
-3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385543179583 0 connected
-f093c80dde814da99c5cf72a7dd01590792b783b :0 myself,master - 0 0 0 connected
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543178072 3 connected
-a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385543178575 0 connected
-97a3a64667477371c4479320d683e4c8db5858b1 127.0.0.1:7000 master - 0 1385543179080 0 connected 0-5959 10922-11422
-3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385543177568 3 connected 11423-16383
-

Note that since this node is already connected to the cluster it is -already able to redirect client queries correctly and is generally -speaking part of the cluster. However it has two peculiarities compared -to the other primaries:

-
    -
  • It holds no data as it has no assigned hash slots.
  • -
  • Because it is a primary without assigned slots, it does not -participate in the election process when a replica wants to become a -primary.
  • -
-

Now it is possible to assign hash slots to this node using the -resharding feature of valkey-cli. It is basically useless -to show this as we already did in a previous section, there is no -difference, it is just a resharding having as a target the empty -node.

-
Add a new node as a replica
-

Adding a new replica can be performed in two ways. The obvious one is -to use valkey-cli again, but with the –cluster-replica option, like -this:

-
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica
-

Note that the command line here is exactly like the one we used to -add a new primary, so we are not specifying to which primary we want to -add the replica. In this case, what happens is that valkey-cli will add -the new node as replica of a random primary among the primaries with -fewer replicas.

-

However you can specify exactly what primary you want to target with -your new replica with the following command line:

-
valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
-

This way we assign the new replica to a specific primary.

-

A more manual way to add a replica to a specific primary is to add -the new node as an empty primary, and then turn it into a replica using -the CLUSTER REPLICATE command. This also works if the node -was added as a replica but you want to move it as a replica of a -different primary.

-

For example in order to add a replica for the node 127.0.0.1:7005 -that is currently serving hash slots in the range 11423-16383, that has -a Node ID 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e, all I need to do is -to connect with the new node (already added as empty primary) and send -the command:

-
valkey 127.0.0.1:7006> cluster replicate 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
-

That’s it. Now we have a new replica for this set of hash slots, and -all the other nodes in the cluster already know (after a few seconds -needed to update their config). We can verify with the following -command:

-
$ valkey-cli -p 7000 cluster nodes | grep slave | grep 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
-f093c80dde814da99c5cf72a7dd01590792b783b 127.0.0.1:7006 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617702 3 connected
-2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 replica 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617198 3 connected
-

The node 3c3a0c… now has two replicas, running on ports 7002 (the -existing one) and 7006 (the new one).

-

Remove a node

-

To remove a replica node just use the del-node command -of valkey-cli:

-
valkey-cli --cluster del-node 127.0.0.1:7000 `<node-id>`
-

The first argument is just a random node in the cluster, the second -argument is the ID of the node you want to remove.

-

You can remove a primary node in the same way as well, -however in order to remove a primary node it must be -empty. If the primary is not empty you need to reshard data -away from it to all the other primary nodes before.

-

An alternative to remove a primary node is to perform a manual -failover of it over one of its replicas and remove the node after it -turned into a replica of the new primary. Obviously this does not help -when you want to reduce the actual number of primaries in your cluster, -in that case, a resharding is needed.

-

There is a special scenario where you want to remove a failed node. -You should not use the del-node command because it tries to -connect to all nodes and you will encounter a “connection refused” -error. Instead, you can use the call command:

-
valkey-cli --cluster call 127.0.0.1:7000 cluster forget `<node-id>`
-

This command will execute CLUSTER FORGET command on -every node.

-

Replica migration

-

In Valkey Cluster, you can reconfigure a replica to replicate with a -different primary at any time just using this command:

-
CLUSTER REPLICATE <master-node-id>
-

However there is a special scenario where you want replicas to move -from one primary to another one automatically, without the help of the -system administrator. The automatic reconfiguration of replicas is -called replicas migration and is able to improve the -reliability of a Valkey Cluster.

-

Note: You can read the details of replicas migration -in the Valkey Cluster Specification, here -we’ll only provide some information about the general idea and what you -should do in order to benefit from it.

-

The reason why you may want to let your cluster replicas to move from -one primary to another under certain condition, is that usually the -Valkey Cluster is as resistant to failures as the number of replicas -attached to a given primary.

-

For example a cluster where every primary has a single replica can’t -continue operations if the primary and its replica fail at the same -time, simply because there is no other instance to have a copy of the -hash slots the primary was serving. However while net-splits are likely -to isolate a number of nodes at the same time, many other kind of -failures, like hardware or software failures local to a single node, are -a very notable class of failures that are unlikely to happen at the same -time, so it is possible that in your cluster where every primary has a -replica, the replica is killed at 4am, and the primary is killed at 6am. -This still will result in a cluster that can no longer operate.

-

To improve reliability of the system we have the option to add -additional replicas to every primary, but this is expensive. Replica -migration allows to add more replicas to just a few primaries. So you -have 10 primaries with 1 replica each, for a total of 20 instances. -However you add, for example, 3 instances more as replicas of some of -your primaries, so certain primaries will have more than a single -replica.

-

With replicas migration what happens is that if a primary is left -without replicas, a replica from a primary that has multiple replicas -will migrate to the orphaned primary. So after your replica -goes down at 4am as in the example we made above, another replica will -take its place, and when the primary will fail as well at 5am, there is -still a replica that can be elected so that the cluster can continue to -operate.

-

So what you should know about replicas migration in short?

-
    -
  • The cluster will try to migrate a replica from the primary that has -the greatest number of replicas in a given moment.
  • -
  • To benefit from replica migration you have just to add a few more -replicas to a single primary in your cluster, it does not matter what -primary.
  • -
  • There is a configuration parameter that controls the replica -migration feature that is called cluster-migration-barrier: -you can read more about it in the example valkey.conf file -provided with Valkey Cluster.
  • -
-

Upgrade nodes in a Valkey -Cluster

-

Upgrading replica nodes is easy since you just need to stop the node -and restart it with an updated version of Valkey. If there are clients -scaling reads using replica nodes, they should be able to reconnect to a -different replica if a given one is not available.

-

Upgrading primaries is a bit more complex. The suggested procedure is -to trigger a manual failover to turn the old primary into a replica and -then upgrading it.

-

A complete rolling upgrade of all nodes in a cluster can be performed -by repeating the following procedure for each shard (a primary and its -replicas):

-
    -
  1. Add one or more upgraded nodes as new replicas to the primary. -This step is optional but it ensures that the number of replicas is not -compromised during the rolling upgrade. To add a new node, use CLUSTER MEET and -CLUSTER REPLICATE -or use valkey-cli as described under Add a new node as a replica.

    -

    An alternative is to upgrade one replica at a time and have fewer -replicas online during the upgrade.

  2. -
  3. Upgrade the old replicas you want to keep by restarting them with -the updated version of Valkey. If you’re replacing all the old nodes -with new nodes, you can skip this step.

  4. -
  5. Select one of the upgraded replicas to be the new primary. Wait -until this replica has caught up the replication offset with the -primary. You can use INFO REPLICATION and check -for the line master_link_status:up to be present. This -indicates that the initial sync with the primary is complete.

    -

    After the initial full sync, the replica might still lag behind in -replication. Send INFO REPLICATION to the primary and the -replica and compare the field master_repl_offset returned -by both nodes. If the offsets match, it means that all writes have been -replicated. However, if the primary receives a constant stream of -writes, it’s possible that the offsets will never be equal. In this -step, you can accept a small difference. It’s usually enough to wait for -some seconds to minimize the difference.

  6. -
  7. Check that the new replica is known by all nodes in the cluster, -or at least by the primaries in the cluster. You can send CLUSTER NODES to -each of the nodes in the cluster and check that they all are aware of -the new node. Wait for some time and repeat the check if -necessary.

  8. -
  9. Trigger a manual failover by sending CLUSTER FAILOVER -to the replica node selected to become the new primary. See the Manual failover section in this document for -more information.

  10. -
  11. Wait for the failover to complete. To check, you can use ROLE, INFO REPLICATION (which -indicates role:master after successful failover) or CLUSTER NODES to -verify that the state of the cluster has changed shortly after the -command was sent.

  12. -
  13. Take the old primary (now a replica) out of service, or upgrade -it and add it again as a replica. Remove additional replicas kept for -redundancy during the upgrade, if any.

  14. -
-

Repeat this sequence for each shard (each primary and its replicas) -until all nodes in the cluster have been upgraded.

-

Migrate to Valkey Cluster

-

Users willing to migrate to Valkey Cluster may have just a single -primary, or may already using a preexisting sharding setup, where keys -are split among N nodes, using some in-house algorithm or a sharding -algorithm implemented by their client library or Valkey proxy.

-

In both cases it is possible to migrate to Valkey Cluster easily, -however what is the most important detail is if multiple-keys operations -are used by the application, and how. There are three different -cases:

-
    -
  1. Multiple keys operations, or transactions, or Lua scripts involving -multiple keys, are not used. Keys are accessed independently (even if -accessed via transactions or Lua scripts grouping multiple commands, -about the same key, together).
  2. -
  3. Multiple keys operations, or transactions, or Lua scripts involving -multiple keys are used but only with keys having the same hash -tag, which means that the keys used together all have a -{...} sub-string that happens to be identical. For example -the following multiple keys operation is defined in the context of the -same hash tag: SUNION {user:1000}.foo {user:1000}.bar.
  4. -
  5. Multiple keys operations, or transactions, or Lua scripts involving -multiple keys are used with key names not having an explicit, or the -same, hash tag.
  6. -
-

The third case is not handled by Valkey Cluster: the application -requires to be modified in order to not use multi keys operations or -only use them in the context of the same hash tag.

-

Case 1 and 2 are covered, so we’ll focus on those two cases, that are -handled in the same way, so no distinction will be made in the -documentation.

-

Assuming you have your preexisting data set split into N primaries, -where N=1 if you have no preexisting sharding, the following steps are -needed in order to migrate your data set to Valkey Cluster:

-
    -
  1. Stop your clients. No automatic live-migration to Valkey Cluster is -currently possible. You may be able to do it orchestrating a live -migration in the context of your application / environment.
  2. -
  3. Generate an append only file for all of your N primaries using the -BGREWRITEAOF command, and waiting for the AOF file to be -completely generated.
  4. -
  5. Save your AOF files from aof-1 to aof-N somewhere. At this point you -can stop your old instances if you wish (this is useful since in -non-virtualized deployments you often need to reuse the same -computers).
  6. -
  7. Create a Valkey Cluster composed of N primaries and zero replicas. -You’ll add replicas later. Make sure all your nodes are using the append -only file for persistence.
  8. -
  9. Stop all the cluster nodes, substitute their append only file with -your pre-existing append only files, aof-1 for the first node, aof-2 for -the second node, up to aof-N.
  10. -
  11. Restart your Valkey Cluster nodes with the new AOF files. They’ll -complain that there are keys that should not be there according to their -configuration.
  12. -
  13. Use valkey-cli --cluster fix command in order to fix -the cluster so that keys will be migrated according to the hash slots -each node is authoritative or not.
  14. -
  15. Use valkey-cli --cluster check at the end to make sure -your cluster is ok.
  16. -
  17. Restart your clients modified to use a Valkey Cluster aware client -library.
  18. -
-

There is an alternative way to import data from external instances to -a Valkey Cluster, which is to use the -valkey-cli --cluster import command.

-

The command moves all the keys of a running instance (deleting the -keys from the source instance) to the specified pre-existing Valkey -Cluster.

-

Note: If not for backward compatibility, the Valkey -project no longer uses the words “master” and “slave”. Unfortunately in -this command these words are part of the protocol, so we’ll be able to -remove such occurrences only when this API will be naturally -deprecated.

-

Learn more

- - - From 2b298fc3dd0754328678f0a29ba937a026cbabc0 Mon Sep 17 00:00:00 2001 From: avifenesh Date: Thu, 10 Jul 2025 04:58:50 +0000 Subject: [PATCH 10/25] Revise cluster tutorial structure and enhance command formatting for clarity Signed-off-by: avifenesh Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 91 +++++++++++++++++++++----------------- 1 file changed, 51 insertions(+), 40 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 19d5d78ab..047357cbc 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -191,18 +191,29 @@ in the `valkey.conf` file. To create and use a Valkey Cluster, follow these steps: -* [Create a Valkey Cluster](#create-a-valkey-cluster) -* [Interact with the cluster](#interact-with-the-cluster) -* [Write an example app with Valkey GLIDE](#write-an-example-app-with-valkey-glide) -* [Reshard the cluster](#reshard-the-cluster) -* [A more interesting example application](#a-more-interesting-example-application) -* [Test the failover](#test-the-failover) -* [Manual failover](#manual-failover) -* [Add a new node](#add-a-new-node) -* [Remove a node](#remove-a-node) -* [Replica migration](#replica-migration) -* [Upgrade nodes in a Valkey Cluster](#upgrade-nodes-in-a-valkey-cluster) -* [Migrate to Valkey Cluster](#migrate-to-valkey-cluster) +- [Valkey Cluster 101](#valkey-cluster-101) + - [Valkey Cluster TCP ports](#valkey-cluster-tcp-ports) + - [Valkey Cluster and Docker](#valkey-cluster-and-docker) + - [Valkey Cluster data sharding](#valkey-cluster-data-sharding) + - [Valkey Cluster primary-replica model](#valkey-cluster-primary-replica-model) + - [Valkey Cluster consistency guarantees](#valkey-cluster-consistency-guarantees) +- [Valkey Cluster configuration parameters](#valkey-cluster-configuration-parameters) +- [Create and use a Valkey Cluster](#create-and-use-a-valkey-cluster) + - [Requirements to create a Valkey Cluster](#requirements-to-create-a-valkey-cluster) + - [Create a Valkey Cluster](#create-a-valkey-cluster) + - [Interact with the cluster](#interact-with-the-cluster) + - [Write an example app with Valkey GLIDE](#write-an-example-app-with-valkey-glide) + - [Reshard the cluster](#reshard-the-cluster) + - [A more interesting example application](#a-more-interesting-example-application) + - [Test the failover](#test-the-failover) + - [Manual failover](#manual-failover) + - [Add a new node](#add-a-new-node) + - [Add a new node as a replica](#add-a-new-node-as-a-replica) + - [Remove a node](#remove-a-node) + - [Replica migration](#replica-migration) + - [Upgrade nodes in a Valkey Cluster](#upgrade-nodes-in-a-valkey-cluster) + - [Migrate to Valkey Cluster](#migrate-to-valkey-cluster) +- [Learn more](#learn-more) But, first, familiarize yourself with the requirements for creating a cluster. @@ -271,11 +282,11 @@ You can configure and execute individual instances manually or use the create-cl Let's go over how you do it manually. To create the cluster, run: - - valkey-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \ - 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \ - --cluster-replicas 1 - +```bash +valkey-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \ +127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \ +--cluster-replicas 1 +``` The command used here is **create**, since we want to create a new cluster. The option `--cluster-replicas 1` means that we want a replica for every primary created. @@ -297,7 +308,7 @@ system (but you'll not learn the same amount of operational details). Find the `utils/create-cluster` directory in the Valkey distribution. There is a script called `create-cluster` inside (same name as the directory -it is contained into), it's a simple bash script. In order to start +it is contained into), it's a bash script. In order to start a 6 nodes cluster with 3 primaries and 3 replicas just type the following commands: @@ -322,7 +333,7 @@ See the documentation for your [client of choice](../clients/) to determine its You can also test your Valkey Cluster using the `valkey-cli` command line utility: -``` +```bash $ valkey-cli -c -p 7000 127.0.0.1:7000> set foo bar -> Redirected to slot [12182] located at 127.0.0.1:7002 @@ -354,7 +365,7 @@ changed the cluster layout by adding or removing nodes. Before going forward showing how to operate the Valkey Cluster, doing things like a failover, or a resharding, we need to create some example application -or at least to be able to understand the semantics of a simple Valkey Cluster +or at least to be able to understand the semantics of a Valkey Cluster client interaction. In this way we can run an example and at the same time try to make nodes @@ -363,8 +374,8 @@ world conditions. It is not very helpful to see what happens while nobody is writing to the cluster. This section explains some basic usage of -[Valkey GLIDE for Node.js](https://github.com/valkey-io/valkey-glide/tree/main/node), the official -Valkey client library, showing a simple example application. +[Valkey GLIDE for Node.js](https://github.com/valkey-io/valkey-glide/tree/main/node), an official +Valkey client library, available in numerous languages, showing a practical example application in Node.js. The following example demonstrates how to connect to a Valkey cluster and perform basic operations. First, install the Valkey GLIDE client: @@ -442,10 +453,10 @@ async function runExample() { runExample().catch(console.error); ``` -The application does a very simple thing, it sets keys in the form `foo` to `number`, using batched MSET operations for better performance. The MSET command accepts an array of alternating keys and values. So if you run the program the result is batches of MSET commands: +The application writes keys in the format `foo` with their corresponding numeric values, using batched `MSET` operations for better performance. The `MSET` command accepts an array of alternating keys and values. So if you run the program the result is batches of `MSET` commands: -* MSET foo1 1 foo2 2 foo3 3 ... foo100 100 (batch of 100 keys) -* MSET foo101 101 foo102 102 ... foo200 200 (next batch) +* `MSET foo1 1 foo2 2 foo3 3 ... foo100 100` (batch of 100 keys) +* `MSET foo101 101 foo102 102 ... foo200 200` (next batch) * And so forth... The program includes comprehensive error handling to display errors instead of @@ -456,7 +467,7 @@ Valkey cluster client using a list of cluster *addresses* and configuration opti including a request timeout and client name. The addresses don't need to be all the nodes of the cluster. The important -thing is that at least one node is reachable. Valkey GLIDE automatically +thing is that at least one node is reachable. A cluster-aware client, such as Valkey GLIDE automatically discovers the complete cluster topology once it connects to any node. Now that we have the cluster client instance, we can use it like any other @@ -466,7 +477,7 @@ The **counter initialization section** reads a counter so that when we restart t we don't start again with `foo0`, but continue from where we left off. The counter is stored in Valkey itself using the key `__last__`. -The **main processing loop** sets keys in batches using MSET operations +The **main processing loop** sets keys in batches using `MSET` operations for better performance, processing 100 keys at a time and displaying progress or any errors that occur. you'll get the usually 10k ops/second in the best of the conditions). @@ -504,7 +515,7 @@ Like cluster creation, it is accomplished using the valkey-cli utility. To start a resharding, just type: - valkey-cli --cluster reshard 127.0.0.1:7000 + `valkey-cli --cluster reshard 127.0.0.1:7000` You only need to specify a single node, valkey-cli will find the other nodes automatically. @@ -549,7 +560,7 @@ during the resharding if you want. At the end of the resharding, you can test the health of the cluster with the following command: - valkey-cli --cluster check 127.0.0.1:7000 + `valkey-cli --cluster check 127.0.0.1:7000` All the slots will be covered as usual, but this time the primary at 127.0.0.1:7000 will have more hash slots, something around 6461. @@ -558,7 +569,7 @@ Resharding can be performed automatically without the need to manually enter the parameters in an interactive way. This is possible using a command line like the following: - valkey-cli --cluster reshard : --cluster-from --cluster-to --cluster-slots --cluster-yes + `valkey-cli --cluster reshard : --cluster-from --cluster-to --cluster-slots --cluster-yes` This allows to build some automatism if you are likely to reshard often, however currently there is no way for `valkey-cli` to automatically @@ -574,7 +585,7 @@ Note that this option can also be activated by setting the #### A more interesting example application The example application we wrote early is not very good. -It writes to the cluster in a simple way without even checking if what was +It writes to the cluster in a straightforward way without even checking if what was written is the right thing. From our point of view the cluster receiving the writes could just always @@ -582,14 +593,14 @@ write the key `foo` to `42` to every operation, and we would not notice at all. Now we can write a more interesting application for testing cluster behavior. -A simple consistency checking application that uses a set of counters, by default 1000, and sends `INCR` commands to increment the counters. +A comprehensive consistency checking application that uses a set of counters, by default 1000, and sends `INCR` commands to increment the counters. However instead of just writing, the application does two additional things: * When a counter is updated using `INCR`, the application remembers the write. * It also reads a random counter before every write, and check if the value is what we expected it to be, comparing it with the value it has in memory. -What this means is that this application is a simple **consistency checker**, +What this means is that this application is a **consistency checker**, and is able to tell you if the cluster lost some write, or if it accepted a write that we did not receive acknowledgment for. In the first case we'll see a counter having a value that is smaller than the one we remember, while @@ -765,7 +776,7 @@ We'll show both, starting with the addition of a new primary instance. In both cases the first step to perform is **adding an empty node**. -This is as simple as to start a new node in port 7006 (we already used +This is as straightforward as starting a new node in port 7006 (we already used from 7000 to 7005 for our existing 6 nodes) with the same configuration used for the other nodes, except for the port number, so what you should do in order to conform with the setup we used for the previous nodes: @@ -824,8 +835,8 @@ having as a target the empty node. Adding a new replica can be performed in two ways. The obvious one is to use valkey-cli again, but with the --cluster-replica option, like this: - valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica - + `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica +` Note that the command line here is exactly like the one we used to add a new primary, so we are not specifying to which primary we want to add the replica. In this case, what happens is that valkey-cli will add the new @@ -834,8 +845,8 @@ node as replica of a random primary among the primaries with fewer replicas. However you can specify exactly what primary you want to target with your new replica with the following command line: - valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e - + `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e +` This way we assign the new replica to a specific primary. A more manual way to add a replica to a specific primary is to add the new @@ -866,7 +877,7 @@ The node 3c3a0c... now has two replicas, running on ports 7002 (the existing one To remove a replica node just use the `del-node` command of valkey-cli: - valkey-cli --cluster del-node 127.0.0.1:7000 `` + `valkey-cli --cluster del-node 127.0.0.1:7000 ``` The first argument is just a random node in the cluster, the second argument is the ID of the node you want to remove. @@ -884,7 +895,7 @@ There is a special scenario where you want to remove a failed node. You should not use the `del-node` command because it tries to connect to all nodes and you will encounter a "connection refused" error. Instead, you can use the `call` command: - valkey-cli --cluster call 127.0.0.1:7000 cluster forget `` + `valkey-cli --cluster call 127.0.0.1:7000 cluster forget ``` This command will execute `CLUSTER FORGET` command on every node. From 915431ab057b8f25126ab0a80175a5d911262944 Mon Sep 17 00:00:00 2001 From: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> Date: Fri, 1 Aug 2025 16:40:34 +0300 Subject: [PATCH 11/25] Update topics/cluster-tutorial.md Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 047357cbc..ec4ee03ad 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -845,8 +845,7 @@ node as replica of a random primary among the primaries with fewer replicas. However you can specify exactly what primary you want to target with your new replica with the following command line: - `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e -` + `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e` This way we assign the new replica to a specific primary. A more manual way to add a replica to a specific primary is to add the new From 79add73b09e6c37c58267fa975d9c9203d009056 Mon Sep 17 00:00:00 2001 From: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> Date: Fri, 1 Aug 2025 16:40:45 +0300 Subject: [PATCH 12/25] Update topics/cluster-tutorial.md Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index ec4ee03ad..3e750163f 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -835,8 +835,7 @@ having as a target the empty node. Adding a new replica can be performed in two ways. The obvious one is to use valkey-cli again, but with the --cluster-replica option, like this: - `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica -` + `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica` Note that the command line here is exactly like the one we used to add a new primary, so we are not specifying to which primary we want to add the replica. In this case, what happens is that valkey-cli will add the new From 83a24d29b994af9e3f543efc76c9c3c9c3974bd2 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 1 Aug 2025 13:48:15 +0000 Subject: [PATCH 13/25] Fix markdown formatting issues: tag code blocks correctly and fix nested backticks Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 3e750163f..06bc0778f 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -609,7 +609,7 @@ in the second case the value will be greater. Running a consistency testing application produces a line of output every second: -``` +```bash node consistency-test.js 925 R (0 err) | 925 W (0 err) | 5030 R (0 err) | 5030 W (0 err) | @@ -835,7 +835,7 @@ having as a target the empty node. Adding a new replica can be performed in two ways. The obvious one is to use valkey-cli again, but with the --cluster-replica option, like this: - `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica` +`valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica` Note that the command line here is exactly like the one we used to add a new primary, so we are not specifying to which primary we want to add the replica. In this case, what happens is that valkey-cli will add the new @@ -844,7 +844,7 @@ node as replica of a random primary among the primaries with fewer replicas. However you can specify exactly what primary you want to target with your new replica with the following command line: - `valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e` +`valkey-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000 --cluster-replica --cluster-master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e` This way we assign the new replica to a specific primary. A more manual way to add a replica to a specific primary is to add the new @@ -875,7 +875,7 @@ The node 3c3a0c... now has two replicas, running on ports 7002 (the existing one To remove a replica node just use the `del-node` command of valkey-cli: - `valkey-cli --cluster del-node 127.0.0.1:7000 ``` + `valkey-cli --cluster del-node 127.0.0.1:7000 ` The first argument is just a random node in the cluster, the second argument is the ID of the node you want to remove. @@ -893,7 +893,7 @@ There is a special scenario where you want to remove a failed node. You should not use the `del-node` command because it tries to connect to all nodes and you will encounter a "connection refused" error. Instead, you can use the `call` command: - `valkey-cli --cluster call 127.0.0.1:7000 cluster forget ``` + `valkey-cli --cluster call 127.0.0.1:7000 cluster forget ` This command will execute `CLUSTER FORGET` command on every node. From c858bf1190800197960e08a708ce8e8c22f76cd5 Mon Sep 17 00:00:00 2001 From: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> Date: Fri, 1 Aug 2025 16:49:38 +0300 Subject: [PATCH 14/25] Apply suggestion from @avifenesh Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 06bc0778f..1821cc078 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -877,7 +877,7 @@ To remove a replica node just use the `del-node` command of valkey-cli: `valkey-cli --cluster del-node 127.0.0.1:7000 ` -The first argument is just a random node in the cluster, the second argument + `valkey-cli --cluster del-node 127.0.0.1:7000 ` is the ID of the node you want to remove. You can remove a primary node in the same way as well, **however in order to From c393a64acf2ccb8955eba463d715d67d7006e9db Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 1 Aug 2025 14:09:52 +0000 Subject: [PATCH 15/25] Apply suggestion from @avifenesh to improve section introduction Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 1821cc078..5557876cd 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -373,9 +373,8 @@ failing, or start a resharding, to see how Valkey Cluster behaves under real world conditions. It is not very helpful to see what happens while nobody is writing to the cluster. -This section explains some basic usage of -[Valkey GLIDE for Node.js](https://github.com/valkey-io/valkey-glide/tree/main/node), an official -Valkey client library, available in numerous languages, showing a practical example application in Node.js. +This section showcases the core functionality of Valkey through a practical Node.js application. +For our example, we will use the [Node.js version of GLIDE,](https://github.com/valkey-io/valkey-glide/tree/main/node) an official Valkey client library that supports multiple languages. The following example demonstrates how to connect to a Valkey cluster and perform basic operations. First, install the Valkey GLIDE client: From b538e31dad3030da207f12fa996c339fadb5ddc8 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 21:39:48 +0000 Subject: [PATCH 16/25] Initial plan Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> From 551cb2ea28c1eb5701429b054be263b3f367f58a Mon Sep 17 00:00:00 2001 From: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> Date: Tue, 5 Aug 2025 10:29:27 +0300 Subject: [PATCH 17/25] add missing words to wordlist Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- wordlist | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/wordlist b/wordlist index a5fc4a0c0..c90fede67 100644 --- a/wordlist +++ b/wordlist @@ -1056,3 +1056,13 @@ pre-allocation Valkey-Search's Chatbots GenAI +embeddings +expirations +Fintech +Json +namespaced +namespaces +pluggable +precomputed +Try-Valkey +uptime From 6113ff47a6da2332515476cbc85673ceb77974f4 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 21:39:48 +0000 Subject: [PATCH 18/25] Initial plan Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> From fb7fab9db3795cfc0af5e6b4202edf4f0183133d Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 8 Jul 2025 21:39:48 +0000 Subject: [PATCH 19/25] Initial plan Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> From b527e88ae34535be352fcd47b5c942ed2b9e7a5f Mon Sep 17 00:00:00 2001 From: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> Date: Tue, 5 Aug 2025 10:33:07 +0300 Subject: [PATCH 20/25] fix: ensure newline at end of file in wordlist Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- wordlist | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/wordlist b/wordlist index c90fede67..f73fd6c1c 100644 --- a/wordlist +++ b/wordlist @@ -1065,4 +1065,4 @@ namespaces pluggable precomputed Try-Valkey -uptime +uptime \ No newline at end of file From 67199c6494df166dc1fa343606bb34cfff77b8cf Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 5 Aug 2025 13:07:46 +0000 Subject: [PATCH 21/25] Replace MSET batching with individual SET operations to avoid cross-slot issues Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 42 +++++++++----------------------------- 1 file changed, 10 insertions(+), 32 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 5557876cd..8dd56726e 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -414,32 +414,19 @@ async function runExample() { console.log(`Starting from counter: ${last}`); - // Write keys in batches using mset for better performance - const batchSize = 100; - for (let start = last + 1; start <= 1000000000; start += batchSize) { + // Write keys sequentially with individual SET operations + for (let x = last + 1; x <= 1000000000; x++) { try { - const keyValuePairs = []; - const end = Math.min(start + batchSize - 1, 1000000000); + await client.set(`foo${x}`, x.toString()); - // Prepare batch of key-value pairs as array - for (let x = start; x <= end; x++) { - keyValuePairs.push(`foo${x}`, x.toString()); + // Update counter every 1000 operations and display progress + if (x % 1000 === 0) { + await client.set("__last__", x.toString()); + console.log(`Progress: ${x} keys written`); } - // Execute batch mset with array format - await client.mset(keyValuePairs); - - // Update counter and display progress - await client.set("__last__", end.toString()); - console.log(`Batch completed: ${start} to ${end}`); - - // Verify a sample key from the batch - const sampleKey = `foo${start}`; - const value = await client.get(sampleKey); - console.log(`Sample verification - ${sampleKey}: ${value}`); - } catch (error) { - console.log(`Error in batch starting at ${start}: ${error.message}`); + console.log(`Error writing key foo${x}: ${error.message}`); } } } catch (error) { @@ -452,11 +439,7 @@ async function runExample() { runExample().catch(console.error); ``` -The application writes keys in the format `foo` with their corresponding numeric values, using batched `MSET` operations for better performance. The `MSET` command accepts an array of alternating keys and values. So if you run the program the result is batches of `MSET` commands: - -* `MSET foo1 1 foo2 2 foo3 3 ... foo100 100` (batch of 100 keys) -* `MSET foo101 101 foo102 102 ... foo200 200` (next batch) -* And so forth... +The application writes keys in the format `foo` with their corresponding numeric values using individual `SET` operations. This approach ensures compatibility with cluster deployments where keys may be distributed across different nodes based on their hash slots. The program includes comprehensive error handling to display errors instead of crashing, so all cluster operations are wrapped in try-catch blocks. @@ -476,12 +459,7 @@ The **counter initialization section** reads a counter so that when we restart t we don't start again with `foo0`, but continue from where we left off. The counter is stored in Valkey itself using the key `__last__`. -The **main processing loop** sets keys in batches using `MSET` operations -for better performance, processing 100 keys at a time and displaying progress or -any errors that occur. -you'll get the usually 10k ops/second in the best of the conditions). - -you'll get optimal performance). +The **main processing loop** sets keys sequentially using individual `SET` operations, updating progress every 1000 keys and displaying any errors that occur. Starting the application produces the following output: From ba2d5c59186c72c04ebc00718b961c7c02682a56 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 6 Aug 2025 20:52:34 +0000 Subject: [PATCH 22/25] Update example output to match individual SET operations Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 8dd56726e..595efc3f5 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -467,12 +467,9 @@ Starting the application produces the following output: node example.js Connected to Valkey cluster Starting from counter: 0 -Batch completed: 1 to 100 -Sample verification - foo1: 1 -Batch completed: 101 to 200 -Sample verification - foo101: 101 -Batch completed: 201 to 300 -Sample verification - foo201: 201 +Progress: 1000 keys written +Progress: 2000 keys written +Progress: 3000 keys written ^C (I stopped the program here) ``` From 34234721102c9b61f3c9bd3a5fe3787b3ce97ed2 Mon Sep 17 00:00:00 2001 From: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> Date: Thu, 14 Aug 2025 20:13:07 +0300 Subject: [PATCH 23/25] Update topics/cluster-tutorial.md Co-authored-by: Madelyn Olson Signed-off-by: Avi Fenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 595efc3f5..ee41c75eb 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -398,7 +398,7 @@ async function runExample() { // Check `GlideClientConfiguration/GlideClusterClientConfiguration` for additional options. const client = await GlideClusterClient.createClient({ addresses: addresses, - // if the cluster nodes use TLS, you'll need to enable it. Otherwise the connection attempt will time out silently. + // if the cluster nodes use TLS, you'll need to enable it. // useTLS: true, // It is recommended to set a timeout for your specific use case requestTimeout: 500, // 500ms timeout From 3db73f986f87dacd935d45fe894e7e1e3871a940 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 14 Aug 2025 17:25:28 +0000 Subject: [PATCH 24/25] Restore original Ruby logic in Node.js implementation and prepare follow-up issue Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 51 ++++++++++++++++++++++++++------------ 1 file changed, 35 insertions(+), 16 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index ee41c75eb..8edbe877a 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -409,24 +409,37 @@ async function runExample() { console.log("Connected to Valkey cluster"); // Get the last counter value, or start from 0 - let last = await client.get("__last__"); - last = last ? parseInt(last) : 0; + let last = false; + while (!last) { + try { + last = await client.get("__last__"); + last = last || 0; + } catch (error) { + console.log(`Error getting counter: ${error.message}`); + await new Promise(resolve => setTimeout(resolve, 1000)); + } + } console.log(`Starting from counter: ${last}`); - // Write keys sequentially with individual SET operations - for (let x = last + 1; x <= 1000000000; x++) { + // Write keys sequentially with verification, following original Ruby logic + for (let x = parseInt(last) + 1; x <= 1000000000; x++) { try { + // Set the key await client.set(`foo${x}`, x.toString()); - // Update counter every 1000 operations and display progress - if (x % 1000 === 0) { - await client.set("__last__", x.toString()); - console.log(`Progress: ${x} keys written`); - } + // Get and verify the value + const value = await client.get(`foo${x}`); + console.log(value); + + // Update the counter + await client.set("__last__", x.toString()); + + // Add delay equivalent to Ruby's sleep 0.1 + await new Promise(resolve => setTimeout(resolve, 100)); } catch (error) { - console.log(`Error writing key foo${x}: ${error.message}`); + console.log(`Error: ${error.message}`); } } } catch (error) { @@ -439,7 +452,7 @@ async function runExample() { runExample().catch(console.error); ``` -The application writes keys in the format `foo` with their corresponding numeric values using individual `SET` operations. This approach ensures compatibility with cluster deployments where keys may be distributed across different nodes based on their hash slots. +The application writes keys in the format `foo` to `number`, one after the other, following the same logic as the original Ruby example. For each key, it performs a SET operation, immediately verifies the value with a GET operation, updates a counter, and includes a small delay between operations. The program includes comprehensive error handling to display errors instead of crashing, so all cluster operations are wrapped in try-catch blocks. @@ -455,11 +468,11 @@ discovers the complete cluster topology once it connects to any node. Now that we have the cluster client instance, we can use it like any other Valkey client to perform operations across the cluster. -The **counter initialization section** reads a counter so that when we restart the example +The **counter initialization section** reads a counter with retry logic so that when we restart the example we don't start again with `foo0`, but continue from where we left off. The counter is stored in Valkey itself using the key `__last__`. -The **main processing loop** sets keys sequentially using individual `SET` operations, updating progress every 1000 keys and displaying any errors that occur. +The **main processing loop** sets keys sequentially, immediately verifies each value with a GET operation, updates the counter after each key, and includes a 100ms delay between operations to match the original Ruby example's timing. Starting the application produces the following output: @@ -467,9 +480,15 @@ Starting the application produces the following output: node example.js Connected to Valkey cluster Starting from counter: 0 -Progress: 1000 keys written -Progress: 2000 keys written -Progress: 3000 keys written +1 +2 +3 +4 +5 +6 +7 +8 +9 ^C (I stopped the program here) ``` From 9420d95723a9edd4ba2b06193f88bd71034b6cc2 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 14 Aug 2025 17:31:24 +0000 Subject: [PATCH 25/25] Remove Ruby references from Node.js example comments and descriptions Co-authored-by: avifenesh <55848801+avifenesh@users.noreply.github.com> --- topics/cluster-tutorial.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/topics/cluster-tutorial.md b/topics/cluster-tutorial.md index 8edbe877a..1af47aea3 100644 --- a/topics/cluster-tutorial.md +++ b/topics/cluster-tutorial.md @@ -422,7 +422,7 @@ async function runExample() { console.log(`Starting from counter: ${last}`); - // Write keys sequentially with verification, following original Ruby logic + // Write keys sequentially with verification for (let x = parseInt(last) + 1; x <= 1000000000; x++) { try { // Set the key @@ -452,7 +452,7 @@ async function runExample() { runExample().catch(console.error); ``` -The application writes keys in the format `foo` to `number`, one after the other, following the same logic as the original Ruby example. For each key, it performs a SET operation, immediately verifies the value with a GET operation, updates a counter, and includes a small delay between operations. +The application writes keys in the format `foo` to `number`, one after the other. For each key, it performs a SET operation, immediately verifies the value with a GET operation, updates a counter, and includes a small delay between operations. The program includes comprehensive error handling to display errors instead of crashing, so all cluster operations are wrapped in try-catch blocks.