Migrating from a Redis Cluster to Dragonfly on a single node
In this blog post, you will learn how to migrate data from a Redis Cluster to a single-node Dragonfly instance.
May 9, 2023
Introduction
At its core, Redis is a single-threaded system. To scale horizontally, you need to use a deployment topology called Redis Cluster that automatically partitions data across multiple Redis nodes. Instead of consistent hashing (which is a commonly used technique in other systems), Redis employs a different form of sharding where every key is conceptually part of a hash slot.
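To make the hash slot idea concrete, here is a minimal Go sketch of the calculation as the Redis Cluster specification describes it: the CRC16 checksum of the key, modulo 16384. When a key contains a non-empty hash tag (the text between the first `{` and the next `}`), only that tag is hashed. The function names `crc16`, `hashTag`, and `keySlot` are illustrative and not part of any client library.

```go
package main

import (
	"fmt"
	"strings"
)

// slotCount is the fixed number of hash slots in a Redis Cluster.
const slotCount = 16384

// crc16 implements CRC-16/XMODEM (polynomial 0x1021, initial value 0),
// the checksum variant used for Redis Cluster key hashing.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = (crc << 1) ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// hashTag returns the part of the key that is actually hashed: the text
// between the first '{' and the next '}' if it is non-empty, else the whole key.
func hashTag(key string) string {
	open := strings.IndexByte(key, '{')
	if open < 0 {
		return key
	}
	end := strings.IndexByte(key[open+1:], '}')
	if end <= 0 { // no closing brace, or an empty "{}"
		return key
	}
	return key[open+1 : open+1+end]
}

// keySlot maps a key to one of the 16384 hash slots.
func keySlot(key string) int {
	return int(crc16([]byte(hashTag(key)))) % slotCount
}

func main() {
	for _, key := range []string{"user:1", "user:2", "{user}:1", "{user}:2"} {
		fmt.Printf("%-10s -> slot %d\n", key, keySlot(key))
	}
}
```

Keys that share a hash tag, such as `{user}:1` and `{user}:2`, always land in the same slot, which is the usual way to keep related keys together in a cluster.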
Once you move from a single Redis instance to a Redis Cluster, you enter the world of distributed systems. That brings benefits, but also quite a few limitations:
- Shard/Node management - This includes adding/removing nodes, rebalancing data, and so on.
- Dealing with eventual consistency and possible data loss - Redis replication is asynchronous, which can lead to stale reads from replicas. Even worse, there can be data loss due to primary node failures and split-brain scenarios.
- Application-level complexities - Redis Cluster does not support multi-key operations on keys that belong to different slots. Doing so results in the (in)famous CROSSSLOT error.
Just like "The Best Code is No Code At All", perhaps the ideal distributed system is one that doesn't need to be distributed. In the case of Redis, you can achieve this by using a single-node solution like Dragonfly.
Thanks to its API compatibility, Dragonfly can act as a drop-in replacement for Redis. Dragonfly addresses most of the shortcomings of Redis Cluster, thereby dramatically reducing operational complexity and improving reliability.
- Dragonfly relies on vertically scaling a single node. This makes it quicker, cheaper, and more predictable than Redis Cluster and other multi-node approaches.
- Unlike Redis, Dragonfly has a multi-threaded, shared-nothing architecture that can scale vertically up to 1TB of memory on each instance.
In this blog post, you will learn how to migrate data from a Redis Cluster to a single-node Dragonfly instance. We will use Dragonfly's emulated cluster mode that makes it easier to migrate your existing application(s) from a Redis Cluster to Dragonfly. We will use a sample application to demonstrate the migration process and cover everything step by step.
Prerequisites
Before you begin, make sure you have the following installed:
- Docker and Docker Compose
- Redis CLI
- RIOT (for migration)
- Go programming language (version 1.18 or above)
Clone the repository and navigate to the correct directory:
git clone git@github.com:dragonflydb/redis-cluster-application-example.git
cd redis-cluster-application-example
Start Dragonfly and Redis Cluster
To keep things simple, you will use Docker Compose:
docker compose -p migrate-to-dragonflydb up
Verify that the Docker containers are running:
docker compose -p migrate-to-dragonflydb ps
Expected output:
NAME COMMAND SERVICE STATUS PORTS
migrate-to-dragonflydb-dragonfly-1 "entrypoint.sh drago…" dragonfly running (healthy) 0.0.0.0:6380->6379/tcp
migrate-to-dragonflydb-rediscluster-1 "docker-entrypoint.s…" rediscluster running 6379/tcp, 0.0.0.0:7000-7002->7000-7002/tcp
From the Redis container, start a Redis Cluster:
docker exec -it migrate-to-dragonflydb-rediscluster-1 /bin/bash
./start-cluster.sh
# expected output
Redis Cluster ready
At this point, you should have:
- A Dragonfly instance accessible on port 6380
- A three-node Redis Cluster accessible on ports 7000, 7001, and 7002
Before starting the migration process, let's quickly take a look at the sample application.
Application Overview
You can access the source code here under /app.
The application is a web service written in Go that serves data from Redis with the Go Redis client. During startup, the application loads sample data into the Redis Cluster: dummy user data, with each user stored as a HASH.
// main.go
func loadData() {
	fmt.Println("loading sample data into redis.....")

	// The same map is reused for every user; both fields are overwritten on each iteration.
	user := map[string]string{}

	for i := 0; i < 100; i++ {
		key := "user:" + strconv.Itoa(i)
		name := "user-" + strconv.Itoa(i)
		email := name + "@foo.com"
		user["name"] = name
		user["email"] = email

		err := client.HMSet(context.Background(), key, user).Err()
		if err != nil {
			log.Fatal("failed to load data", err)
		}
	}

	fmt.Println("data load complete")
}
It also provides a REST API to fetch user data given its ID.
func main() {
	r := mux.NewRouter()

	r.HandleFunc("/{id}", func(w http.ResponseWriter, r *http.Request) {
		id := mux.Vars(r)["id"]
		hashName := "user:" + id
		fmt.Println("getting data for", hashName)

		// Read the two hash fields and scan them into the User struct.
		var user User
		err := client.HMGet(context.Background(), hashName, "name", "email").Scan(&user)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}

		err = json.NewEncoder(w).Encode(user)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
	})

	log.Fatal(http.ListenAndServe(":8080", r))
}
Once the application is running, you can use any HTTP client (curl, for example) to query data from Redis using the API exposed by the application. More on this later.
This blog post makes use of RIOT, which is a data migration tool for Redis. Instead of using the REPLICAOF command, it implements client-side replication using DUMP & RESTORE, or type-based replication (e.g. GET and SET), and it can run the migration in snapshot or live mode.
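The general shape of a snapshot-style, client-side replication can be sketched in Go as follows. This is not RIOT's implementation: the `node` type is a hypothetical in-memory stand-in for a Redis-compatible server, and plain map reads and writes play the roles of SCAN, DUMP, and RESTORE. The sketch only illustrates the loop of reading keys from every source node and writing them to a single target in fixed-size batches.

```go
package main

import (
	"fmt"
	"sort"
)

// node is an in-memory stand-in for one Redis-compatible server.
// Values are opaque serialized payloads, similar in spirit to DUMP output.
type node map[string][]byte

// scan returns every key name, sorted for deterministic output (role of SCAN).
func (n node) scan() []string {
	keys := make([]string, 0, len(n))
	for k := range n {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// entry is one key together with its serialized value.
type entry struct {
	key     string
	payload []byte
}

// replicate copies every key from the source nodes into target, writing
// batchSize entries at a time. It returns the keys copied and batches written.
func replicate(sources []node, target node, batchSize int) (copied, batches int) {
	if batchSize < 1 {
		batchSize = 1
	}
	pending := make([]entry, 0, batchSize)
	flush := func() {
		if len(pending) == 0 {
			return
		}
		for _, e := range pending {
			target[e.key] = e.payload // role of RESTORE
		}
		copied += len(pending)
		batches++
		pending = pending[:0]
	}
	for _, src := range sources {
		for _, key := range src.scan() {
			pending = append(pending, entry{key, src[key]}) // role of DUMP
			if len(pending) == batchSize {
				flush()
			}
		}
	}
	flush() // write whatever is left over
	return copied, batches
}

// seed spreads total sample keys across the given number of source nodes.
func seed(total, nodes int) []node {
	sources := make([]node, nodes)
	for i := range sources {
		sources[i] = node{}
	}
	for i := 0; i < total; i++ {
		sources[i%nodes][fmt.Sprintf("user:%d", i)] = []byte(fmt.Sprintf("payload-%d", i))
	}
	return sources
}

func main() {
	target := node{}
	copied, batches := replicate(seed(100, 3), target, 10)
	fmt.Printf("copied %d keys in %d batches\n", copied, batches)
}
```

A real migration tool also has to deal with TTLs, large values, failures, and verification, all of which this sketch leaves out.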
Preparing for the migration
1. Start the client application
In a new terminal:
cd app
export REDIS_HOSTS="localhost:7000,localhost:7001,localhost:7002"
export LOAD_DATA=true
go run main.go
The application will load seed data (100 HASHes) into the source Redis Cluster. You should see the following output:
loading sample data into redis.....
data load complete
2. Verify data in source Redis Cluster
Once the data load is complete, connect to each node in the Redis Cluster and check the number of keys:
redis-cli -p 7000 -c DBSIZE
# output
(integer) 28
redis-cli -p 7001 -c DBSIZE
# output
(integer) 36
redis-cli -p 7002 -c DBSIZE
# output
(integer) 36
If you add them up, the total should be 100, the same number of keys our application loaded during startup. The exact split across nodes may differ in your setup.
You can test the application using curl:
curl -X GET http://localhost:8080/1
# expected output
{"name":"user-1","email":"user-1@foo.com"}
curl -X GET http://localhost:8080/2
# expected output
{"name":"user-2","email":"user-2@foo.com"}
You have verified that the application is working as expected.
Migrate data from Redis Cluster to a single node Dragonfly instance (in Cluster Mode)
1. Verify data in Dragonfly
redis-cli -p 6380 -c DBSIZE
# expected output
(integer) 0
As expected, there is no data in the Dragonfly instance yet.
2. Use the RIOT tool to initiate the migration
Use the following riot command:
riot --info -h localhost -p 7000 --cluster replicate-ds -h localhost -p 6380 --batch 10
We are using Redis Cluster as the source (note the port number and the --cluster flag). This will migrate data from the Redis Cluster, reached through node localhost:7000, to the Dragonfly instance at localhost:6380. The --batch flag specifies the number of keys to be migrated in a single batch.
You should see similar output from the migration job (timings and other statistics might differ):
Executing step: [snapshot-replication]
Job: [SimpleJob: [name=scan-reader]] launched with the following parameters: [{}]
Executing step: [scan-reader]
Scanning 0% │ │ 0/10 (0:00:00 / ?) ?/s
Step: [scan-reader] executed in 79ms
Closing with items still in queue
Job: [SimpleJob: [name=scan-reader]] completed with the following parameters: [{}] and the following status: [COMPLETEScanning 100% │██████████████████████████████████████████████████████████████│ 100/100 (0:00:00 / 0:00:00) ?/s
Step: [snapshot-replication] executed in 856ms
Executing step: [verification]
Job: [SimpleJob: [name=RedisItemReader]] launched with the following parameters: [{}]
Executing step: [RedisItemReader]
Verifying 0% │ │ 0/10 (0:00:00 / ?) ?/s
Step: [RedisItemReader] executed in 147ms
Closing with items still in queue
Job: [SimpleJob: [name=RedisItemReader]] completed with the following parameters: [{}] and the following status: [COMPVerifying 100% │██████████████████████████████│ 100/100 (0:00:00 / 0:00:00) ?/s >0 T0 ≠0 ⧗0
Verification completed - all OK
Step: [verification] executed in 368ms
Job: [SimpleJob: [name=snapshot-replication]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 1s443ms
3. Verify data in Dragonfly
redis-cli -p 6380 -c DBSIZE
# expected output
(integer) 100
redis-cli -p 6380 -c HMGET user:42 name email
# expected output
1) "user-42"
2) "user-42@foo.com"
If the migration was successful, you should see 100 items in the Dragonfly instance.
4. Switch the application to use Dragonfly
Before you start the application, shut down the previous instance to avoid a collision on port 8080:
export REDIS_HOSTS="localhost:6380"
export LOAD_DATA=false
go run main.go
# expected output
connected to redis localhost:6380
Notice that we changed the REDIS_HOSTS value to point to Dragonfly and that we are no longer using the data load option. This will just start the web server and serve data from Dragonfly.
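One way an application could read this kind of configuration is sketched below. This is an assumption for illustration, not the sample application's actual code: `parseHosts` and `shouldLoadData` are hypothetical helpers, and only the variable names REDIS_HOSTS and LOAD_DATA come from the steps above. The point is that a three-node cluster list and a single Dragonfly address pass through the same code path.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseHosts splits a comma-separated host list into individual addresses,
// trimming whitespace and dropping empty entries.
func parseHosts(raw string) []string {
	var addrs []string
	for _, part := range strings.Split(raw, ",") {
		if addr := strings.TrimSpace(part); addr != "" {
			addrs = append(addrs, addr)
		}
	}
	return addrs
}

// shouldLoadData reports whether the LOAD_DATA value asks for seed data.
func shouldLoadData(raw string) bool {
	return strings.EqualFold(strings.TrimSpace(raw), "true")
}

func main() {
	addrs := parseHosts(os.Getenv("REDIS_HOSTS"))
	load := shouldLoadData(os.Getenv("LOAD_DATA"))
	fmt.Println("addresses:", addrs, "load data:", load)
}
```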
You can test it like this:
curl -X GET http://localhost:8080/4
curl -X GET http://localhost:8080/5
curl -X GET http://localhost:8080/42
The application will continue to work as expected and serve data from Dragonfly.
All we did was point to Dragonfly instead of Redis Cluster with no code changes required.
Once you have completed the steps in this tutorial, use this command to stop the Redis Cluster and Dragonfly instance:
docker compose -p migrate-to-dragonflydb down -v
Conclusion and next steps
In this tutorial, we used the RIOT tool to migrate data from a Redis Cluster to Dragonfly without making any application code changes! Thanks to the Dragonfly Cluster Mode, a single Dragonfly instance can achieve the same capacity as a multi-node Redis Cluster. The migration technique you would use depends on the source cluster. Options range from the REPLICAOF command (Dragonfly supports a primary/secondary replication model) to the MIGRATE command or even a custom application.
Due to Dragonfly's hardware efficiency, you can run a single-node instance on a small 8GB machine and scale vertically to large 768GB machines with 64 cores. This reduces infrastructure costs and decreases the complexity that's inherent in distributed data systems. Rather than spending your time and effort maintaining a Redis Cluster, you can use Dragonfly, which provides the same semantics as single-node Redis, along with the ability to emulate a Redis Cluster if required. At the time of writing, Dragonfly has implemented more than 200 Redis commands, which represents good coverage for the vast majority of use cases.
If you want to learn more, check out our documentation on how to fire up a Dragonfly instance, then bring along a client of your choice and start building with familiar Redis commands!