Dragonfly

Understanding Redis Performance and 6 Ways to Boost Performance

Redis uses a memory-first architecture, combined with a simple protocol and optimized binary encoding, which lets it achieve more than 100k queries per second.

July 1, 2025


What is Redis and How Does It Deliver High Performance?

Redis is an in-memory key-value data store designed for low-latency, high-throughput workloads. Unlike traditional databases that rely on disk-based storage, Redis keeps all data in RAM, which enables sub-millisecond response times for both read and write operations. This memory-first architecture, combined with a simple protocol and optimized binary encoding, enables Redis to achieve more than 100,000 queries per second. This makes Redis ideal for use cases like caching, real-time analytics, session management, and message brokering.

In addition to fast memory access, Redis supports a variety of purpose-built data structures such as strings, hashes, lists, sets, and sorted sets, each tailored for specific patterns of data access. These structures are implemented using highly efficient algorithms, ensuring consistent performance even under heavy load. 

Redis also provides features like atomic operations, transactions via MULTI/EXEC, and support for Lua scripting, enabling complex logic to be executed server-side with minimal overhead. Additionally, Redis offers persistence options to support data durability and recovery after restarts. These capabilities make Redis a versatile tool for building responsive, scalable, and resilient applications.

Understanding the Redis Threading Model

Although Redis is often described as single-threaded, this only applies to how it handles core data structure operations. In reality, Redis employs a hybrid model where the main thread manages most command processing, while auxiliary threads are used for specific tasks.

For day-to-day workloads—like reading and writing data—the main thread handles all requests. This design simplifies the internal logic, as there’s no need for thread synchronization or locking mechanisms, which are typically required in multi-threaded environments. However, Redis does offload certain tasks to other threads. Below are a few examples:

  • Starting in version 4.0, Redis introduced asynchronous memory release (via the UNLINK command) to prevent large key deletions from blocking the main thread. 
  • Redis 6.0 added support for multi-threaded I/O to reduce pressure on the core processing thread in high-concurrency scenarios.
  • Persistence tasks like RDB snapshotting and AOF rewriting are offloaded to forked child processes, since they are time-consuming and would otherwise stall the main thread.

This threading model strikes a balance between simplicity and performance: Redis avoids the complexity and contention of multi-threaded data handling while still leveraging auxiliary threads for less critical tasks to keep the system responsive.
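As a concrete example of the asynchronous-free path, DEL on a very large key blocks the main thread until the whole value is reclaimed, while UNLINK unlinks the key immediately and leaves reclamation to a background thread (key name illustrative):

```
DEL big:hash       # synchronous: blocks until all memory is freed
UNLINK big:hash    # returns at once; freeing happens off the main thread
```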

Why is Redis So Fast Despite Being Single-Threaded?

Redis achieves high throughput—up to hundreds of thousands of operations per second on modern, powerful servers—largely due to its design choices around memory, data structures, and I/O handling. The main factors that make Redis fast are:

  • Data resides in memory: This eliminates the latency associated with disk reads and writes, allowing operations to complete almost instantaneously. Redis also uses efficient data structures like hash tables, lists, and sorted sets, optimized for in-memory access patterns. Most operations run in constant or near-constant time.
  • Redis uses I/O multiplexing: This allows a single thread to handle thousands of simultaneous client connections. It relies on system-level event handling mechanisms like epoll, kqueue, or select to monitor sockets and respond to activity without blocking. This lets Redis serve multiple clients efficiently with minimal CPU overhead.
  • Redis avoids CPU-intensive tasks and uses a simple, lock-free execution model. This minimizes context switching and synchronization delays. 
  • Redis Cluster: When additional performance is needed, scaling horizontally with Redis Cluster is preferred over introducing complex multi-threaded processing within a single instance.
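The multiplexing model described above can be sketched with Python's standard selectors module, which wraps the same OS readiness mechanisms (epoll/kqueue/select) that Redis's event loop is built on. This is a toy stdlib sketch, not Redis code:

```python
import selectors
import socket

# One thread, many sockets: register them with the OS and act only on the
# ones reported ready — the readiness model Redis's event loop relies on.
sel = selectors.DefaultSelector()

def serve_ready_sockets(timeout: float = 1.0) -> None:
    for key, _mask in sel.select(timeout=timeout):
        conn = key.fileobj
        data = conn.recv(1024)          # reported ready, so this won't block
        conn.sendall(b"+PONG\r\n" if data == b"PING\r\n" else data)

client, server = socket.socketpair()
sel.register(server, selectors.EVENT_READ)

client.sendall(b"PING\r\n")             # client writes first...
serve_ready_sockets()                   # ...so the server side is ready
reply = client.recv(1024)
```

A real server would keep looping over `sel.select()` and also register a listening socket for new connections; the point is that one thread services every connection without blocking on any single one.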

Testing Redis Performance with redis-benchmark

The redis-benchmark utility is a built-in tool used to measure the performance of a Redis instance. It works by simulating concurrent clients sending a configurable number of requests, allowing developers to understand throughput and latency under various conditions. These instructions below are adapted from the Redis documentation.

Note: This tutorial uses the simple built-in redis-benchmark tool. For more realistic scenarios, more advanced tools such as memtier_benchmark or YCSB are recommended.

Basic Usage

To run a simple benchmark, ensure a Redis server is running, then execute:

redis-benchmark -q -n 100000

This command performs 100,000 requests using default settings and prints the requests per second for each test. By default, 50 parallel clients target localhost on port 6379.

Running Specific Tests

You can limit the benchmark to specific commands using the -t option. For example:

redis-benchmark -t set,lpush -n 100000 -q

This runs 100,000 SET and LPUSH commands in quiet mode, showing results for those operations only.

Custom Keyspace and Randomization

To simulate a more realistic workload with varied keys, use the -r flag:

redis-benchmark -t set -r 100000 -n 1000000

This executes 1,000,000 SET operations using keys randomized across 100,000 possible values, stressing Redis’s performance under cache-miss scenarios.

Using Pipelining

Redis supports pipelining, where multiple commands are sent in a single request, minimizing network latency:

redis-benchmark -n 1000000 -t set,get -P 16 -q

Here, each client sends 16 commands at once, drastically improving throughput—depending on the hardware, this can exceed a million operations per second.
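Under the hood, pipelining is just several RESP-encoded commands concatenated into a single write, costing one network round-trip instead of sixteen. The encoder below is a from-scratch sketch of the RESP wire format, not a client library:

```python
# RESP encoding: *<nargs>\r\n, then $<len>\r\n<arg>\r\n per argument.
def encode_command(*parts: str) -> bytes:
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

# 16 SET commands, sent as one buffer (mirroring redis-benchmark -P 16).
pipeline = b"".join(
    encode_command("SET", f"key:{i}", str(i)) for i in range(16)
)
```

Client libraries expose this as a pipeline object; the wire-level effect is exactly this concatenation.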

redis-benchmark Best Practices and Pitfalls

  • Avoid false comparisons: Benchmarking Redis against systems measured without network round-trips, or that speak a different protocol, yields misleading numbers.
  • Use multiple connections: Realistic tests should involve concurrency—single-threaded or naive client-side tests mostly measure latency, not Redis performance.
  • Understand limitations: By default, redis-benchmark doesn’t use pipelining or client-side multi-threading unless explicitly specified. This means the results might represent a lower-bound estimate of Redis’s potential.
  • Match real workloads: Use pipeline sizes and data sizes similar to your application to get realistic benchmarks.

Hardware Considerations

Fast CPUs with large caches, low-latency networks, and running on bare metal (instead of VMs) all contribute to better benchmark results. Additionally, using Unix domain sockets instead of TCP/IP can boost local throughput.

A standalone Redis instance cannot fully utilize multiple cores, so deploying a single instance on a very powerful machine wastes capacity. Consider Redis Cluster instead, even on a single multi-core machine.

Best Practices for Optimizing Redis Performance

1. Memory Management

Redis operates mostly in memory, so efficient memory usage directly impacts both performance and stability. Set the maxmemory directive to cap memory consumption and prevent Redis from exhausting system RAM. Use eviction policies like allkeys-lru or volatile-lfu to remove less recently or frequently used keys when memory limits are reached.

To monitor memory use, regularly check metrics like used_memory_rss, mem_fragmentation_ratio, and run MEMORY STATS and MEMORY DOCTOR to diagnose inefficiencies. Fragmentation above 1.5x often signals issues in memory allocation or release. Also, compact data types (e.g., small hashes, sets, and lists) use listpack or ziplist encodings, which are more memory-efficient—these are enabled automatically under certain thresholds but can be tuned via configuration (e.g., hash-max-listpack-entries, set-max-listpack-entries).
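A sketch of the relevant redis.conf directives (values illustrative; the listpack-named thresholds are the Redis 7 spellings, with ziplist-named equivalents in older versions):

```
maxmemory 4gb
maxmemory-policy allkeys-lru

# Keep small hashes in the compact listpack encoding.
hash-max-listpack-entries 128
hash-max-listpack-value 64
```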

Avoid storing large blobs (e.g., media files) or deeply nested structures. Instead, use external object storage (like S3) and cache metadata or access tokens in Redis.

2. Data Structure Optimization

Choosing the right data structure for a task can reduce both memory and CPU use. For example:

  • Hashes: Ideal for storing multiple fields under one key, reducing key overhead. Works well for user profiles or session data.
  • Lists: Suitable for simple queues or ordered logs, but performance degrades with length. Use LPUSH/RPOP or RPUSH/LPOP for efficient operations.
  • Sets and Sorted Sets: Powerful for uniqueness and ranking logic, but require careful memory budgeting. Sorted sets are CPU-intensive and grow quickly in memory usage if scores and values are long or numerous.
  • Bitmaps and HyperLogLogs: Ideal for efficiently tracking unique events, user activity, or boolean flags while minimizing memory usage.

Use pipelining when writing large batches to reduce network round-trips. Reevaluate data model choices periodically, especially as usage patterns evolve.
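As a sketch, a user profile kept in one hash instead of many top-level string keys (key and field names are made up for illustration):

```
HSET user:1000 name "Ada" email "ada@example.com" visits 41
HGET user:1000 email
HINCRBY user:1000 visits 1
```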

3. Persistence Configuration

Redis supports two main persistence mechanisms:

  1. RDB Snapshot: Periodically saves the dataset to disk. Fast and space-efficient but risks data loss if Redis crashes between saves. Controlled via the save directives.
  2. AOF (Append Only File): Logs every write operation. Safer for durability but may use more disk space and I/O. Controlled via appendfsync (options: always, everysec, no).

In write-heavy environments, appendfsync everysec is a good compromise, offering durability with balanced I/O impact. For large datasets, use both RDB and AOF together to speed up restarts while maintaining persistence.

Enable lazyfree options for deletion-heavy workloads to offload memory cleanup to background threads (lazyfree-lazy-eviction, lazyfree-lazy-expire, etc.), reducing latency spikes during key deletions.
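A hedged redis.conf sketch combining these settings (values illustrative; lazyfree-lazy-user-del requires Redis 6+):

```
appendonly yes
appendfsync everysec

# Save an RDB snapshot if at least 1000 keys changed within 60 seconds.
save 60 1000

# Free large values in a background thread instead of inline.
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
lazyfree-lazy-user-del yes
```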

4. Network and Client Configuration

Redis is optimized for high-throughput, low-latency communication, but misconfigured networks or clients can create bottlenecks. Configure:

  • tcp-backlog: Increases queue length for pending connections.
  • client-output-buffer-limit: Prevents slow clients from exhausting memory. Set custom thresholds for different client types (normal, pubsub, and replica).

For local deployments, use Unix domain sockets (via unixsocket) for lower latency than TCP/IP. On the client side, use connection pools and persistent connections. Avoid frequent reconnects that waste time on handshakes.
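The corresponding redis.conf directives, with illustrative values (each client-output-buffer-limit triple is hard limit, soft limit, and soft seconds):

```
tcp-backlog 511

# Local deployments: skip the TCP stack entirely.
unixsocket /var/run/redis/redis.sock
unixsocketperm 700

# class  hard-limit  soft-limit  soft-seconds
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
```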

5. Command and Query Optimization

Redis commands vary widely in time complexity. Favor O(1) or O(log N) commands for high-throughput paths. Avoid operations like:

  • KEYS * (blocks while scanning the whole keyspace; prefer incremental SCAN, and keep its COUNT values moderate).
  • ZRANGE or SMEMBERS on large sets without limits.
  • Frequent use of DEL or FLUSHALL during peak loads.

Use Lua scripts with EVALSHA for complex atomic operations—this reduces network round-trips and avoids race conditions. When modifying multiple keys, batch operations together using multi-key commands (e.g., MSET, MGET) or pipelines. Regularly profile slow commands using SLOWLOG to identify bottlenecks. Use Redis modules or application-level abstractions to encapsulate expensive logic, moving it server-side where appropriate.
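EVALSHA addresses a script by the SHA-1 of its source: the body is uploaded once with SCRIPT LOAD, and every later call sends only the 40-character digest. A sketch of the client-side half (the Lua script is the classic compare-and-delete pattern, shown purely as an illustration):

```python
import hashlib

# The server caches the script under sha1(source); clients invoke it by
# digest, so the script body crosses the network only once.
script = b"""
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""

sha = hashlib.sha1(script).hexdigest()
# A client would then run: EVALSHA <sha> 1 lock:resource <token>
```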

6. Scaling and High Availability

Redis reaches the limits of vertical scaling quickly, and horizontal scaling is often needed for large or distributed applications. Use the following:

  • Sentinel: Provides monitoring, notification, and automatic failover for a single-master replication setup. It’s simple to operate but does not shard data.
  • Redis Cluster: Enables horizontal scaling via sharding. Data is partitioned across nodes using hash slots. Establish key naming conventions to achieve even slot distribution—by using {hashtags} to co-locate related keys.
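Key-to-slot mapping is deterministic: CRC16 (XMODEM variant) of the key, modulo 16384, hashing only the {hashtag} when one is present. A stdlib re-implementation of the documented algorithm:

```python
# Redis Cluster maps each key to one of 16384 hash slots. If the key
# contains a non-empty {hashtag}, only the tag is hashed — which is how
# related keys are pinned to the same slot (and the same node).
def crc16(data: bytes) -> int:
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:             # tag must be non-empty
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

Because both keys below share the {user1000} tag, multi-key operations on them stay on one node.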

For read scaling, add replicas (REPLICAOF) and direct read-only queries to them. Monitor replication lag to ensure data freshness.

Set min-replicas-to-write and min-replicas-max-lag to ensure writes are only accepted if enough replicas are within an acceptable lag threshold, reducing the possibility of data loss in case of primary failure.
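In redis.conf, this corresponds to (values illustrative):

```
# Refuse writes unless at least 1 replica is connected and
# its reported lag is 10 seconds or less.
min-replicas-to-write 1
min-replicas-max-lag 10
```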

For more advanced needs, Redis Enterprise or third-party tools can offer active-active replication, multi-region support, and strong eventual consistency via CRDTs.


Dragonfly: The Next-Generation In-Memory Data Store

Dragonfly is a modern, source-available, multi-threaded, Redis-compatible in-memory data store that stands out by delivering unmatched performance and efficiency. Designed from the ground up to disrupt existing legacy technologies, Dragonfly redefines what an in-memory data store can achieve. With Dragonfly, you get the familiar API of Redis without the performance bottlenecks, making it an essential tool for modern cloud architectures aiming for peak performance and cost savings. Migrating from Redis to Dragonfly requires zero or minimal code changes.

Key Advancements of Dragonfly

  • Multi-Threaded Architecture: Efficiently leverages modern multi-core processors to maximize throughput and minimize latency.
  • Unmatched Performance: Achieves 25x better performance than Redis, ensuring your applications run with extremely high throughput and consistent latency.
  • Cost Efficiency: Reduces hardware and operational costs without sacrificing performance, making it an ideal choice for budget-conscious enterprises.
  • Redis API Compatibility: Offers seamless integration with existing Redis applications and frameworks while overcoming its limitations.
  • Innovative Design: Built to scale vertically and horizontally, providing a robust solution for rapidly growing data needs.

Dragonfly Cloud is a fully managed service from the creators of Dragonfly, handling all operations and delivering effortless scaling so you can focus on what matters without worrying about in-memory data infrastructure anymore.
