Dragonfly SSD Data Tiering: Caching All the Way Down

What is Dragonfly SSD Data Tiering?

In Dragonfly v1.21, we introduced an exciting new alpha feature: SSD Data Tiering. This feature allows Dragonfly to offload part of its in-memory values, specifically those of the String data type, to fast SSD storage. By doing so, it intelligently manages memory usage, keeping frequently accessed hot data in memory while moving less-used data to disk. This helps optimize valuable memory resources even further without compromising on performance—high throughput and sub-millisecond average latency are still maintained. This advanced feature makes it an ideal choice for applications with large datasets.

Dragonfly SSD Data Tiering: Intelligently Offloading Data from Memory to SSD

Now that we know what SSD tiering is, let's explore why this feature is crucial and how you can take advantage of it in your applications.

Why SSD Tiering Matters

Just like the famous phrase "turtles all the way down", Dragonfly SSD tiering adds a new layer to create an even more cost-efficient storage solution. There are other popular and interesting sayings in computing as well:

All problems in computer science can be solved by another level of indirection, such as caching.

Caches are so critical to computer systems that it sometimes seems like caching is the only performance-improving idea in systems.

These ideas highlight how different layers of caching in both hardware and software can help optimize performance and resource utilization. For example:

CPUs themselves have multiple cache layers (L1, L2, L3, etc.) to quickly access frequently used data without reaching out to the slower main memory.
The main memory (RAM) acts as a caching layer between the CPUs and the disk, providing fast access to data without the need to read from the slower disks.
In-memory data stores serve as a caching layer for backend services, bridging the gap between slower, disk-based storage systems and fast RAM.

Memory & Storage Hierarchy (Brown University, CS 131, 2020)

Before the release of the SSD tiering feature in Dragonfly, backend developers had to actively maintain their cache layers between disk-based storage systems (like databases or object stores) and RAM. This involved manually deleting unnecessary keys, relying on automatic key expirations, or configuring eviction policies to free up memory.

With the introduction of SSD tiering, Dragonfly adds this additional spontaneous layer of caching. When this feature is enabled, Dragonfly automatically manages the data across memory and SSD storage, moving less frequently accessed values from RAM to SSD without needing to remove them from the data store. This allows us to leverage the speed of in-memory operations while effectively utilizing the capacity of SSD storage, achieving a balanced, cost-efficient storage solution for large datasets.

How to Use SSD Tiering in Dragonfly

Enabling SSD tiering in Dragonfly is straightforward and can be done using the --tiered_prefix server flag. Here's a step-by-step explanation on how to set it up and observe its effects based on our comprehensive demo below during the 2024-08 Community Catch-Up session. Remember that this is not a full benchmark but a presentation to showcase the SSD tiering feature.

Let's start by running a Dragonfly instance with a max memory limit of 28GB and keep SSD tiering disabled for now. Note that we are using the --dir flag to specify the working directory for Dragonfly and the --dbfilename flag to specify the name of the snapshot file.

./dragonfly --logtostderr --maxmemory=28GB --dir /mnt/vol1 --dbfilename backup

Next, we use a utility program to populate the instance with approximately 24.72GiB (26.54GB) of data to utilize the in-memory storage fully. The key-value pairs of the String data type being stored are generated randomly with a size of 1KB each. Once the instance is almost full, we can see that more than 27.5 million key-value pairs are stored in the instance.

dragonfly$> DBSIZE
(integer) 27510987

Then, we stop the Dragonfly instance, and it also creates a snapshot file in the working directory automatically. After the snapshot file is created and the instance is stopped, we restart Dragonfly, but this time with only a 14GB max memory limit and SSD tiering enabled. Note that we are using the --tiered_prefix flag to point to the NVMe SSD drive mounted at /mnt/ssd/vol1.

./dragonfly --logtostderr --maxmemory=14GB --dir /mnt/vol1 --dbfilename backup \
            --tiered_prefix /mnt/ssd/vol1

Upon restarting, Dragonfly automatically loads data from the snapshot file created above. After the restoration is done, we can verify that the instance still contains the exact same number of key-value pairs (27,510,987) as before. However, the memory usage is significantly lower, which doesn't go beyond 14GB, because Dragonfly has offloaded some of the data to the SSD.

To understand the impact on performance, benchmark the Dragonfly instance with SSD tiering enabled. We can observe that the throughput might be slightly lower and the latency somewhat higher compared to an instance normally we run without SSD tiering. Despite the slight decrease in throughput and increase in latency, the performance remains highly competitive. The median and average latency still stay in the sub-millisecond range (0.3ms median and 0.7ms average using the GCP instance in the video), and the throughput remains high. In exchange for this slight performance trade-off, we gain the benefit of utilizing SSD storage to store data, saving valuable memory resources. This demonstrates that SSD tiering offers a balanced approach to optimizing both memory usage and storage costs while maintaining robust performance.

Considerations for SSD Tiering

When deciding whether to enable SSD tiering in your Dragonfly instance, consider the following benefits and limitations.

Benefits of SSD Tiering

Optimized Memory Usage: SSD tiering spontaneously manages data by moving less frequently accessed values from RAM to SSD storage, ensuring that only the most critical hot data stays in fast and valuable memory.
Cost-Effective Scaling: By leveraging SSDs, which are much cheaper than RAM, tiered storage allows you to scale up data capacity without incurring high costs.
Support for Large Datasets: With SSD tiering, you can store and manage datasets that exceed the limit of available RAM, providing a flexible solution for applications requiring substantial data capacity without a considerable hardware upgrade.

What SSD Tiering Is Not For?

Not for Backups: SSD tiering is not a form of snapshotting or backup. It is not designed to store data permanently. Dragonfly still manages all the keys in memory, only offloading the values to SSD. If a Dragonfly instance is restarted, the data in the SSD tier will be lost.
Not for Primary Database: SSD tiering is not a replacement for a on-disk primary database. It is designed to complement in-memory storage by offloading a portion of the data to SSD.
Not for Ultra-Low Latency Applications: SSDs are slower than RAM, so SSD tiering is not suitable for applications that require ultra-low latency and ultra-high throughput. For such applications, it is recommended to use Dragonfly with in-memory storage only.

Conclusion

Dragonfly SSD tiering offers a new level of efficiency for managing large datasets by seamlessly moving less-used data to SSD storage while keeping hot data in memory. This approach helps optimize memory usage without compromising too much on performance.

Currently available as an alpha feature in the community edition starting v1.21, SSD tiering is just the beginning. We're looking forward to bringing this feature to our cloud offering soon. We believe this feature will significantly enhance your caching and data management experience, and we hope you find it just as powerful as we do. Don't hesitate to get started! Sign up for Dragonfly Cloud or check out the community edition on our GitHub repository to unlock the next-generation in-memory data store experience.