TL;DR
We are pleased to announce that Dragonfly 1.0, the most performant in-memory datastore for cloud workloads, is now generally available. We’re also excited to share that we’ve raised $21M from fantastic investors (Redpoint and Quiet Capital) who share our vision of enabling the software development community to power high-scale, real-time applications and accelerate human innovation. Dragonfly 1.0 comes with full support for Redis’ most common data types and commands, as well as snapshotting, replication, and high availability.
Background
For the last 15 years, Redis has been the primary technology for developers looking to provide a real-time experience to their users. Over this period, the amount of data the average application uses has increased dramatically, as has the hardware available to serve that data. Readily available cloud instances today have at least 10X the CPU cores and 100X the memory of their counterparts from 15 years ago. However, the single-threaded design of Redis has not evolved to meet modern data demands or to take full advantage of modern hardware.
The misalignment between the data demands of modern applications and the underlying system design of Redis has hindered developer productivity and innovation. The largest web-scale companies in the world have attempted to solve the problem by throwing massive amounts of resources at it, including large dedicated teams who solely own the caching and data layer. Developers at companies that are not technology giants like Google, Facebook, or Microsoft have been saddled with unnecessary overhead and complexity and forced to make very tough choices. They are forced to choose between keeping data workloads small, which limits the experience they can deliver in their application, or dedicating time and resources to managing complex and expensive infrastructure configurations (which comes at the expense of building valuable features for customers). My co-founder Roman and I experienced this tradeoff firsthand as heavy Redis users in our previous roles (we worked together at both Google and Ubimo, and Roman was a principal engineer on AWS ElastiCache, probably the largest managed in-memory service in the world).
As engineers, we want a single Redis endpoint that is ultra-performant, extremely efficient in its memory utilization, and, most importantly, scales effortlessly. We don’t want to be forced to choose between delivering a great user experience and keeping our infrastructure simple and affordable. As founders, we want to provide a solution that gives developers superpowers to build any application with the peace of mind that the infrastructure will support them in their quest to innovate and not bog them down. This is why we built Dragonfly.
While we are just beginning this journey, the excitement around Dragonfly and its adoption by a diverse group of developers have energized us since we made the project public nine months ago. In that short period, many developers have replaced their legacy Redis instances with Dragonfly, increasing performance while reducing infrastructure complexity. Most of them came to Dragonfly because they needed more scale, and they’ve stayed because their latency has dropped, their infrastructure bill has gone down, and it just works.
Dragonfly 1.0
Dragonfly is a drop-in Redis replacement that can, unlike Redis, scale vertically to support millions of operations per second and terabyte-sized workloads, all on a single instance. This makes a single instance of the Dragonfly community edition as powerful as a cluster of Redis instances and much more cost effective than any managed Redis service.
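Because Dragonfly speaks the Redis protocol, existing application code can usually be pointed at a Dragonfly endpoint without modification. As a minimal sketch (assuming a local Dragonfly instance listening on the default port 6379 and using the redis-py client), the following runs unchanged whether the server behind it is Redis or Dragonfly:

```python
# Minimal sketch: an unmodified Redis client (redis-py) talking to a local
# Dragonfly instance. Host, port, and key names are illustrative assumptions.
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("user:42:name", "Ada")            # string commands work as with Redis
print(r.get("user:42:name"))            # -> "Ada"

r.lpush("events", "login", "purchase")  # so do list, hash, set, and sorted-set commands
print(r.lrange("events", 0, -1))        # -> ["purchase", "login"]
```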
Since its launch in May last year, our team has dedicated countless hours to enhancing the performance, stability, and resilience of Dragonfly. Our incredible community has identified hundreds of issues and contributed fixes for many of them as well. Our early adopters have helped harden Dragonfly through their production usage and have brought us to where we are today: Dragonfly 1.0. We encourage you to put it to the test, push its limits, and share your feedback with us. We eagerly await your thoughts and insights!
Dragonfly 1.0 is the result of a large investment in development hours that we have made in four critical areas: performance, scale, efficiency, and reliability.
Performance
The best applications in the world have an incredibly snappy user experience with real-time responses and interactions. Building these types of applications requires a constant flow of data, transmitted with low latency and high efficiency. If the underlying infrastructure cannot handle peak throughput demand, the application will experience delays, lags, or dropouts, which can significantly impact its quality and usability. The largest, most successful software companies in the world know that even 10 additional milliseconds of latency can have a material impact on their business, and they invest billions of dollars in ensuring peak performance.
Dragonfly was designed so that any engineering team, not only the ones with the most resources, can deliver this type of experience. Dragonfly uses asynchronous, multi-threaded processing to fully utilize the hardware’s compute, memory, and networking resources, delivering consistent sub-millisecond latency with throughput as high as 4 million queries per second (25X that of Redis) on a single instance.
Achieving this level of performance from the underlying infrastructure simplifies application code and makes it much easier for developers to build real-time applications and pipelines.
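The headline numbers come from many parallel connections driving large multi-core instances; a single client will not reproduce them. Still, as a rough sketch (assuming a local instance on port 6379 and the redis-py client), pipelining is an easy way to see how much throughput even one connection can push:

```python
# Rough throughput probe, for illustration only: pipeline a batch of SETs so
# the network round trip is amortized over many commands. Host, port, key
# names, and batch size are illustrative assumptions.
import time
import redis

r = redis.Redis(host="localhost", port=6379)

N = 10_000
start = time.perf_counter()
pipe = r.pipeline(transaction=False)
for i in range(N):
    pipe.set(f"bench:key:{i}", "x" * 64)
pipe.execute()
elapsed = time.perf_counter() - start
print(f"{N} pipelined SETs in {elapsed:.3f}s (~{N / elapsed:,.0f} ops/s)")
```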
Scale
Scaling a software architecture to meet the demands of a growing business is one of the biggest challenges in the software industry and involves a constant battle of trade-offs. It is particularly challenging for in-memory workloads, because compute and storage cannot be decoupled the way they can for stateless systems.
A common tradeoff we see in architectures built on Redis is between scale (memory) and complexity: keep the architecture simple and limit scale, or adopt a more complicated cluster architecture in order to scale your data. However, with modern hardware there is no reason to scale out for workloads of dozens of gigabytes of memory or 100,000 queries per second.
We designed Dragonfly to take advantage of modern hardware and scale vertically first, which is as simple as increasing the size of a cloud instance. A single Dragonfly instance can scale up to 4 million queries per second and 1 terabyte of data, allowing it to accommodate a wide range of in-memory use cases. For those seeking even more, keep an eye out for upcoming Dragonfly releases!
Efficiency
We are big believers that systems should be efficient, minimizing wasted resources as much as possible. Efficiency not only saves money on hardware, it also simplifies many of the operational challenges that come with modern software architectures. This is why we optimized Dragonfly’s main hashtable to require 30% less memory for small keys compared to Redis. We also added memory optimizations for popular data types: strings are up to 12.5% more memory-efficient and sets up to 40% more memory-efficient than in other in-memory stores. This allows developers to do more with fewer resources and translates to immediate cost savings.
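Figures like these depend on key and value sizes, so it is worth measuring with your own data. A simple back-of-the-envelope approach (assuming the server reports used_memory via INFO, as both Redis and Dragonfly do, and using illustrative key names) is to load a batch of small keys and compare memory before and after:

```python
# Sketch of a per-key memory estimate: read used_memory from INFO, load N
# small keys, and read it again. Results vary with data shapes and server
# configuration; this is a rough comparison aid, not a benchmark.
import redis

r = redis.Redis(host="localhost", port=6379)

def used_memory() -> int:
    return int(r.info("memory")["used_memory"])

N = 100_000
before = used_memory()
pipe = r.pipeline(transaction=False)
for i in range(N):
    pipe.set(f"mem:key:{i}", "v")
pipe.execute()
after = used_memory()
print(f"~{(after - before) / N:.1f} bytes per small key")
```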
Reliability
Reliability is critical for all systems, but especially so for in-memory systems. Snapshotting and replication are two of the most common causes of outages in Redis, and we knew from the get-go that we had to make these processes highly reliable in Dragonfly. Dragonfly sustains 7.5X higher throughput during replication than Redis, and its snapshotting phase is 12X faster. The result is that developers no longer need to fret that snapshotting or replication will bring down their instance. They just work as expected.
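Operationally, these features are driven through the same Redis-compatible commands you already know. As a sketch (assuming two local instances, a primary on port 6379 and a replica on port 6380, and that REPLICAOF, BGSAVE, and LASTSAVE behave as they do in Redis), setting up replication and taking a snapshot looks like this:

```python
# Sketch of the usual operational commands, pointed at Dragonfly instead of
# Redis. Hosts and ports are illustrative assumptions.
import redis

primary = redis.Redis(host="localhost", port=6379)
replica = redis.Redis(host="localhost", port=6380)

# Point the second instance at the first; it performs a full sync and then
# streams subsequent changes, as a Redis replica would.
replica.execute_command("REPLICAOF", "localhost", "6379")

# Trigger a point-in-time snapshot on the primary.
primary.bgsave()
print(primary.lastsave())  # time of the last successful snapshot
```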
The Future
Dragonfly 1.0 is a huge milestone, but it’s only the beginning of the journey for us and for the community that has formed around this project. While we believe the four critical areas outlined above are essential for a modern in-memory datastore, we also believe that any great developer tool should be simple to use. Building simple things is hard, which is why we’ve taken almost a year to arrive at this GA and why we’ve raised capital to continue investing in building this system the right way. We want to enable all developers and companies to create ultra-fast applications that are both simple and scalable.
There are still many challenges to solve in order for us to fulfill our vision of unleashing the next wave of developer innovation. We will invest this new capital in continuing to evolve Dragonfly so that it can address more use cases, handle larger workloads, and make the developer experience simpler.
The first step in this next phase of development will be using SSD storage to transparently extend main memory while preserving Dragonfly’s low-latency characteristics. This will allow for even more efficient use of hardware and a much lower total cost of ownership.
We will also be launching a cloud offering of Dragonfly for those who do not want to manage the underlying infrastructure; you can sign up for the waitlist here.
If you would like to follow along on our journey, provide feedback, or join the community, please join our Discord or follow us on Twitter.