Bitmaps in Dragonfly: Compact Data with Powerful Analytics
Learn how to use Bitmaps for efficient data handling in Dragonfly. Explore key commands, master bit-level operations, and dive into real-world use cases like user retention and feature flags.
September 30, 2024
Introduction
Did you know that you can represent huge amounts of binary data very compactly and work with it using just a few commands? That's where the Bitmap data type in Dragonfly comes in. Under the hood, bitmaps are stored as String values, but what makes them special is the ability to perform powerful bit-level operations. Whether you're counting active users across millions of entries or performing complex bitwise calculations, bitmaps offer a super-efficient way to handle binary data. Let's dive in and explore the related commands and use cases in this post!
Bitmap vs. String Data Type
A bitmap in Dragonfly is stored as a binary representation within a string value, so it is the same data type under the hood. While you can technically use bitmap-related commands on any string, it is recommended not to mix bitmap operations with regular string operations unless you are fully aware of the implications. Each bit in a bitmap can store a 0 or 1 value, offering a compact and efficient way to represent a large number of binary states. This makes bitmaps a natural choice for use cases where each bit acts as a flag or binary state, allowing for more focused manipulation than with typical string operations.
Let's take a look at some key commands for working with bitmaps:
- SETBIT: Set a specific bit in a bitmap to either 1 or 0.
dragonfly$> SETBIT my_bitmap 1001 1
(integer) 0
The command above sets the bit at zero-indexed position 1001 to 1. The return value indicates the previous value of that bit.
- GETBIT: Get the value of a specific bit in a bitmap.
dragonfly$> GETBIT my_bitmap 1000
(integer) 0
dragonfly$> GETBIT my_bitmap 1001
(integer) 1
This command returns the bit value at a position, showing whether that bit is set to 1 or 0. In the example above, the bit at position 1000 is 0, while the bit at position 1001 is 1 (as we set it in the previous command).
- BITCOUNT: Count the number of bits set to 1 in a bitmap.
dragonfly$> BITCOUNT my_bitmap
(integer) 1
This command counts bits set to 1 in the bitmap. Since we set the bit at position 1001 to 1 and that's the only bit set to 1, the count is 1.
- BITOP: Perform bitwise operations on multiple bitmaps and store the result in a new bitmap.
# Set the first bit to 1 in the first source bitmap.
dragonfly$> SETBIT source_bitmap_01 0 1
(integer) 0
# Set the second bit to 1 in the second source bitmap.
dragonfly$> SETBIT source_bitmap_02 1 1
(integer) 0
# Perform a bitwise OR operation on the two source bitmaps and store the result in a new bitmap.
# The command returns the length of the resulting bitmap/string in bytes.
dragonfly$> BITOP OR result source_bitmap_01 source_bitmap_02
(integer) 1
dragonfly$> BITCOUNT result
(integer) 2
The command above performs a bitwise OR operation on the two source bitmaps and stores the result in a new bitmap. Because the result has both the first and the second bit set, BITCOUNT returns 2.
- Let's try using regular string commands on a bitmap to see what happens:
dragonfly$> SETBIT my_bitmap 1001 1
(integer) 0
dragonfly$> STRLEN my_bitmap
(integer) 126
As you can see, we can technically use regular string commands on a bitmap, but if the command is not a read-only operation, it might lead to unexpected results. Note also that we set the bit at position 1001 to 1, so this bitmap must be able to store at least 1002 bits (the index is zero-based). Round 1002 bits up to the nearest multiple of 8 (as each byte stores 8 bits), and we get 1008 bits, which is 126 bytes.
- Last but not least, the BITFIELD command allows us to perform multiple bit-level operations in a single command, such as setting, getting, and incrementing bits. It is one of the most versatile and comprehensive commands for working with bitmaps, which also takes integer encoding into account, and you are encouraged to explore its capabilities in the documentation.
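To make the byte-level layout behind the commands above more concrete, here is a minimal in-memory sketch in Python. It is only a model for illustration, not a Dragonfly client: the `BitmapModel` class and its method names are hypothetical, and it assumes the Redis-compatible convention that bit 0 is the most significant bit of the first byte and that the string grows with zero bytes as needed.

```python
class BitmapModel:
    """In-memory stand-in for a single bitmap key (illustration only)."""

    def __init__(self) -> None:
        self.data = bytearray()

    def setbit(self, offset: int, value: int) -> int:
        byte_index, bit_index = divmod(offset, 8)
        # Grow with zero bytes so that the target byte exists.
        if byte_index >= len(self.data):
            self.data.extend(bytes(byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit_index  # bit 0 is the most significant bit of a byte
        previous = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return previous  # like SETBIT, report the old bit value

    def getbit(self, offset: int) -> int:
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            return 0  # bits beyond the end of the string read as 0
        return 1 if self.data[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self) -> int:
        return sum(bin(byte).count("1") for byte in self.data)

    def strlen(self) -> int:
        return len(self.data)


my_bitmap = BitmapModel()
previous = my_bitmap.setbit(1001, 1)
print(previous, my_bitmap.getbit(1000), my_bitmap.getbit(1001))
print(my_bitmap.bitcount(), my_bitmap.strlen())
```

Under these assumptions the model reproduces the values seen in the CLI session: the previous bit value is 0, one bit is set, and the backing string is 126 bytes long.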
Now that we've covered the essential commands for working with bitmaps, let's explore some practical use cases where these bit-level operations can truly shine.
Use Case: Counting Monthly User Retention
Let's consider an example where we have a dataset with 100 million users. We can use a bitmap to track monthly user activity by assigning each user an ID and setting their corresponding bit if they were active that month. Note that in this case we are assuming that each user is represented by a unique integer ID, and the bit position in the bitmap corresponds to the user ID.
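As a rough sketch of how activity could be recorded, the helper below sets a user's bit in the bitmap for the month of a given date. The function names and the `monthly_users_YYYY_MM` key pattern are assumptions made for illustration, and `client` is assumed to be any object exposing a redis-py style `setbit(name, offset, value)` method.

```python
from datetime import date


def monthly_key(day: date) -> str:
    """Build a key such as 'monthly_users_2024_08' (hypothetical naming scheme)."""
    return f"monthly_users_{day.year:04d}_{day.month:02d}"


def mark_active(client, user_id: int, day: date) -> int:
    """Set the user's bit in that month's bitmap and return the previous bit value."""
    if user_id < 0:
        raise ValueError("user_id must be a non-negative integer")
    # The bit offset is the user ID itself, as described above.
    return client.setbit(monthly_key(day), user_id, 1)
```

Because setting a bit that is already 1 leaves the bitmap unchanged, the helper can be called on every request without double-counting a user.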
For instance, we might have bitmaps for August (monthly_users_2024_08) and September (monthly_users_2024_09). By using the BITCOUNT command, we can quickly count the number of active users in a specific month:
dragonfly$> BITCOUNT monthly_users_2024_08
To see which users were active in both months, we can use the BITOP AND command:
dragonfly$> BITOP AND result monthly_users_2024_08 monthly_users_2024_09
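The result bitmap now has a bit set only for users who were active in both months, and running BITCOUNT on result gives the number of retained users. This provides an efficient way to compute retention, identifying users who were active across multiple periods.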
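The retention computation can be sketched in application code as follows. This is a minimal example under stated assumptions: the function names are hypothetical, and `client` is assumed to expose redis-py style `bitop(operation, dest, *keys)` and `bitcount(key)` methods.

```python
def retained_users(client, previous_key: str, current_key: str, dest_key: str) -> int:
    """Count users whose bit is set in both monthly bitmaps."""
    # BITOP AND stores the intersection in dest_key, overwriting any old value.
    client.bitop("AND", dest_key, previous_key, current_key)
    return client.bitcount(dest_key)


def retention_rate(client, previous_key: str, current_key: str, dest_key: str) -> float:
    """Share of the previous month's active users who were active again."""
    base = client.bitcount(previous_key)
    if base == 0:
        return 0.0  # avoid dividing by zero when nobody was active
    return retained_users(client, previous_key, current_key, dest_key) / base
```

For example, `retention_rate(df, 'monthly_users_2024_08', 'monthly_users_2024_09', 'result')` would return the fraction of August's active users who came back in September, where `df` is a connected client.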
However, it's important to take into account the memory usage and command complexity when working with larger bitmaps:
- Memory Usage: When dealing with 100 million users, each bitmap consumes around 12.5MB of memory (since 100 million bits equals roughly 12.5MB). While this may seem relatively small for monthly user tracking, it's important to consider that if you're tracking users on a weekly, daily, or even hourly basis, the memory requirements can add up significantly. For comparison, 12.5MB would be a large value for a single key holding a regular string, such as a cache entry.
- Command Complexity: Both BITCOUNT and BITOP commands operate with a time complexity of O(N), meaning their running time is proportional to the size of the bitmap. Dragonfly is highly optimized and handles large datasets exceptionally fast, so managing big keys and hot keys is generally not a problem. However, for specialized analytics operations like this, it might be worth considering using a smaller, dedicated Dragonfly instance just for data analysis tasks. This separation can help avoid any interference with high-throughput operations on the main instance.
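The memory arithmetic above can be checked with a short back-of-the-envelope calculation. The sketch assumes one bit per user, user IDs that run up to roughly 100 million (the size of a bitmap is determined by its highest set bit), decimal megabytes, and that every bitmap is kept for a full year. The function name and the list of tracking granularities are illustrative.

```python
def bitmap_bytes(user_count: int) -> int:
    """Bytes needed for one bit per user, rounded up to whole bytes."""
    return (user_count + 7) // 8


USERS = 100_000_000
per_bitmap_mb = bitmap_bytes(USERS) / 1_000_000  # decimal megabytes

granularities = [("monthly", 12), ("weekly", 52), ("daily", 365), ("hourly", 365 * 24)]
for label, bitmaps_per_year in granularities:
    total_gb = per_bitmap_mb * bitmaps_per_year / 1_000
    print(f"{label:>7}: {bitmaps_per_year:>5} bitmaps/year, about {total_gb:,.2f} GB")
```

A single bitmap comes out at 12.5MB, so the total grows linearly with the number of periods retained, which is why finer-grained tracking benefits from an expiration or archival policy.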
Use Case: Real-Time Feature Flags with Bitmap
Let's say we're managing global feature flags for an application, where each feature can be toggled on or off for all users. A bitmap provides a memory-efficient way to track whether a feature is globally enabled (1) or disabled (0). In the backend application code, we may use a Python enum class to manage these feature flags programmatically.
For example, let's define a set of features using an enum class in Python:
from enum import Enum
from redis import Redis as Dragonfly
# Connect to Dragonfly with a client SDK.
df = Dragonfly(host='localhost', port=6379)
# The key for storing global features.
GLOBAL_FEATURES = 'global_features'
# Define features using an enum class.
class Features(Enum):
    NEW_DASHBOARD = 0
    DARK_MODE = 1
    BETA_SIGNUP = 2
Each feature corresponds to a bit position in a global bitmap. By using SETBIT, we can enable or disable these features in real time.
To globally enable the NEW_DASHBOARD feature:
# Enable the NEW_DASHBOARD feature.
df.setbit(GLOBAL_FEATURES, Features.NEW_DASHBOARD.value, 1)
To disable the DARK_MODE feature:
# Disable the DARK_MODE feature.
df.setbit(GLOBAL_FEATURES, Features.DARK_MODE.value, 0)
You can check the status of a feature with GETBIT:
# Check if the NEW_DASHBOARD feature is enabled.
enabled = df.getbit(GLOBAL_FEATURES, Features.NEW_DASHBOARD.value)
This setup allows us to manage the application's global feature flags in real time with minimal overhead, and we can dynamically add or remove features as needed by adjusting the bit positions. Similar ideas can be applied to user-specific feature flags, where each user has a unique bitmap to track their individual feature preferences.
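When an application needs the state of every flag at once, one possible approach is to read the whole bitmap with a single GET and decode it locally instead of issuing one GETBIT per feature. The sketch below is an illustration only: `enabled_features` is a hypothetical helper, the `Features` enum is repeated so the block stands alone, and the decoding assumes the Redis-compatible layout in which bit 0 is the most significant bit of the first byte.

```python
from enum import Enum
from typing import Optional, Set


class Features(Enum):
    NEW_DASHBOARD = 0
    DARK_MODE = 1
    BETA_SIGNUP = 2


def enabled_features(raw: Optional[bytes]) -> Set[Features]:
    """Decode the raw bitmap bytes into the set of enabled features."""
    enabled: Set[Features] = set()
    if not raw:
        return enabled  # a missing or empty key means every flag reads as 0
    for feature in Features:
        byte_index, bit_index = divmod(feature.value, 8)
        # Bits beyond the end of the string are treated as 0.
        if byte_index < len(raw) and raw[byte_index] & (0x80 >> bit_index):
            enabled.add(feature)
    return enabled
```

With the client from the earlier snippet, this could be called as `enabled_features(df.get(GLOBAL_FEATURES))`, assuming the client returns raw bytes (the redis-py default when `decode_responses` is not enabled).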
Conclusion
The bitmap data type in Dragonfly offers powerful and efficient bitwise operations that can handle massive binary flags with ease. Whether you're tracking monthly user retention for millions of users or managing feature flags in a real-time system, bitmap commands enable quick calculations with efficient memory usage. If you haven't already, give Dragonfly a try to experience the next-generation in-memory data store that's built for speed and scalability.