Bitmaps in Dragonfly: Compact Data with Powerful Analytics

Learn how to use Bitmaps for efficient data handling in Dragonfly. Explore key commands, master bit-level operations, and dive into real-world use cases like user retention and feature flags.

September 30, 2024

Introduction

Did you know that you can represent huge amounts of binary data compactly and query it with just a few commands? That's where the Bitmap data type in Dragonfly comes in. Under the hood, bitmaps are stored as String values, but what makes them special is the set of bit-level operations available on them. Whether you're counting active users across millions of entries or combining several bitmaps with bitwise operations, bitmaps offer an efficient way to handle binary data. Let's explore the related commands and use cases in this post!


Bitmap vs. String Data Type

A bitmap in Dragonfly is stored as a binary representation within a string value, so it is the same data type under the hood. While you can run bitmap-related commands on any string, we recommend not mixing bitmap operations with regular string operations on the same key unless you are fully aware of the implications. Each bit in a bitmap stores a 0 or a 1, offering a compact way to represent a large number of binary states. This makes bitmaps a natural choice for use cases where each bit acts as a flag, since individual bits can be read and written directly instead of rewriting the whole string.

Let's take a look at some key commands for working with bitmaps:

  1. SETBIT: Set a specific bit in a bitmap to either 1 or 0.

    dragonfly$> SETBIT my_bitmap 1001 1
    (integer) 0
    

    The command above sets the bit at zero-indexed position 1001 to 1. The return value indicates the previous value of that bit.

  2. GETBIT: Get the value of a specific bit in a bitmap.

    dragonfly$> GETBIT my_bitmap 1000
    (integer) 0
    dragonfly$> GETBIT my_bitmap 1001
    (integer) 1
    

    This command returns the bit value at a position, showing whether that bit is set to 1 or 0. In the example above, the bit at position 1000 is 0, while the bit at position 1001 is 1 (as we set it in the previous command).

  3. BITCOUNT: Count the number of bits set to 1 in a bitmap.

    dragonfly$> BITCOUNT my_bitmap
    (integer) 1
    

    This command counts bits set to 1 in the bitmap. Since we set the bit at position 1001 to 1 and that's the only bit set to 1, the count is 1.

  4. BITOP: Perform bitwise operations on multiple bitmaps and store the result in a new bitmap.

    # Set the first bit to 1 in the first source bitmap.
    dragonfly$> SETBIT source_bitmap_01 0 1
    (integer) 0
    
    # Set the second bit to 1 in the second source bitmap.
    dragonfly$> SETBIT source_bitmap_02 1 1
    (integer) 0
    
    # Perform a bitwise OR operation on the two source bitmaps and store the result in a new bitmap.
    # The command returns the length of the resulting bitmap/string in bytes.
    dragonfly$> BITOP OR result source_bitmap_01 source_bitmap_02
    (integer) 1  
    dragonfly$> BITCOUNT result
    (integer) 2
    

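    The BITOP command above performs a bitwise OR on the two source bitmaps and stores the result under the key result. The result has bits 0 and 1 set, which is why BITCOUNT returns 2, and both bits fit into a single byte, which is why BITOP reports a length of 1.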

  5. Let's try using regular string commands on a bitmap to see what happens:

    dragonfly$> SETBIT my_bitmap 1001 1
    (integer) 1
    dragonfly$> STRLEN my_bitmap
    (integer) 126
    

    SETBIT returns 1 this time because the bit at position 1001 was already set in the first example; on a fresh key it would return 0. As you can see, we can use regular string commands on a bitmap, but a command that is not read-only might lead to unexpected results. Also note the length: since the bit at position 1001 is set, the bitmap must be able to store at least 1002 bits (the index is zero-based). Round 1002 bits up to the nearest multiple of 8 (as each byte stores 8 bits), and we get 1008 bits, which is 126 bytes.

  6. Last but not least, the BITFIELD command allows us to perform multiple bit-level operations in a single command. It treats the bitmap as a sequence of integer fields with a chosen width and signedness, and it can get, set, and increment those fields. It is one of the most versatile commands for working with bitmaps, and you are encouraged to explore its capabilities in the documentation.
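
To make the bit layout behind the STRLEN result of 126 bytes more concrete, here is a minimal pure-Python sketch that models a bitmap as a bytearray. It assumes the Redis-compatible convention that offset 0 is the most significant bit of the first byte. The helper names (set_bit, get_bit, bit_count) are hypothetical, and this is only an illustration of the semantics, not how Dragonfly implements them.

```python
def set_bit(bitmap: bytearray, offset: int, value: int) -> int:
    """Model of SETBIT: set the bit at `offset` and return its previous value."""
    byte_index, bit_index = divmod(offset, 8)
    # Grow the string with zero bytes until `offset` fits, as SETBIT does.
    if byte_index >= len(bitmap):
        bitmap.extend(b"\x00" * (byte_index + 1 - len(bitmap)))
    mask = 0x80 >> bit_index  # offset 0 maps to the most significant bit
    previous = 1 if bitmap[byte_index] & mask else 0
    if value:
        bitmap[byte_index] |= mask
    else:
        bitmap[byte_index] &= ~mask & 0xFF
    return previous


def get_bit(bitmap: bytearray, offset: int) -> int:
    """Model of GETBIT: bits beyond the end of the string read as 0."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(bitmap):
        return 0
    return 1 if bitmap[byte_index] & (0x80 >> bit_index) else 0


def bit_count(bitmap: bytearray) -> int:
    """Model of BITCOUNT over the whole string."""
    return sum(bin(byte).count("1") for byte in bitmap)


my_bitmap = bytearray()
previous = set_bit(my_bitmap, 1001, 1)
print(previous, get_bit(my_bitmap, 1000), get_bit(my_bitmap, 1001))
print(bit_count(my_bitmap), len(my_bitmap))
```

Setting offset 1001 touches byte 125 (1001 // 8), so the model grows to 126 bytes, matching the STRLEN output above.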

Now that we've covered the essential commands for working with bitmaps, let's explore some practical use cases where these bit-level operations can truly shine.


Use Case: Counting Monthly User Retention

Let's consider an example where we have a dataset with 100 million users. We can use a bitmap to track monthly user activity by assigning each user an ID and setting their corresponding bit if they were active that month. Note that in this case we are assuming that each user is represented by a unique integer ID, and the bit position in the bitmap corresponds to the user ID.

For instance, we might have bitmaps for August (monthly_users_2024_08) and September (monthly_users_2024_09). By using the BITCOUNT command, we can quickly count the number of active users in a specific month:

dragonfly$> BITCOUNT monthly_users_2024_08

To see which users were active in both months, we can use the BITOP AND command:

dragonfly$> BITOP AND result monthly_users_2024_08 monthly_users_2024_09

This provides an efficient way to compute retention, identifying users who were active across multiple periods.
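
In application code, the same flow could look roughly like the sketch below. It assumes a redis-py compatible client object (anything exposing setbit, bitop, and bitcount) is passed in. The helper names and the temporary key retained_users_tmp are hypothetical and chosen only for illustration.

```python
def monthly_key(year: int, month: int) -> str:
    """Build a key such as 'monthly_users_2024_08'."""
    return f"monthly_users_{year}_{month:02d}"


def mark_active(client, user_id: int, year: int, month: int) -> None:
    """Set the bit for `user_id` in the bitmap of the given month."""
    client.setbit(monthly_key(year, month), user_id, 1)


def retained_users(client, previous_key: str, current_key: str,
                   dest_key: str = "retained_users_tmp") -> int:
    """Count users active in both months: BITOP AND, then BITCOUNT."""
    client.bitop("AND", dest_key, previous_key, current_key)
    return client.bitcount(dest_key)


def retention_rate(client, previous_key: str, current_key: str) -> float:
    """Share of the previous month's active users who were active again."""
    previous_total = client.bitcount(previous_key)
    if previous_total == 0:
        return 0.0  # avoid dividing by zero when nobody was active
    return retained_users(client, previous_key, current_key) / previous_total
```

With a connected client named df, calling retention_rate(df, monthly_key(2024, 8), monthly_key(2024, 9)) would give the fraction of August's active users who returned in September. The destination key is overwritten on every call, so a real deployment might prefer a per-query key name or an expiry on it.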

However, it's important to take into account the memory usage and command complexity when working with larger bitmaps:

  • Memory Usage: With 100 million users, each bitmap consumes around 12.5 MB of memory (100 million bits divided by 8 bits per byte is 12.5 million bytes). While this may seem small for monthly user tracking, the memory requirements add up quickly if you track users on a weekly, daily, or even hourly basis. For comparison, 12.5 MB would be considered a large value for a single regular string key used for caching.
  • Command Complexity: Both BITCOUNT and BITOP commands operate with a time complexity of O(N), meaning their speed is proportional to the size of the bitmap. Dragonfly is highly optimized and handles large datasets exceptionally fast, so managing big keys and hot keys is generally not a problem. However, for specialized analytics operations like this, it might be worth considering using a smaller, dedicated Dragonfly instance just for data analysis tasks. This separation can help avoid any interference with high-throughput operations on the main instance.
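
The memory estimate can be checked with a few lines of arithmetic. The sketch below uses the 100 million users from the example; the retention windows (12 months, 30 days, 30 days of hourly bitmaps) are hypothetical and only illustrate how the totals grow.

```python
USERS = 100_000_000  # one bit per user ID, as in the example above

bytes_per_bitmap = (USERS + 7) // 8  # round up to whole bytes
mb_per_bitmap = bytes_per_bitmap / 1_000_000  # decimal megabytes

# Hypothetical retention windows: how many bitmaps are kept at once.
windows = {
    "monthly, 12 months": 12,
    "daily, 30 days": 30,
    "hourly, 30 days": 30 * 24,
}

for label, bitmap_count in windows.items():
    total_mb = mb_per_bitmap * bitmap_count
    print(f"{label}: {bitmap_count} bitmaps, about {total_mb:,.1f} MB")
```

Under these assumptions, a year of monthly bitmaps needs about 150 MB, 30 daily bitmaps about 375 MB, and 30 days of hourly bitmaps about 9,000 MB, which is why the tracking granularity matters far more than the size of a single bitmap.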

Use Case: Real-Time Feature Flags with Bitmap

Let's say we're managing global feature flags for an application, where each feature can be toggled on or off for all users. A bitmap provides a memory-efficient way to track whether a feature is globally enabled (1) or disabled (0). In the backend application code, we may use a Python enum class to manage these feature flags programmatically.

For example, let's define a set of features using an enum class in Python:

from enum import Enum
from redis import Redis as Dragonfly

# Connect to Dragonfly with a client SDK.
df = Dragonfly(host='localhost', port=6379)

# The key for storing global features.
GLOBAL_FEATURES = 'global_features'

# Define features using an enum class.
class Features(Enum):
    NEW_DASHBOARD = 0
    DARK_MODE = 1
    BETA_SIGNUP = 2

Each feature corresponds to a bit position in a global bitmap. By using SETBIT, we can enable or disable these features in real time.

To globally enable the NEW_DASHBOARD feature:

# Enable the NEW_DASHBOARD feature.
df.setbit(GLOBAL_FEATURES, Features.NEW_DASHBOARD.value, 1)

To disable the DARK_MODE feature:

# Disable the DARK_MODE feature.
df.setbit(GLOBAL_FEATURES, Features.DARK_MODE.value, 0)

You can check the status of a feature with GETBIT:

# Check if the NEW_DASHBOARD feature is enabled.
enabled = df.getbit(GLOBAL_FEATURES, Features.NEW_DASHBOARD.value)

This setup allows us to manage the application's global feature flags in real time with minimal overhead. A new feature can be added by giving it the next unused bit position in the enum, while the positions of existing features should stay fixed so that the bits already stored keep their meaning. Similar ideas can be applied to user-specific feature flags, where each user has a unique bitmap to track their individual feature preferences.
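
When an application needs every flag at once, one option is to fetch the raw string value with a single GET and decode it locally instead of issuing one GETBIT per feature. The sketch below is one way this might look. It assumes offset 0 is the most significant bit of the first byte (the Redis-compatible layout) and that the client returns raw bytes; decode_features is a hypothetical helper name.

```python
from enum import Enum


class Features(Enum):
    # Same bit positions as the enum defined earlier in this post.
    NEW_DASHBOARD = 0
    DARK_MODE = 1
    BETA_SIGNUP = 2


def decode_features(raw):
    """Return the set of enabled features from the raw bitmap bytes.

    `raw` is the value a GET on the bitmap key would return (None if the
    key does not exist). Offset 0 is the most significant bit of byte 0.
    """
    enabled = set()
    if not raw:
        return enabled
    for feature in Features:
        byte_index, bit_index = divmod(feature.value, 8)
        if byte_index < len(raw) and raw[byte_index] & (0x80 >> bit_index):
            enabled.add(feature)
    return enabled


# 0b10100000: bits 0 and 2 set, i.e. NEW_DASHBOARD and BETA_SIGNUP.
print(decode_features(b"\xa0"))
```

With the client from the snippets above, this could be called as decode_features(df.get(GLOBAL_FEATURES)), provided the client is left at its default of returning bytes rather than decoded strings.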


Conclusion

The bitmap data type in Dragonfly offers powerful and efficient bitwise operations that can handle massive binary flags with ease. Whether you're tracking monthly user retention for millions of users or managing feature flags in a real-time system, bitmap commands enable quick calculations with efficient memory usage. If you haven't already, give Dragonfly a try to experience the next-generation in-memory data store that's built for speed and scalability.
