AWS re:Invent 2024 Top Data Infra Announcements and Sessions

Explore the top data infra announcements from AWS re:Invent 2024, including S3 Tables, Aurora DSQL, SageMaker Lakehouse, and more. Dragonfly aligns with cloud innovations, making sure our cloud offering is the best in class.

December 12, 2024

Introduction

Last week, AWS re:Invent brought together technologists and industry leaders for an event filled with innovation and forward-looking ideas. The expo hall featured a wide array of projects, companies, and organizations, showcasing advancements in AI, cloud, and data technologies. At the same time, AWS introduced a host of new products and features, reflecting its commitment to driving the future of cloud infrastructure.

While it would be impossible to cover everything from such a packed event, this blog focuses on the announcements and updates that stood out in the realm of data infrastructure. Let's dive in!


S3 Tables: Store & Query Tabular Data at Scale

Amazon S3 has long been a reliable backbone for object storage, supporting a wide range of use cases. Over time, innovative databases and streaming solutions such as Neon, Databend, and WarpStream have adopted object stores like S3 as their storage layer.

With the introduction of S3 Tables, AWS is formalizing this trend by optimizing S3 for tabular data and analytics workloads such as daily purchase transactions, sensor data, and advertisement impressions. With support for the Apache Iceberg format and integrations with tools like Amazon Athena and Apache Spark, S3 Tables provides streamlined analytics capabilities.

However, S3 doesn't stop at tabular data optimization. Queryable object metadata is also available in preview, enabling users to run queries directly on S3 object metadata such as object size, storage class, and encryption status.
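To make the idea of querying object metadata a bit more concrete, here is a minimal local sketch. It uses Python's built-in `sqlite3` module as a stand-in for a real query engine, and the table name `object_metadata`, its columns, and the sample rows are all hypothetical, not the actual S3 Metadata schema. It only illustrates the kind of question you could ask once metadata is queryable as a table.

```python
import sqlite3

# Hypothetical sample records: (key, size_bytes, storage_class, encryption).
# Column names and values are illustrative, not the real S3 Metadata schema.
SAMPLE_OBJECTS = [
    ("logs/2024-12-01.gz", 1_200_000, "STANDARD", "SSE-S3"),
    ("logs/2024-12-02.gz", 900_000, "STANDARD", "SSE-S3"),
    ("archive/2023.tar", 50_000_000_000, "GLACIER", "SSE-KMS"),
    ("images/banner.png", 350_000, "STANDARD_IA", "SSE-S3"),
]


def bytes_per_storage_class(records):
    """Return (storage_class, object_count, total_bytes) rows, sorted by class."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE object_metadata ("
        "key TEXT, size_bytes INTEGER, storage_class TEXT, encryption TEXT)"
    )
    conn.executemany("INSERT INTO object_metadata VALUES (?, ?, ?, ?)", records)
    rows = conn.execute(
        "SELECT storage_class, COUNT(*), SUM(size_bytes) "
        "FROM object_metadata GROUP BY storage_class ORDER BY storage_class"
    ).fetchall()
    conn.close()
    return rows


summary = bytes_per_storage_class(SAMPLE_OBJECTS)
for storage_class, count, total in summary:
    print(f"{storage_class}: {count} object(s), {total} bytes")
```

The same shape of query (filter or aggregate on size, storage class, or encryption status) is what the preview feature is described as enabling, just against metadata maintained by S3 rather than a local table.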


Aurora DSQL: Serverless Distributed SQL Database

Amazon Aurora DSQL is a serverless, PostgreSQL-compatible distributed SQL database that promises to redefine transactional database management. With active-active writes across multiple regions and an innovative disaggregated architecture, it targets high availability (99.99% in a single region and 99.999% in multi-region configurations) while maintaining strong ACID guarantees. Its serverless design eliminates operational overhead, making it easier for developers to focus on application logic without worrying about database maintenance.
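For a sense of what those availability figures mean in practice, the short calculation below converts each percentage into the maximum downtime it allows per year. This is plain arithmetic on the numbers quoted above, assuming a 365-day year; the function name is ours, and nothing here reflects how AWS measures or enforces availability.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes(availability_percent: float) -> float:
    """Maximum yearly downtime (in minutes) implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


single_region = max_downtime_minutes(99.99)
multi_region = max_downtime_minutes(99.999)

print(f"99.99%  -> about {single_region:.1f} minutes of downtime per year")
print(f"99.999% -> about {multi_region:.2f} minutes of downtime per year")
```

In other words, the extra "nine" in the multi-region figure shrinks the yearly downtime budget from under an hour to a little over five minutes.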

This launch positions Aurora DSQL as a competitor to distributed SQL solutions like Google Cloud Spanner, TiDB, CockroachDB, and Neon. However, it's worth noting that as a just-launched offering, DSQL still lacks some familiar PostgreSQL features. Even so, the introduction of Aurora DSQL adds healthy competition to the distributed SQL database landscape, which not only drives innovation among peers but also encourages the ecosystem as a whole to evolve and improve. Competition like this ultimately benefits everyone: developers, businesses, and end users alike.


SageMaker Lakehouse: Unified Analytics & AI/ML

Amazon SageMaker Lakehouse simplifies data management by unifying data across S3 data lakes and Redshift data warehouses. It enables in-place queries using engines compatible with Apache Iceberg, eliminating silos and reducing the need for complex pipelines and duplicate data.

With zero-ETL integration for services like Aurora, RDS, and DynamoDB, along with centralized fine-grained permissions, SageMaker Lakehouse streamlines workflows and lowers costs. As part of the SageMaker ecosystem, it empowers teams to make faster, data-driven decisions while accelerating AI/ML development.


Bedrock Knowledge Bases: Data Processing with GenAI

Amazon Bedrock Knowledge Bases now enables users to query structured data conversationally by translating natural language into SQL. Capabilities like this not only lower barriers for non-technical users but also make complex datasets more accessible for everyone. Text-to-SQL is not the only capability introduced: combined with enhancements like multimodal data processing and GraphRAG, Bedrock simplifies data processing in many ways.


Trainium2 Instances: Pushing AI/ML Boundaries

Amazon EC2 trn2 instances, powered by the second-generation AWS Trainium chips (AWS Trainium2), deliver exceptional performance for AI/ML training and inference. They offer up to 4x faster processing, 4x more memory bandwidth, and 3x more memory capacity than their predecessors. Each instance features 16 Trainium2 chips, 192 vCPUs, 2 TiB of memory, and 3.2 Tbps of Elastic Fabric Adapter (EFA) bandwidth for optimized training and real-time inference. The trn2 UltraServers take scalability further by integrating 64 Trainium2 chips with high-bandwidth NeuronLink interconnects, enabling efficient training and inference for trillion-parameter models and beyond.
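As a quick back-of-the-envelope check on those specs, the sketch below divides the per-instance figures by the chip count and compares the UltraServer chip count to a single instance. It is simple arithmetic on the numbers quoted above; the even per-chip split is our simplifying assumption for illustration, not a statement about how the hardware actually allocates resources.

```python
# Figures quoted in the paragraph above.
CHIPS_PER_INSTANCE = 16
VCPUS_PER_INSTANCE = 192
MEMORY_GIB_PER_INSTANCE = 2 * 1024  # 2 TiB expressed in GiB
EFA_GBPS_PER_INSTANCE = 3200        # 3.2 Tbps expressed in Gbps
CHIPS_PER_ULTRASERVER = 64

# Assumption: resources are split evenly across chips (illustration only).
vcpus_per_chip = VCPUS_PER_INSTANCE // CHIPS_PER_INSTANCE
memory_gib_per_chip = MEMORY_GIB_PER_INSTANCE // CHIPS_PER_INSTANCE
efa_gbps_per_chip = EFA_GBPS_PER_INSTANCE // CHIPS_PER_INSTANCE

# How many instances' worth of chips one UltraServer integrates.
instance_equivalents = CHIPS_PER_ULTRASERVER // CHIPS_PER_INSTANCE

print(f"vCPUs per chip:       {vcpus_per_chip}")
print(f"Memory per chip:      {memory_gib_per_chip} GiB")
print(f"EFA bandwidth/chip:   {efa_gbps_per_chip} Gbps")
print(f"UltraServer capacity: {instance_equivalents}x the chips of one instance")
```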

Supported by tools like the AWS Neuron SDK, PyTorch, and JAX, these Trainium2-based instances and servers simplify adoption and optimization for advanced AI/ML workflows. Whether training frontier foundation models or delivering low-latency inference, Trainium2 instances set a high standard for machine learning compute.


Conclusion

The cloud landscape continues to evolve rapidly, and it's exciting to see so many innovations unveiled at AWS re:Invent 2024! For a broader take on the top announcements across all sectors, check them out here.

At Dragonfly, we are proud of our innovative in-memory data store, built on groundbreaking architecture with features like multi-threading, an efficient snapshotting algorithm, and advanced data structures. These features make Dragonfly the most performant in-memory data store on the planet. However, some aspects, such as hardware and physical networking, remain outside the scope of the Dragonfly open-source project.

As a team, we actively follow innovations from major cloud providers like AWS, exploring how advancements in areas like S3, networking, and AI can enhance Dragonfly Cloud. This ensures that we not only maintain the extreme performance, competitiveness, and robustness of Dragonfly, the core software, but also push the boundaries of what we can offer to our users in a modern cloud-native environment.
