AWS Storage Blog
Tag: AWS Cloud Storage
Adapting to change with data patterns on AWS: The “extend” cloud data pattern
As part of my re:Invent 2024 Innovation Talk, I shared three data patterns that many of our largest AWS customers have adopted. This article focuses on “Extend,” which is an emerging data pattern. You can also watch this four-minute video clip on the Extend data pattern. Many companies find great success with the […]
Adapting to change with data patterns on AWS: The “aggregate” cloud data pattern
As part of my re:Invent 2024 Innovation Talk, I shared three data patterns that many of our largest AWS customers have adopted. This article focuses on the “Aggregate” cloud data pattern, which is the most commonly adopted across AWS customers. You can also watch this six-minute video clip on the Aggregate data pattern for a […]
Adapting to change with data patterns on AWS: The “curate” cloud data pattern
As part of my re:Invent 2024 Innovation Talk, I shared three data patterns that many of our largest AWS customers have adopted. This article focuses on the “Curate” data pattern, which we have seen more AWS customers adopt in the last 12 to 18 months as they look to leverage data sets for both analytics and AI […]
Enhance logs for AWS Elastic Disaster Recovery with CloudWatch Log Insights
Operational teams play a crucial role in ensuring the readiness and reliability of a disaster recovery (DR) solution. When these teams don’t have direct access to monitor the resources and services that make up a solution, it can create significant challenges. Logs provide insights into system behaviors, performance, and potential anomalies. When operations […]
Adapting to change with data patterns on AWS: Aggregate, curate, and extend
At AWS re:Invent, I give an Innovation Talk on the emerging data trends that shape the direction of cloud data strategies. Last year, I talked about Putting Your Data to Work with Generative AI, which covered not only how data is used with foundation models, but also how businesses should think about storing and classifying […]
Enhance business continuity within an Availability Zone using AWS Elastic Disaster Recovery
At Amazon Web Services (AWS), we recommend running workloads across multiple Availability Zones (AZs) for high availability and fault tolerance. However, there are certain situations where users need to run their workloads in a single AZ. These include legacy or commercial off-the-shelf (COTS) applications that don’t support deployments across multiple AZs, workloads that […]
Analyzing Amazon S3 Metadata with Amazon Athena and Amazon QuickSight
Object storage provides virtually unlimited scalability, but managing billions, or even trillions, of objects can pose significant challenges. How do you know what data you have? How can you find the right datasets at the right time? By implementing a robust metadata management strategy, you can answer these questions, gain better control over massive data […]
Build a managed transactional data lake with Amazon S3 Tables
UPDATE (12/19/2024): Added guidance for Amazon EMR setup. Customers commonly use Apache Iceberg today to manage ever-growing volumes of data. Apache Iceberg’s relational database transaction capabilities (ACID transactions) help customers deal with frequent updates, deletions, and the need for transactional consistency across datasets. However, getting the most out of Apache Iceberg tables and running it […]
Uncover new performance insights using Amazon EBS detailed performance statistics
As businesses increasingly rely on latency-sensitive applications for mission-critical workloads, the need to understand performance across the entire technology stack is essential to swiftly resolve performance bottlenecks that could affect application efficiency. Given that storage performance and stability directly impact application efficiency, reliability, scalability, and user experience, it is paramount for organizations to have the […]
How Amazon S3 Tables use compaction to improve query performance by up to 3 times
Today, businesses managing petabytes of data must optimize storage and processing to drive timely insights while remaining cost-effective. Customers often choose Apache Parquet for improved storage and query performance. Additionally, customers use Apache Iceberg to organize Parquet datasets to take advantage of its database-like features such as schema evolution, time travel, and ACID transactions. Customers […]