Amazon Simple Storage Service (Amazon S3)

Adapting to change with data patterns on AWS: The “curate” cloud data pattern

As part of my re:Invent 2024 Innovation Talk, I shared three data patterns that many of our largest AWS customers have adopted. This article focuses on the “Curate” data pattern, which we have seen more AWS customers adopt in the last 12 to 18 months as they look to leverage data sets for both analytics and AI […]

Adapting to change with data patterns on AWS: Aggregate, curate, and extend

At AWS re:Invent, I give an Innovation Talk on the emerging data trends that shape the direction of cloud data strategies. Last year, I talked about Putting Your Data to Work with Generative AI, which covered not only how data is used with foundation models, but also how businesses should think about storing and classifying […]

Analyzing Amazon S3 Metadata with Amazon Athena and Amazon QuickSight

UPDATE (1/27/2025): Amazon S3 Metadata is generally available. Object storage provides virtually unlimited scalability, but managing billions, or even trillions, of objects can pose significant challenges. How do you know what data you have? How can you find the right datasets at the right time? By implementing a robust metadata management strategy, you can answer these […]

Build a managed transactional data lake with Amazon S3 Tables

UPDATE (12/19/2024): Added guidance for Amazon EMR setup. Customers commonly use Apache Iceberg today to manage ever-growing volumes of data. Apache Iceberg’s relational database transaction capabilities (ACID transactions) help customers deal with frequent updates, deletions, and the need for transactional consistency across datasets. However, getting the most out of Apache Iceberg tables and running it […]

How Amazon S3 Tables use compaction to improve query performance by up to 3 times

Today, businesses managing petabytes of data must optimize storage and processing to drive timely insights while remaining cost-effective. Customers often choose Apache Parquet for improved storage and query performance. Additionally, customers use Apache Iceberg to organize Parquet datasets to take advantage of its database-like features such as schema evolution, time travel, and ACID transactions. Customers […]

Manage costs for replicated delete markers in a disaster recovery setup on Amazon S3

Many businesses recognize the critical importance of safeguarding their essential data from potential disasters such as fires, floods, or ransomware events. Designing an effective disaster recovery (DR) strategy includes thoughtfully evaluating and selecting cost-effective solutions that fulfill compliance requirements. By using Amazon S3 features such as S3 object tags, S3 Versioning, and S3 Lifecycle, you can […]

Fundrise uses Amazon S3 Express One Zone to accelerate investment data processing

Fundrise is a financial technology company that brings alternative investments directly to individual investors. With more than 2 million users, Fundrise is one of the leading platforms of its kind in the United States. The challenge of providing a smooth, secure, and transparent experience for millions of users is largely unprecedented in the alternative investment […]

How Amazon Ads uses Iceberg optimizations to accelerate their Spark workload on Amazon S3

In today’s data-driven business landscape, organizations are increasingly relying on massive data lakes to store, process, and analyze vast amounts of information. However, as these data repositories grow to petabyte scale, a key challenge for businesses is implementing transactional capabilities on their data lakes efficiently. The sheer volume of data requires immense computational power and […]

How Delhivery migrated 500 TB of data across AWS Regions using Amazon S3 Replication

Delhivery is one of the largest third-party logistics providers in India. It fulfills millions of packages every day, serving over 18,000 pin codes in India through more than 20 automated sort centers, 90 warehouses, and over 2,800 delivery centers. Data is at the core of Delhivery’s business. In anticipation of potential regulatory […]

Migrate data from Dropbox to Amazon S3 using Rclone

Whether you choose to operate entirely on AWS or in multicloud and hybrid environments, one of the primary reasons to adopt AWS is the broad choice of services we offer, enabling you to explore, build, deploy, and monitor your workloads. Amazon S3 is a great option for Dropbox users seeking a comprehensive storage solution. Amazon […]

AWS Storage Blog

Tag: Amazon Simple Storage Service (Amazon S3)