AWS Storage Blog
Category: AWS re:Invent
Adapting to change with data patterns on AWS: The “extend” cloud data pattern
As part of my re:Invent 2024 Innovation Talk, I shared three data patterns that many of our largest AWS customers have adopted. This article focuses on “Extend” which is an emerging data pattern. You can also watch this four-minute video clip on the Extend data pattern if interested. Many companies find great success with the […]
Adapting to change with data patterns on AWS: The “aggregate” cloud data pattern
As part of my re:Invent 2024 Innovation talk, I shared three data patterns that many of our largest AWS customers have adopted. This article focuses on the “Aggregate” cloud data pattern, which is the most commonly adopted across AWS customers. You can also watch this six-minute video clip on the Aggregate data pattern for a […]
Adapting to change with data patterns on AWS: The “curate” cloud data pattern
As part of my re:Invent 2024 Innovation talk, I shared three data patterns that many of our largest AWS customers have adopted. This article focuses on the “Curate” data pattern, which we have seen more AWS customers adopt in the last 12-18 months as they look to leverage data sets for both analytics and AI […]
Adapting to change with data patterns on AWS: Aggregate, curate, and extend
At AWS re:Invent, I do an Innovation Talk on the emerging data trends that shape the direction of cloud data strategies. Last year, I talked about Putting Your Data to Work with Generative AI, which not only covered how data is used with foundation models, but also how businesses should think about storing and classifying […]
How Amazon S3 Tables use compaction to improve query performance by up to 3 times
Today businesses managing petabytes of data must optimize storage and processing to drive timely insights while being cost-effective. Customers often choose Apache Parquet for improved storage and query performance. Additionally, customers use Apache Iceberg to organize Parquet datasets to take advantage of its database-like features such as schema evolution, time travel, and ACID transactions. Customers […]
Amazon S3 Express One Zone delivers cost and performance gains for ChaosSearch customers
ChaosSearch is an Amazon S3-native database built on a serverless, stateless compute architecture within AWS that delivers live search, SQL, and Generative AI analytics. At ChaosSearch, the speed and performance of our architecture is important to us and our customers because time to results is the difference between success and failure, and we rely on […]
Akridata accelerates processing of unstructured data with Amazon S3 Express One Zone
Deep learning processes often need to read full datasets, which are usually hundreds of gigabytes in size, before they can perform intelligent data processing. High data retrieval speed and low latency from storage are crucial for enterprises running these performance-critical workloads. Akridata, an AWS independent software vendor (ISV) partner, helps make artificial intelligence (AI)-assisted unstructured-data […]
lakeFS and Amazon S3 Express One Zone: Highly performant data version control for ML/AI
Machine learning presents a number of new challenges to data teams, calling for technology solutions that can support training and fine-tuning performance-critical workloads with high performance. Data version control is one of the facets of high-performing ML pipelines, as it allows efficient experimentation and full ML pipeline reproducibility at scale. lakeFS by Treeverse, an AWS […]
ClickHouse Cloud & Amazon S3 Express One Zone: Making a blazing fast analytical database even faster
ClickHouse is a columnar database management system (DBMS) designed for blazing-fast real-time analytics. It was built to address the needs of interactive analytical applications requiring up-to-the-second analytics. To do that, it must support real-time data ingestion at the rate of hundreds of millions of events per second and run complex analytical queries, such as filtering, […]
Streamline data sharing and access control with Informatica Cloud Data Marketplace and Amazon S3 Access Grants
Organizations are modernizing their data lakes on Amazon Simple Storage Service (Amazon S3) to handle the ever-growing data volume and speed while meeting the demands of analytics, machine learning (ML), artificial intelligence (AI), and generative AI applications. To enable a data-driven culture and remain innovative, the data platform must allow for data-centric collaboration across business […]