AWS Storage Blog
Tag: Amazon Simple Storage Service (Amazon S3)
How Bridgewater maintains data consistency across Regions using Amazon S3 Replication
Bridgewater Associates is a global macro investment manager, with a core mission of understanding how the world’s markets and economies work by analyzing the drivers of markets and turning that understanding into high-quality portfolios and investment advice for their clients. The data that drives this economic research is stored in Bridgewater’s data lake, built on […]
Streamlining access to tabular datasets stored in Amazon S3 Tables with DuckDB
As businesses continue to rely on data-driven decision-making, there’s an increasing demand for tools that streamline and accelerate the process of data analysis. Efficiency and simplicity in application architecture can serve as a competitive edge when driving high-stakes decisions. Developers are seeking lightweight, flexible tools that seamlessly integrate with their existing application stack, specifically solutions […]
Connect Snowflake to S3 Tables using the SageMaker Lakehouse Iceberg REST endpoint
Organizations today seek data analytics solutions that provide maximum flexibility and accessibility. Customers need their data to be readily available using their preferred query engines, and break down barriers across different computing environments. At the same time, they want a single copy of data to be used across these solutions, to track lineage, be cost […]
Build a managed Apache Iceberg data lake using Starburst and Amazon S3 Tables
Managing large-scale data analytics across diverse data sources has long been a challenge for enterprises. Data teams often struggle with complex data lake configurations, performance bottlenecks, and the need to maintain consistent data governance while enabling broad access to analytics capabilities. Today, Starburst announces a powerful solution to these challenges by extending their Apache Iceberg […]
Build a data lake for streaming data with Amazon S3 Tables and Amazon Data Firehose
Businesses are increasingly adopting real-time data processing to stay ahead of user expectations and market changes. Industries such as retail, finance, manufacturing, and smart cities are using streaming data for everything from optimizing supply chains to detecting fraud and improving urban planning. The ability to use data as it is generated has become a critical […]
Optimizing Amazon FSx for Lustre storage consumption using automatic data tiering with Amazon S3
Managing high-performance file storage can be a significant operational and cost challenge for many organizations, especially those running compute-intensive workloads such as high-performance computing (HPC) or data analytics. This is particularly true for organizations with existing data lakes on Amazon S3 who need POSIX-compliant, high-performance file system access. Amazon FSx for Lustre provides a scalable, […]
Integrating custom metadata with Amazon S3 Metadata
Organizations of all sizes face a common challenge: efficiently managing, organizing, and retrieving vast amounts of digital content. From images and videos to documents and application data, businesses are inundated with information that needs to be stored securely, accessed quickly, and analyzed effectively. The ability to extract, manage, and use metadata from this content is […]
Design patterns for multi-tenant access control on Amazon S3
Large organizations and software as a service (SaaS) platforms often share storage resources across multiple users, groups, or tenants. The design pattern chosen to implement this shared storage can significantly impact how access permissions are managed at scale. This decision is key because it directly affects platforms’ security and ease of scale. A well thought […]
Optimizing data transfers for high throughput life science instruments using AWS DataSync
Healthcare and life sciences (HCLS) customers are generating more data than ever as they integrate the use of omics data with applications in drug discovery, clinical development, molecular diagnostics, and population health. The rate and volume of data that HCLS laboratories generate are a reflection of their lab instrumentation and day-to-day lab operations. Efficiently moving […]
Archiving relational databases to Amazon S3 Glacier storage classes for cost optimization
Many customers are growing their data footprints rapidly, with significantly more data stored in their relational database management systems (RDBMS) than ever before. Additionally, organizations subject to data compliance including the Health Insurance Portability and Accountability Act (HIPAA), the Payment Card Industry Data Security Standard (PCI-DSS) and General Data Protection Regulation (GDPR) are often required […]