AWS Storage Blog
Category: AWS CLI
Scalable cross-cloud data migration to Amazon S3 with distributed rclone
Migrating petabytes of data across cloud providers is one of the most operationally demanding tasks an organization can take on. At this scale, simple transfer approaches break down. Teams lose track of what has been copied and what has failed. Transfers stall and require constant manual intervention to restart. In some cases, teams need to […]
Implement single-exchange tokens for short-lived Amazon S3 presigned URLs with Terraform
Organizations across industries use signed URLs to grant temporary, credential-less access to private resources such as receipts, medical or financial records, legal files, or confidential reports. However, signed URLs can be reused by anyone until they expire, creating security risks if a URL is shared or inadvertently disclosed. This risk can be mitigated by vending […]
Building automated AWS Regional availability checks with Amazon S3
Every day, organizations expand into new markets, migrate critical workloads across geographies, and build systems that need to operate reliably in multiple locations. At the root of these efforts is a simple question: “What can I deploy, and where?” The answer shapes important architecture decisions, from which AWS Regions to expand into, to how you […]
Applying Amazon S3 Object Lock at scale for petabytes of existing data
Organizations with petabytes of data in the cloud need a way to apply immutable storage protections to data that’s already been stored—whether for regulatory compliance or cyber resilience. Although you can enable write-once-read-many (WORM) controls for newly created storage, applying these protections to existing enterprise data at scale requires a systematic approach. Regulated industries have […]
Build intelligent ETL pipelines using AWS Model Context Protocol and Amazon Q
Data scientists and engineers spend hours writing complex data pipelines to extract, transform, and load (ETL) data from various sources into their data lakes for data integration and creating unified data models to build business insights. The process involves understanding the source and target systems, discovering schemas, mapping source and target, writing and testing ETL […]
Derive intelligent storage insights using S3 Metadata and Model Context Protocol (MCP)
Organizations face mounting challenges in managing and operationalizing their ever-growing data assets for machine learning and analytics workflows. When dealing with billions and trillions of objects, teams struggle to find what data they have and how to efficiently find specific datasets. Without proper data discovery and metadata management, teams spend valuable time searching for relevant […]
Accelerating Amazon S3 Batch Operations at scale with on-demand manifest generation
Modern enterprises routinely manage billions of objects across their cloud storage environments, needing efficient bulk operations for disaster recovery, compliance management, data transfer, and cost optimization. Performing these operations manually or through custom scripts becomes impractical at scale, often creating operational bottlenecks when time-sensitive actions are necessary. Organizations frequently need to identify and process specific […]
Faster threat detection at scale: Real-time cybersecurity graph analytics with PuppyGraph and Amazon S3 Tables
Modern cybersecurity teams are facing unprecedented challenges in data analysis by the scale, complexity, and velocity of data. Cloud environments continuously generate massive amounts information in form of access logs, configuration changes, alerts, and telemetry. Traditional analysis methods of looking at these data points in isolation can’t effectively detect threats such as lateral movement and […]
Query Amazon S3 Tables from open source Trino using Apache Iceberg REST endpoint
Organizations are increasingly focused on addressing the growing challenge of managing and analyzing vast data volumes, while making sure that their data teams have timely access to this data to enable rapid insights and decision-making. Data analysts and scientists need self-service analytics capabilities to build and maintain data products, often involving complex transformations and frequent […]
Configuring notifications to monitor AWS Backup jobs
Your data and applications are some of the most valuable assets in your company. The ability to protect your data and IT infrastructure against cyber-attacks, accidental deletion, natural disasters, and several other threats is essential to secure your business. An important aspect to protecting your assets is the ability to monitor backup, restore, and copy […]






