AWS Storage Blog

Large scale migration of encrypted objects in Amazon S3 using S3 Batch Operations

Many organizations have data governance strategies or compliance requirements that mandate their data be replicated and redundant across different management accounts and global regions. Moving encrypted data at scale can often take a few additional steps due to the need to decrypt and re-encrypt objects as part of the replication process.

Amazon Simple Storage Service (Amazon S3) offers three options for server-side encryption: Amazon S3 managed server-side encryption (SSE-S3), server-side encryption with AWS KMS keys (SSE-KMS), and server-side encryption with customer provided keys (SSE-C). As of a recent update, SSE-S3 is applied automatically to all new objects as the default if you haven’t chosen another encryption method. You can perform large-scale Amazon S3 operations using Amazon S3 Batch Operations, including migrating or replicating your encrypted data to different accounts.

In this post, we walk through migrating new and existing S3 objects encrypted with SSE-KMS keys when the source and destination S3 buckets are owned by different AWS accounts in the same AWS Region. We accomplish this with S3 Batch Operations, which lets you perform large-scale batch operations on S3 objects. You can use the solution in this post to minimize latency by maintaining copies of your data in AWS Regions geographically closer to your users, to meet compliance and data sovereignty requirements, and to create additional disaster recovery resiliency.

Solution overview

Amazon S3 Batch Replication, through a Batch Operations job, provides a method for replicating objects that existed before a replication configuration was in place, objects that you have previously replicated, and objects that have failed replication. This solution helps you accomplish cross-account Amazon S3 Batch Replication.

  • You will configure an Amazon S3 Replication rule that enables automatic, asynchronous copying of new encrypted S3 objects in your source S3 bucket in AWS account A to a destination S3 bucket in AWS account B.
  • You will use Amazon S3 Batch Replication to replicate existing encrypted S3 objects in your source S3 bucket in AWS account A to a destination S3 bucket in AWS account B.

The following diagram illustrates the solution overview.

Prerequisites

For this walk-through, you need the following:

Solution walkthrough

The high-level steps are as follows, with a more in-depth walkthrough after them:

In Account A

  1. Create IAM role cross_account_replication and update the role with SSE-KMS key (a) and SSE-KMS key (b) specific permissions.
  2. Create IAM role s3_batch_operations.

In Account B

  3. Update S3 bucket policy with permissions for Account A cross_account_replication IAM role.
  4. Create a KMS key (b) and update the key policy with permissions for Account A cross_account_replication IAM role.

In Account A

Once Steps 1 through 4 are completed, navigate to Account A and

  5. Create Amazon S3 Replication rule in the source S3 bucket by selecting ‘Replicate objects encrypted with AWS KMS’.
  6. Create an Amazon S3 Batch Operations job to replicate existing encrypted objects.

In AWS account A

  1. Creating IAM role cross_account_replication

To replicate existing S3 objects that are encrypted using AWS KMS, you must grant additional permissions to the IAM role that you specify in the replication configuration.

You can configure your bucket to use an S3 Bucket Key, which decreases the request traffic from Amazon S3 to AWS KMS and reduces the cost of SSE-KMS. We recommend that you enable S3 Bucket Keys in the source and destination buckets before the replication is initiated. The savings are greatest if S3 Bucket Keys are enabled before the initial PUT operation.

  • When an S3 Bucket Key is enabled for the source and destination bucket, the encryption context will be the bucket Amazon Resource Name (ARN) and not the object ARN, for example, arn:aws:s3:::bucket-name. You must update your IAM policies to use the bucket ARN for the encryption context. The following example shows the encryption context with the S3 bucket ARN.
"kms:EncryptionContext:aws:s3:arn": [
"arn:aws:s3:::bucket-name"
]
  • If an S3 Bucket Key is only enabled on the destination bucket and not the source bucket, then you must use the object ARN, for example arn:aws:s3:::bucket-name/*. The following example shows the encryption context with the S3 object ARN.
"kms:EncryptionContext:aws:s3:arn": [
"arn:aws:s3:::bucket-name/*"
]
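
The two cases above can be sketched as a small helper. This is only an illustration of the rule as described in this post: the function name and parameters are hypothetical, and the combination the post does not discuss (S3 Bucket Key on the source only) is rejected rather than guessed.

```python
def encryption_context_arn(bucket_name: str,
                           source_bucket_key_enabled: bool,
                           destination_bucket_key_enabled: bool) -> str:
    """Return the value to use for kms:EncryptionContext:aws:s3:arn.

    Hypothetical helper that encodes only the cases described in this post.
    """
    if source_bucket_key_enabled and destination_bucket_key_enabled:
        # S3 Bucket Keys on both buckets: the context is the bucket ARN.
        return f"arn:aws:s3:::{bucket_name}"
    if not source_bucket_key_enabled:
        # No S3 Bucket Key on the source: the context is the object ARN pattern.
        return f"arn:aws:s3:::{bucket_name}/*"
    # Source-only S3 Bucket Key is not covered by this post.
    raise ValueError("combination not covered; check the S3 documentation")


print(encryption_context_arn("bucket-name", True, True))
print(encryption_context_arn("bucket-name", False, True))
```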

Now, let’s create an IAM policy using the following template and attach it to an IAM role cross_account_replication in the source account. In this particular use case, S3 Bucket Keys are not enabled, so we use the S3 object ARN for the encryption context. Update the placeholder values (bucket names, Regions, account IDs, and key IDs) to suit your needs.

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":[
            "s3:GetReplicationConfiguration",
            "s3:ListBucket"
         ],
         "Resource":[
            "arn:aws:s3:::source-bucket-name"
         ]
      },
      {
         "Effect":"Allow",
         "Action":[
            "s3:GetObjectVersionForReplication",
            "s3:GetObjectVersionAcl"
         ],
         "Resource":[
            "arn:aws:s3:::source-bucket-name/*"
         ]
      },
      {
         "Effect":"Allow",
         "Action":[
            "s3:ReplicateObject",
            "s3:ReplicateDelete"
         ],
         "Resource":"arn:aws:s3:::destination-bucket-name/*"
      },
      {
         "Action":[
            "kms:Decrypt"
         ],
         "Effect":"Allow",
         "Condition":{
            "StringLike":{
               "kms:ViaService":"s3.source-bucket-region.amazonaws.com",
               "kms:EncryptionContext:aws:s3:arn":[
                  "arn:aws:s3:::source-bucket-name/*"
               ]
            }
         },
         "Resource":[
           "arn:aws:kms:us-east-1:123456789101:key/id(a)from source account"
         ]
      },
      {
         "Action":[
            "kms:Encrypt"
         ],
         "Effect":"Allow",
         "Condition":{
            "StringLike":{
               "kms:ViaService":"s3.destination-bucket-region.amazonaws.com",
               "kms:EncryptionContext:aws:s3:arn":[
                  "arn:aws:s3:::destination-bucket-name/*"
               ]
            }
         },
         "Resource":[
            "arn:aws:kms:us-east-1:123456789102:key/id(b)from destination account"
         ]
      }
   ]
}
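
If you prefer to generate the two AWS KMS statements rather than edit them by hand, the following is a minimal sketch. The helper name, account IDs, and key IDs are hypothetical placeholders, and the sketch assumes the object ARN encryption context used in this walkthrough.

```python
import json


def kms_statement(action: str, region: str, key_arn: str, bucket_name: str) -> dict:
    """Build one KMS statement limited to S3 in one Region and one bucket's objects."""
    return {
        "Effect": "Allow",
        "Action": [action],
        "Condition": {
            "StringLike": {
                "kms:ViaService": f"s3.{region}.amazonaws.com",
                # Object ARN pattern, because S3 Bucket Keys are not enabled here.
                "kms:EncryptionContext:aws:s3:arn": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        },
        "Resource": [key_arn],
    }


# Hypothetical values; replace them with your own key ARNs and bucket names.
SOURCE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/SOURCE-KEY-ID"
DEST_KEY_ARN = "arn:aws:kms:us-east-1:444455556666:key/DESTINATION-KEY-ID"

kms_statements = [
    kms_statement("kms:Decrypt", "us-east-1", SOURCE_KEY_ARN, "source-bucket-name"),
    kms_statement("kms:Encrypt", "us-east-1", DEST_KEY_ARN, "destination-bucket-name"),
]
print(json.dumps(kms_statements, indent=3))
```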

  2. Creating IAM role s3_batch_operations

Now, let’s create an IAM policy to replicate existing objects and attach it to an IAM role s3_batch_operations in the source account. Update the placeholder bucket names to suit your needs.

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Action":[
            "s3:InitiateReplication"
         ],
         "Effect":"Allow",
         "Resource":[
            "arn:aws:s3:::replication-source-bucket-name/*"
         ]
      },
      {
         "Action":[
            "s3:GetReplicationConfiguration",
            "s3:PutInventoryConfiguration"
         ],
         "Effect":"Allow",
         "Resource":[
            "arn:aws:s3:::replication-source-bucket-name"
         ]
      },
      {
         "Effect":"Allow",
         "Action":[
            "s3:PutObject"
         ],
         "Resource":[
            "arn:aws:s3:::completionreport-bucket-name/*"
         ]
      }
   ]
}
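
The permissions policies above do not show the trust policies that both roles also need so that the services can assume them. The following sketch builds them. To our knowledge, S3 Replication assumes its role through the s3.amazonaws.com service principal and S3 Batch Operations through batchoperations.s3.amazonaws.com; treat both as assumptions to confirm against the current Amazon S3 documentation. The helper name is hypothetical.

```python
import json


def trust_policy(*service_principals: str) -> dict:
    """Build a trust policy that lets the given service principals assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": list(service_principals)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Assumed service principals; verify them in the S3 documentation.
replication_trust = trust_policy("s3.amazonaws.com")
batch_operations_trust = trust_policy("batchoperations.s3.amazonaws.com")

print(json.dumps(replication_trust, indent=3))
print(json.dumps(batch_operations_trust, indent=3))
```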

In AWS account B

  3. Updating S3 bucket policy

Now, add the following bucket policy on the destination S3 bucket for the Account A cross_account_replication IAM role. Update the placeholder role ARN and bucket name to suit your needs.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Permissions on objects",
            "Effect": "Allow",
            "Principal": {
                "AWS": "ARN of cross_account_replication IAM role from source account"
            },
            "Action": [
                "s3:ReplicateDelete",
                "s3:ReplicateObject"
            ],
            "Resource": "arn:aws:s3:::destination-bucket-name/*"
        },
        {
            "Sid": "Permissions on bucket",
            "Effect": "Allow",
            "Principal": {
                "AWS": "ARN of cross_account_replication IAM role from source account"
            },
            "Action": [
                "s3:List*",
                "s3:GetBucketVersioning",
                "s3:PutBucketVersioning"
            ],
            "Resource": "arn:aws:s3:::destination-bucket-name"
        }
    ]
}

  4. Updating the KMS key policy

In our use case, the destination bucket is in a different AWS account. Therefore, we must use an AWS KMS customer managed key that is owned by the destination account. Otherwise, the destination account can’t access the objects in the destination bucket. To use a KMS key that belongs to the destination account to encrypt the destination objects, the destination account must update the key policy with permissions for the replication role in Account A. Now, let’s add the following statement to the key policy of KMS key (b) to grant permissions to the Account A cross_account_replication IAM role.

{
    "Sid": "S3ReplicationSourceRoleToUseTheKey",
    "Effect": "Allow",
    "Principal": {
        "AWS": "ARN of cross_account_replication IAM role from source account"
    },
    "Action": ["kms:GenerateDataKey", "kms:Encrypt"],
    "Resource": "*"
}
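
Because this statement has to be added to a key policy that already contains other statements, it can help to merge it programmatically instead of pasting it in. The following is a minimal sketch; the helper name, the existing policy, and the account IDs are hypothetical, and it only manipulates the policy document locally.

```python
import copy
import json


def upsert_statement(policy: dict, statement: dict) -> dict:
    """Return a copy of policy with statement added, replacing any entry with the same Sid."""
    updated = copy.deepcopy(policy)
    kept = [s for s in updated.get("Statement", []) if s.get("Sid") != statement["Sid"]]
    kept.append(statement)
    updated["Statement"] = kept
    return updated


# Hypothetical existing key policy in the destination account.
existing_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::444455556666:root"},
            "Action": "kms:*",
            "Resource": "*",
        }
    ],
}

replication_statement = {
    "Sid": "S3ReplicationSourceRoleToUseTheKey",
    "Effect": "Allow",
    # Hypothetical role ARN in the source account.
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/cross_account_replication"},
    "Action": ["kms:GenerateDataKey", "kms:Encrypt"],
    "Resource": "*",
}

new_policy = upsert_statement(existing_policy, replication_statement)
print(json.dumps(new_policy, indent=4))
```

Running the helper twice with the same Sid leaves a single copy of the statement, so the merge is safe to repeat.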

To edit the key policy, navigate to the AWS KMS console. From the list of customer managed keys, choose the alias or key ID of the KMS key that you want to update. Select the Key policy tab and then edit the policy.

The following screenshot illustrates how to edit a KMS key policy.

In AWS account A

  5. Creating S3 replication rule

Amazon S3 Replication is a managed, low cost, elastic solution for replicating objects from one S3 bucket to another. You can use the Amazon S3 API or the S3 console to create a replication configuration rule.

Sign in to the AWS Management Console and open the Amazon S3 console. From the list of S3 buckets, select the source bucket. Choose the Management tab, scroll down to Replication rules, and then choose Create replication rule.

The following screenshot illustrates the creation of a Replication rule.

Next, specify the destination AWS Account ID and S3 Bucket name.

Then, choose the IAM role and the AWS KMS key used for replication.

The following screenshot illustrates the IAM role and AWS KMS key used for replication.

Then, select the Destination storage class for the objects.

The following screenshot illustrates the destination S3 bucket storage class.

Finally, the replication rule will look similar to the following screenshot, which illustrates a sample Replication rule that replicates objects encrypted with SSE-KMS.

Alternatively, you can use the AWS CLI put-bucket-replication command to create a replication configuration rule as follows.

aws s3api put-bucket-replication \
    --bucket source-bucket-name \
    --replication-configuration file://replication-sample.json

The following is a sample replication-sample.json file.

{
        "Rules": [
            {
                "Status": "Enabled",
                "Filter": {},
                "SourceSelectionCriteria": {
                    "SseKmsEncryptedObjects": {
                        "Status": "Enabled"
                    }
                },
                "DeleteMarkerReplication": {
                    "Status": "Disabled"
                },
                "Destination": {
                    "EncryptionConfiguration": {
                        "ReplicaKmsKeyID": "ARN of AWS KMS Key(b)from destination account"
                    },
                    "Account": "AWS Account B #",
                    "Bucket": "arn:aws:s3:::destination-bucket-name",
                    "AccessControlTranslation": {
                        "Owner": "Destination"
                    }
                },
                "Priority": 0,
                "ID": "Cross-Region-Replication-rule"
            }
        ],
        "Role": "ARN of cross_account_replication IAM role from source account"
}
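
The sample file uses descriptive placeholders that are easy to leave in by mistake. The following sketch is a pre-flight check you could run before calling put-bucket-replication. The function name is hypothetical, the checks are deliberately loose, and they assume the standard aws partition. Here it is run against a trimmed copy of the sample above, so it reports the placeholders that still need values.

```python
import re

# Trimmed copy of the sample from this post, placeholders left in on purpose.
sample = {
    "Rules": [
        {
            "Status": "Enabled",
            "Destination": {
                "EncryptionConfiguration": {
                    "ReplicaKmsKeyID": "ARN of AWS KMS Key(b)from destination account"
                },
                "Account": "AWS Account B #",
                "Bucket": "arn:aws:s3:::destination-bucket-name",
            },
        }
    ],
    "Role": "ARN of cross_account_replication IAM role from source account",
}


def find_unfilled_placeholders(config: dict) -> list:
    """Return a list of human-readable problems found in a replication configuration."""
    problems = []
    if not config.get("Role", "").startswith("arn:aws:iam::"):
        problems.append("Role is not an IAM role ARN")
    for rule in config.get("Rules", []):
        destination = rule.get("Destination", {})
        if not re.fullmatch(r"\d{12}", destination.get("Account", "")):
            problems.append("Destination.Account is not a 12-digit account ID")
        if not destination.get("Bucket", "").startswith("arn:aws:s3:::"):
            problems.append("Destination.Bucket is not a bucket ARN")
        key_id = destination.get("EncryptionConfiguration", {}).get("ReplicaKmsKeyID", "")
        if not key_id.startswith("arn:aws:kms:"):
            problems.append("ReplicaKmsKeyID is not a KMS key ARN")
    return problems


problems = find_unfilled_placeholders(sample)
for problem in problems:
    print(problem)
```

To check your own file, load it with json.load and pass the resulting dictionary to the same function.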

  6. Creating S3 Batch Operations job

You can use the Amazon S3 API or the Amazon S3 console to create a Batch Operations job.

To create a Batch Operations job in the Amazon S3 console, select your AWS Region, choose Create manifest using S3 Replication configuration, and provide your Source bucket. Remember to use the IAM role s3_batch_operations created in Step 2 while creating the job.

The following screenshot illustrates the creation of a Batch Operations job.

Then, select Replicate as the Operation type.

The following screenshot illustrates selecting Replication as the option.

Next, configure the additional options required for the Batch Operations job, like the Description, Priority, and Completion report.

The following screenshot illustrates configuring additional options.

Next, specify the IAM role used by the S3 Batch Operations job.

The following screenshot illustrates the IAM role used by the Batch Operations job.

Finally, review your selections and choose Create job.

The following screenshots illustrate Create job.

Once the Batch Operations job is created, select the job ID and choose Run job from the Batch Operations console.

Alternatively, you can use the AWS CLI to create a Batch Operations job as follows. Replace the account ID, the bucket ARNs (shown as ***), the role ARN, and the Region to suit your needs.

aws s3control create-job \
    --account-id 111122223333 \
    --operation '{"S3ReplicateObject":{}}' \
    --report '{"Bucket":"arn:aws:s3:::***","Prefix":"batch-replication-report", "Format":"Report_CSV_20180820","Enabled":true,"ReportScope":"AllTasks"}' \
    --manifest-generator '{"S3JobManifestGenerator": {"ExpectedBucketOwner": "111122223333", "SourceBucket": "arn:aws:s3:::***", "EnableManifestOutput": false, "Filter": {"EligibleForReplication": true, "ObjectReplicationStatuses": ["NONE","FAILED"]}}}' \
    --role-arn arn:aws:iam::111122223333:role/s3_batch_operations \
    --region source-bucket-region \
    --priority 1 \
    --no-confirmation-required
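
The inline JSON arguments in this command are easy to break with a stray quote or trailing space. One option is to build them from dictionaries and let Python handle the quoting. The following sketch only prints the command and does not run it. The bucket names, account ID, and Region are hypothetical placeholders.

```python
import json
import shlex

# Hypothetical values; replace them with your own.
ACCOUNT_ID = "111122223333"
SOURCE_BUCKET_ARN = "arn:aws:s3:::source-bucket-name"
REPORT_BUCKET_ARN = "arn:aws:s3:::completionreport-bucket-name"

report = {
    "Bucket": REPORT_BUCKET_ARN,
    "Prefix": "batch-replication-report",
    "Format": "Report_CSV_20180820",
    "Enabled": True,
    "ReportScope": "AllTasks",
}

manifest_generator = {
    "S3JobManifestGenerator": {
        "ExpectedBucketOwner": ACCOUNT_ID,
        "SourceBucket": SOURCE_BUCKET_ARN,
        "EnableManifestOutput": False,
        "Filter": {
            "EligibleForReplication": True,
            # Objects never replicated, plus objects whose replication failed.
            "ObjectReplicationStatuses": ["NONE", "FAILED"],
        },
    }
}

command = [
    "aws", "s3control", "create-job",
    "--account-id", ACCOUNT_ID,
    "--operation", json.dumps({"S3ReplicateObject": {}}),
    "--report", json.dumps(report),
    "--manifest-generator", json.dumps(manifest_generator),
    "--role-arn", f"arn:aws:iam::{ACCOUNT_ID}:role/s3_batch_operations",
    "--region", "us-east-1",
    "--priority", "1",
    "--no-confirmation-required",
]

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```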

Validation

Once the job is started, you can navigate to the S3 Batch Operations page to see the status of the job, the percentage of objects that have been replicated, and the total number of objects that have failed replication. S3 Batch Operations generates a report for jobs that have completed, failed, or been canceled. The completion report contains additional information for each task, including the object key name and version, status, error codes, and descriptions of any errors. You can use this report to validate the success of your S3 Batch Operations job.

The following screenshot illustrates a sample S3 Batch Operations job status.

Alternatively, you can use the AWS CLI describe-job command to check the status.

aws s3control describe-job \
  --account-id 111122223333 \
  --job-id 93735294-df46-44d5-8638-6356f335324e
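
The describe-job response is JSON, so you can reduce it to the few numbers you care about. The following sketch assumes the response contains a Job object with Status and ProgressSummary fields. The sample response is made up for illustration, and the function name is hypothetical.

```python
import json


def summarize_job(describe_job_output: str) -> dict:
    """Reduce a describe-job JSON response to status and task counts."""
    job = json.loads(describe_job_output)["Job"]
    progress = job.get("ProgressSummary", {})
    total = progress.get("TotalNumberOfTasks", 0)
    succeeded = progress.get("NumberOfTasksSucceeded", 0)
    failed = progress.get("NumberOfTasksFailed", 0)
    # Share of tasks that have finished, whether they succeeded or failed.
    percent_done = 100.0 * (succeeded + failed) / total if total else 0.0
    return {
        "status": job.get("Status"),
        "total": total,
        "succeeded": succeeded,
        "failed": failed,
        "percent_done": percent_done,
    }


# Made-up sample response, trimmed to the fields used above.
sample_output = json.dumps({
    "Job": {
        "JobId": "93735294-df46-44d5-8638-6356f335324e",
        "Status": "Active",
        "ProgressSummary": {
            "TotalNumberOfTasks": 200,
            "NumberOfTasksSucceeded": 150,
            "NumberOfTasksFailed": 10,
        },
    }
})

summary = summarize_job(sample_output)
print(summary)
```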

Cleaning up

To avoid ongoing charges in your AWS account, you should delete the AWS resources listed in the prerequisites section of this post. Furthermore, log in to the AWS Management Console and delete any manually created resources.

Conclusion

In this post, we covered using Amazon S3 Batch Operations to migrate new and existing S3 objects encrypted with SSE-KMS keys when the source and destination S3 buckets are owned by different AWS accounts in the same AWS Region.

Companies of any size can use Amazon S3 Batch Operations to perform large-scale replication on S3 objects using automation, saving them time and money. With the solution in this post, you can seamlessly migrate encrypted objects to satisfy latency, compliance, and disaster recovery requirements.

For further reading, refer to AWS Well-Architected Framework, Architecture Best Practices for Storage, and AWS Storage Optimization. We are here to help, and if you need further assistance in developing a successful cloud storage optimization strategy, reach out to AWS Support and your AWS account team.

Arun Chandapillai

Arun Chandapillai is a Senior Cloud Architect who is a diversity and inclusion champion. He is passionate about helping his customers accelerate IT modernization through business-first cloud adoption strategies and successfully build, deploy, and manage applications and infrastructure in the cloud. Arun is an automotive enthusiast, an avid speaker, and a philanthropist who believes in ‘you get (back) what you give’.

Parag Nagwekar

Parag Nagwekar is a Senior Cloud Infrastructure Architect with AWS ProServe. He works with customers to design and build solutions to help their journey to cloud and IT modernization. His passion is to provide scalable and highly available services for applications in the cloud. He has experience in multiple disciplines including system administration, DevOps, distributed architecture, and cloud architecting.

Shak Kathir

Shak Kathirvel is a Senior Cloud Application Architect with AWS ProServe. He enjoys working with customers and helping them with application modernization and optimization efforts, guiding their enterprise cloud management and governance strategies, and migrating their workloads to the cloud. He is passionate about enterprise architecture, serverless technologies, and AWS cost and usage optimization. He loves his job for its challenges and the opportunity to work with inspiring customers and colleagues.