Automate Amazon S3 File Gateway deployments in VMware with Terraform by HashiCorp

Many customers have adopted an Infrastructure as Code (IaC) process within their organization to streamline and optimize provisioning infrastructure. Without an IaC practice in place, it becomes increasingly difficult to manage the scale of today’s infrastructure. IaC can help your organization manage IT infrastructure needs while also improving consistency and reducing errors and manual configuration. AWS CloudFormation is a great tool for automating the deployment of in-cloud resources. However, on-premises or hybrid infrastructure adds complexity when factoring in local networking, compute, and storage environments.

When looking to deploy an AWS Storage Gateway as a virtual machine (VM) in your on-premises virtual infrastructure, you can now use Terraform by HashiCorp. Terraform is an open-source IaC software tool that enables you to safely and predictably create, change, and improve infrastructure. Terraform can provide automation not only for in-cloud AWS resources, but also for on-premises virtual infrastructure.

In this blog post, we provide a guide for deploying an Amazon S3 File Gateway using Terraform within an on-premises VMware virtual environment in a matter of minutes. We provide the tools you need to set up your Terraform modules and walk through deploying and configuring your AWS Storage Gateway with IaC. As part of the automated deployment, we will create an AWS Storage Gateway VMware VM, activate your AWS Storage Gateway within AWS, join the gateway to your Active Directory domain, create an Amazon S3 bucket, and create an SMB file share that is ready to be mounted by a client.

AWS Storage Gateway and Amazon S3 File Gateway

AWS Storage Gateway consists of four distinct gateway types: Amazon S3 File Gateway, Amazon FSx File Gateway, Tape Gateway, and Volume Gateway. Storage Gateway is a hybrid cloud storage service that is designed to be deployed next to your applications or end users, most often in your own datacenter, to provide low-latency access. Storage Gateway can be deployed as a VM within your VMware, Hyper-V, or Linux KVM virtual environment, as an Amazon EC2 instance within your Amazon Virtual Private Cloud (Amazon VPC), or as a pre-configured physical Hardware Appliance.

Amazon S3 File Gateway translates between Amazon S3 object storage and standard file storage protocols (SMB or NFS). This helps applications that cannot talk to Amazon S3 natively, as well as applications that need low-latency access to their data, because S3 File Gateway offers local caching of up to 64 TB for the most recently used data. Amazon S3 File Gateway is a great solution for a number of use cases, including:

Database backups, such as Microsoft SQL Server, Oracle, or SAP ASE database dumps.
Archive use cases, where you can tier your data to lower-cost storage with Amazon S3 Lifecycle policies.
Data Lakes, where the gateway can be deployed at the edge where data is generated and can ingest that data into Amazon S3 in a non-proprietary format. Then, the data can be used for downstream pipeline processing workflows.

Solution overview

We will be leveraging the Terraform AWS Storage Gateway module. The module contains the S3 File Gateway example which creates the following resources:

Storage Gateway virtual appliance on VMware
Amazon S3 File Gateway on AWS
Domain join of the Amazon S3 File Gateway
Amazon S3 bucket
SMB file share
AWS Key Management Service (AWS KMS) key
Amazon CloudWatch log group
Amazon S3 logging bucket
AWS Identity and Access Management (IAM) roles required for Storage Gateway to use the S3 bucket
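
To make the resource list above more concrete, here is a minimal sketch of the core AWS-side pieces written directly against AWS provider resources. It is not the module's own code. The variable names, the gateway and bucket names, and the pre-existing IAM role are assumptions for illustration, and the KMS key, CloudWatch log group, and logging bucket are left out for brevity.

```hcl
# Hypothetical inputs; the real module defines its own variable names.
variable "gateway_ip_address" { type = string }  # IP of the VMware appliance
variable "domain_name" { type = string }
variable "domain_username" { type = string }
variable "file_share_role_arn" { type = string } # IAM role the gateway uses for S3 access
variable "cache_disk_node" { type = string }     # cache disk node as reported by the gateway

variable "domain_password" {
  type      = string
  sensitive = true
}

# Activate the appliance as an S3 File Gateway and join it to Active Directory.
resource "aws_storagegateway_gateway" "this" {
  gateway_name       = "my-s3filegateway"
  gateway_timezone   = "GMT"
  gateway_type       = "FILE_S3"
  gateway_ip_address = var.gateway_ip_address

  smb_active_directory_settings {
    domain_name = var.domain_name
    username    = var.domain_username
    password    = var.domain_password
  }
}

# Allocate one of the appliance's local disks as cache.
data "aws_storagegateway_local_disk" "cache" {
  disk_node   = var.cache_disk_node
  gateway_arn = aws_storagegateway_gateway.this.arn
}

resource "aws_storagegateway_cache" "this" {
  disk_id     = data.aws_storagegateway_local_disk.cache.disk_id
  gateway_arn = aws_storagegateway_gateway.this.arn
}

# Bucket that backs the file share (placeholder name; bucket names are globally unique).
resource "aws_s3_bucket" "share" {
  bucket = "example-s3filegateway-bucket"
}

# SMB file share that authenticates clients against Active Directory.
resource "aws_storagegateway_smb_file_share" "this" {
  authentication = "ActiveDirectory"
  gateway_arn    = aws_storagegateway_gateway.this.arn
  location_arn   = aws_s3_bucket.share.arn
  role_arn       = var.file_share_role_arn

  depends_on = [aws_storagegateway_cache.this]
}
```

The module wraps these kinds of resources, together with the VMware appliance deployment, so that you do not have to wire them up by hand.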

We will be using Terraform Cloud, which is HashiCorp’s managed service offering. It eliminates the need for unnecessary tooling and documentation for practitioners, teams, and organizations to use Terraform in production. It allows the IaC to run remotely, and it stores and manages state files in a way that is compliant with SOC 2 standards. We will also connect Terraform Cloud to a Version Control System (VCS) provider, which automatically triggers runs based on changes to the VCS repositories. In our case, we will be using GitHub.

We will also leverage Terraform Cloud Agents that allow Terraform Cloud to communicate with isolated, private, or on-premises infrastructure. By deploying lightweight agents within a specific network segment, you can establish a simple connection between your environment and Terraform Cloud which allows for provisioning operations and management. This is useful for on-premises infrastructure types such as vSphere, Nutanix, OpenStack, enterprise networking providers, and anything you might have in a protected enclave.

The agent architecture is pull-based, so no inbound connectivity is required. Any agent you provision will poll Terraform Cloud for work and carry out execution of that work locally.

Connections from Terraform Cloud workspaces to VCS instances require access over the public internet. For example, you cannot use agents to connect to a GitHub Enterprise Server instance that is reachable only through your VPN. HashiCorp recommends that you use the information provided in the IP Ranges documentation to permit direct communication from the appropriate Terraform Cloud service to your internal infrastructure.

Note that deploying this code via Terraform Cloud is not a requirement. Customers can also deploy this module using Terraform OSS (Open Source) in their own custom pipelines. Refer to the README.md in the GitHub repository for step-by-step instructions on using the example with Terraform OSS. Customers are also free to use any other VCS provider supported by Terraform Cloud.

Solution walkthrough

Set up the repository
Configure Terraform Cloud Workspaces
Set up Terraform Cloud Agents
Trigger the deployment
Use the SMB file share

Prerequisites

Terraform Cloud account and organization (Business Tier required for using Cloud Agents)
Terraform v1.1+ installed locally and configured with your Terraform Cloud token
GitHub account and a development machine with Git installed
AWS account with permissions to create resources in AWS Identity and Access Management (IAM)
IAM user in your AWS account that has an AWS access key, and an AWS secret key created. See Managing access keys (console) on the AWS website.
For the IAM user, necessary permissions to create and administer:
- Amazon S3 Bucket
- AWS Storage Gateway
- Amazon CloudWatch Logs
- AWS KMS
- AWS Security Token Service
- Amazon EC2 (DescribeAccountAttributes)
- AWS IAM role and policies

See Changing permissions for an IAM user for details on how to set up IAM permissions and define permissions boundaries.

An environment running VMware ESXi and a service account that has the necessary permissions to create and administer a vSphere virtual machine.
Domain credentials with the necessary permissions to allow AWS Storage Gateway to join the domain.
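
As a rough illustration of the IAM permissions listed above, the following sketch builds a policy document with one service-level entry per item in the list. It is a deliberately broad starting point and not a vetted least-privilege policy. The policy name is hypothetical, and the policy would need to be created by an identity that already has IAM permissions.

```hcl
# Broad, illustrative permissions for the deploying IAM user.
# Assumption: service-level wildcards are used for brevity; narrow the
# actions and resources before using anything like this in production.
data "aws_iam_policy_document" "deployer" {
  statement {
    sid    = "StorageGatewayDeployment"
    effect = "Allow"

    actions = [
      "s3:*",                          # Amazon S3 bucket
      "storagegateway:*",              # AWS Storage Gateway
      "logs:*",                        # Amazon CloudWatch Logs
      "kms:*",                         # AWS KMS
      "sts:*",                         # AWS Security Token Service
      "ec2:DescribeAccountAttributes", # Amazon EC2
      "iam:*",                         # IAM roles and policies (highly privileged)
    ]

    resources = ["*"]
  }
}

resource "aws_iam_policy" "deployer" {
  name   = "storage-gateway-terraform-deployer" # hypothetical name
  policy = data.aws_iam_policy_document.deployer.json
}
```

Pairing a policy like this with a permissions boundary, as the IAM documentation referenced above describes, limits what the roles created during deployment can do.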

Step 1: Setting up your repository

Sign in to your GitHub account.
Create a fork of the Terraform AWS Storage Gateway module into your own GitHub account. A copy of the module repository is needed in order for you to freely experiment with the Terraform module and also to connect your repository with Terraform Cloud.

Step 2: Setting up Terraform Cloud

1. Create a new Terraform Cloud workspace. Click Workspaces to view the list of workspaces within your organization.

2. Click + New Workspace. The Create a new Workspace page will appear. Choose a workflow type. In this case, we are using the Version control workflow.

3. Choose an existing version control provider or configure a new one. Refer to Connecting VCS Providers for more details.

4. Choose a repository from the filterable list. The Configure settings page will appear.

5. Enter a Workspace Name. This defaults to the repository name, if applicable. The name must be unique within the organization and can include letters, numbers, dashes (-), and underscores (_). Refer to HashiCorp's workspace naming recommendations. Add an optional Description that will appear at the top of the workspace in the Terraform Cloud UI.

6. Open Advanced options to optionally configure the following settings:

- Change your Terraform Working Directory to examples/s3filegateway-vmware.
- Apply method: This can be left at the default, Manual apply, which allows you to verify the result of the Terraform plan before applying. You could also change this to Auto apply if you would like to skip the manual validation.

- Leave the default to ‘Always trigger runs’.
- Leave the VCS branch empty. This can be changed for example if you are working in another branch such as Dev.
- Leave the ‘Include submodules on clone’ unchecked.
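
If you prefer to manage the workspace itself as code, the settings chosen above can be approximated with the `tfe` provider's `tfe_workspace` resource. This is a minimal sketch and not part of the module. The workspace name, the fork's repository identifier, and the two input variables are placeholders you would replace with your own values.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical inputs: your Terraform Cloud organization and the OAuth token ID
# of the GitHub VCS connection configured in Terraform Cloud.
variable "tfc_organization" { type = string }
variable "github_oauth_token_id" { type = string }

resource "tfe_workspace" "s3filegateway" {
  name              = "s3filegateway-vmware" # placeholder workspace name
  organization      = var.tfc_organization
  description       = "S3 File Gateway on VMware"
  working_directory = "examples/s3filegateway-vmware"

  auto_apply            = false # corresponds to "Manual apply"
  file_triggers_enabled = false # corresponds to "Always trigger runs"

  vcs_repo {
    # Placeholder: the fork created in Step 1. Omitting "branch" uses the default branch.
    identifier         = "your-github-user/terraform-aws-storagegateway"
    oauth_token_id     = var.github_oauth_token_id
    ingress_submodules = false # "Include submodules on clone" unchecked
  }
}
```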

7. Finally, click Create workspace. Terraform Cloud will then automatically parse the configuration and discover its variables.

8. Terraform Cloud will now prompt you to configure variables. You can skip this step and create the variables later. Note that, in order to protect sensitive data such as domain credentials, certain variables are marked as sensitive. Terraform will then redact these values in the output of Terraform commands and in log messages.

Also note that the domain password, despite being a sensitive variable, can still be found in the Terraform state file. Follow this guidance to protect the state file from unauthorized access.
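
For reference, marking a variable as sensitive in Terraform looks roughly like the following. The variable names here are illustrative and may not match the ones declared in the module's example.

```hcl
# Illustrative declarations; check the example's variables file for the real names.
variable "domain_username" {
  description = "Account used to join the gateway to the Active Directory domain"
  type        = string
}

variable "domain_password" {
  description = "Password for the domain join account"
  type        = string
  sensitive   = true # redacted in plan/apply output, but still stored in state
}
```

The `sensitive` flag only affects what Terraform displays. It does not encrypt the value in the state file, which is why protecting the state remains important.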

9. Set up the AWS provider credentials and apply them to the newly created workspace.

Terraform Cloud allows you to define input and environment variables using either workspace-specific variables or sets of variables that you can reuse in multiple workspaces. Variable sets allow you to avoid redefining the same variables across workspaces, so you can standardize common configurations throughout your organization. One common use case for variable sets is provider credentials: you apply the variable set to specific workspaces instead of explicitly declaring the variables again in each one. Refer to this documentation on how to set up and apply variable sets.

It is a general best practice to never store credentials and secrets in Git repositories. For more information about protecting sensitive variables, refer to this documentation by Terraform. Also, as a best practice, consider using services such as AWS Secrets Manager, HashiCorp Vault, or Terraform Cloud to dynamically inject your secrets. In this example, we are leveraging Terraform Cloud, which automatically encrypts all variable values before storing them and also encrypts the state at rest.
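
The variable set described above can also be created with the `tfe` provider instead of the UI. The sketch below assumes the `hashicorp/tfe` provider is configured, and all variable names and the variable set name are hypothetical. The credential values are passed in as sensitive inputs so that they never appear in the repository.

```hcl
# Hypothetical inputs for this sketch.
variable "tfc_organization" { type = string }
variable "workspace_id" { type = string } # ID of the workspace created earlier

variable "aws_access_key_id" {
  type      = string
  sensitive = true
}

variable "aws_secret_access_key" {
  type      = string
  sensitive = true
}

resource "tfe_variable_set" "aws_credentials" {
  name         = "aws-credentials" # placeholder name
  description  = "AWS provider credentials shared across workspaces"
  organization = var.tfc_organization
}

# Environment variables that the AWS provider reads during runs.
resource "tfe_variable" "aws_access_key_id" {
  key             = "AWS_ACCESS_KEY_ID"
  value           = var.aws_access_key_id
  category        = "env"
  sensitive       = true
  variable_set_id = tfe_variable_set.aws_credentials.id
}

resource "tfe_variable" "aws_secret_access_key" {
  key             = "AWS_SECRET_ACCESS_KEY"
  value           = var.aws_secret_access_key
  category        = "env"
  sensitive       = true
  variable_set_id = tfe_variable_set.aws_credentials.id
}

# Apply the variable set to the workspace.
resource "tfe_workspace_variable_set" "attach" {
  variable_set_id = tfe_variable_set.aws_credentials.id
  workspace_id    = var.workspace_id
}
```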

Step 3: Setting up Terraform Cloud Agents

1. In Terraform Cloud, create an agent pool and configure your workspace to use agents. For step-by-step instructions, follow the Manage private environments with Terraform Cloud Agents tutorial until you finish changing permissions for the Docker socket in the Launch the agent section.

- Once the agent container launches, verify that it has registered with the pool in the Terraform Cloud interface.

2. The agent must be able to make outbound requests over HTTPS (TCP port 443) to the Terraform Cloud application APIs. Depending on your network topology and environment, this may require perimeter networking as well as container host networking changes. Refer to the requirements page for more details.

3. Refer to the requirements page to validate the operating system, hardware, and Terraform versions supported for running Terraform Cloud agents.
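
The agent pool and token from the tutorial can likewise be sketched with the `tfe` provider. This assumes a configured `hashicorp/tfe` provider in a recent version that includes the `tfe_workspace_settings` resource. The pool name and input variables are placeholders.

```hcl
# Hypothetical inputs for this sketch.
variable "tfc_organization" { type = string }
variable "workspace_id" { type = string } # ID of the workspace created in Step 2

resource "tfe_agent_pool" "onprem" {
  name         = "vmware-onprem" # placeholder pool name
  organization = var.tfc_organization
}

# Token that the agent container uses to register with the pool.
resource "tfe_agent_token" "onprem" {
  agent_pool_id = tfe_agent_pool.onprem.id
  description   = "Token for the on-premises agent container"
}

# Switch the workspace to agent execution so runs are carried out by the pool.
resource "tfe_workspace_settings" "agent_mode" {
  workspace_id   = var.workspace_id
  execution_mode = "agent"
  agent_pool_id  = tfe_agent_pool.onprem.id
}

output "agent_token" {
  value     = tfe_agent_token.onprem.token
  sensitive = true
}
```

The token value is what the agent container expects in its `TFC_AGENT_TOKEN` environment variable when you launch it as described in the tutorial.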

Step 4: Trigger the deployment

Terraform Cloud has three main workflows for managing runs, and your chosen workflow determines when and how Terraform runs occur. For detailed information, see:

The UI/VCS-driven run workflow, which automatically triggers runs based on changes to your VCS repositories. This is the primary mode of operation.
The API-driven run workflow, which is more flexible but requires you to create some tooling.
The CLI-driven run workflow, which uses Terraform’s standard CLI tools to execute runs in Terraform Cloud.
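
For comparison, the CLI-driven workflow is enabled by adding a `cloud` block to the configuration, which is why Terraform v1.1 or later appears in the prerequisites. The organization and workspace names below are placeholders.

```hcl
terraform {
  required_version = ">= 1.1.0"

  cloud {
    organization = "my-organization" # placeholder organization name

    workspaces {
      name = "s3filegateway-vmware" # placeholder workspace name
    }
  }
}
```

To our understanding, when a workspace is connected to a VCS repository, as it is in this walkthrough, the CLI can start only speculative plans against it, and applies still come from VCS changes.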

In this example, we have configured Terraform to trigger deployments based on VCS changes. Therefore, any push to the main branch of your repository will trigger Terraform runs in the workspace. We also enabled speculative plans, which will preview the changes Terraform will make to the infrastructure when a pull request is merged.

We merged a pull request, which automatically triggered the Terraform plan phase. Note that Terraform Cloud also lets you configure your workspace to trigger runs on changes to specific paths within your repository, or whenever you push a tag of a specified format.

Next, review the resources to be created and click Confirm & Apply. This will trigger the deployment.

Step 5: Use the SMB file share

1. Navigate to vCenter and verify the newly created gateway appliance. Note down the IP address of the virtual machine.

2. Log in to the AWS Management Console and navigate to AWS Storage Gateway.

3. Click the newly created Storage Gateway and confirm that its IP address matches the one you noted above for the Storage Gateway virtual appliance deployed on VMware.

4. Navigate to File shares from the menu on the left, or click the file share directly under Storage resources, to find the newly created file share. Copy the net use command to mount the file share.

5. Mount the SMB file share on your client. For more information, see the documentation on using an SMB file share.

6. Your SMB file share backed by S3 File Gateway is now ready to use.

Additional considerations

This section describes additional considerations as you use the Terraform module, including ways to modify gateway sizing. There is also additional information on versioning.

Gateway sizing

The default values for vCPU and memory are set to 4 and 16384 (16 GiB), respectively. These are the minimum required values for a small Storage Gateway deployment. Consider increasing the vCPU count to 8 or 16 and the memory to 32768 (32 GiB) or 65536 (64 GiB) for a medium or a large Storage Gateway deployment. The minimum requirement for cache storage is 150 GB and the maximum is 64 TB. Typically, you will want to size your cache to fit the hot data set. In general, the 80/20 rule applies, where 20 percent of the data is hot and 80 percent is cold. You can always increase the cache size, and there are metrics to monitor cache utilization, so it is best to start small and increase as needed.

The values can be changed by explicitly appending memory, cpus and cache_size attributes with their appropriate values in the vSphere module declared either in the examples/s3filegateway-vmware/main.tf or in your own custom Terraform main.tf file.

As an example:

module "vsphere" {
  source     = "aws-ia/storagegateway/aws//modules/vmware-sgw"
  datastore  = var.datastore
  datacenter = var.datacenter
  network    = var.network
  cluster    = var.cluster
  host       = var.host
  name       = "my-s3filegateway"

  # Sizing overrides for a large gateway deployment
  memory     = "65536" # 64 GiB
  cpus       = "16"    # vCPUs
  cache_size = "300"   # cache disk size (assumed to be in GB)
}

For more information on the Storage Gateway vCPU and RAM sizing and requirements, consult this documentation page. To learn more about cache sizing, refer to this documentation.
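
The 80/20 cache sizing rule described above can be worked through with a few Terraform locals. This is only an illustrative calculation: the 10 TB data set is a made-up figure, and it assumes the cache size is expressed in GB.

```hcl
locals {
  dataset_size_gb = 10000 # hypothetical total data set (about 10 TB)
  hot_percent     = 20    # 80/20 rule: roughly 20 percent of the data is hot

  # Size of the hot data set, rounded up to a whole GB.
  hot_data_gb = ceil(local.dataset_size_gb * local.hot_percent / 100)

  # Never go below the 150 GB minimum; the 64 TB maximum still applies.
  cache_size_gb = max(150, local.hot_data_gb)
}

output "suggested_cache_size_gb" {
  value = local.cache_size_gb # 2000 for the inputs above
}
```

A value computed this way could be passed to the cache_size attribute shown earlier, and then increased later if the cache utilization metrics indicate that the cache is too small.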

Versioning

Note that versioning is set to false by default for the S3 bucket created in this example for the SMB file share. This is because, when a file is written to the File Gateway by an SMB client, the File Gateway uploads the file’s data to Amazon S3 followed by its metadata (ownerships, timestamps, and so on). Uploading the file data creates an S3 object, and uploading the metadata for the file updates the metadata for the S3 object. This process creates another version of the object, resulting in two versions of an object. If S3 Versioning is enabled, both versions are stored. Therefore, enabling S3 Versioning can increase storage costs within Amazon S3. See here for further information on whether S3 Versioning is right for your workload.
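
If you decide that S3 Versioning is right for your workload, one way to limit the extra storage cost is to pair it with a lifecycle rule that expires noncurrent versions. The sketch below uses AWS provider resources directly and is not part of the module. The bucket name variable and the 30-day retention period are assumptions for illustration.

```hcl
# Hypothetical input: the name of the bucket backing the file share.
variable "bucket_name" { type = string }

resource "aws_s3_bucket_versioning" "share" {
  bucket = var.bucket_name

  versioning_configuration {
    status = "Enabled"
  }
}

# Expire older object versions so the duplicate versions created by
# metadata updates do not accumulate indefinitely.
resource "aws_s3_bucket_lifecycle_configuration" "share" {
  bucket     = var.bucket_name
  depends_on = [aws_s3_bucket_versioning.share]

  rule {
    id     = "expire-noncurrent-versions"
    status = "Enabled"

    filter {} # applies to all objects in the bucket

    noncurrent_version_expiration {
      noncurrent_days = 30 # assumed retention period; adjust to your needs
    }
  }
}
```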

Conclusion

In this blog post, we discussed how to leverage IaC using Terraform by HashiCorp to deploy a VMware-based Amazon S3 File Gateway within minutes and at scale. We performed the following within your hybrid environment: deployed an AWS Storage Gateway VMware VM, activated your AWS Storage Gateway within AWS, joined the gateway to your Active Directory domain, created an Amazon S3 bucket, and created an SMB file share that is ready to be mounted by a client.

By automating the deployment, you can speed up time to deploy, increase administrator efficiency, and avoid common mistakes. You can find the Storage Gateway Terraform module in the Terraform Registry. You can customize this module to suit your environment and use it to scale your gateway deployments across your organization.

For more information and to learn more about AWS Storage Gateway see the following:

AWS Storage Blog