Amazon AWS Official Blog

Deploy elastic bastion hosts in one click for secure session management and port forwarding with Cloud Foundations

The Chinese version[1] of this blog post was originally published on September 14, 2023. We updated the network and product definitions based on the latest specifications when translating and republishing it in English.

A bastion host, also known as a jump box, is an instance that sits at the edge of a private network and provides the single, controlled point of access to the resources inside it. For security and compliance reasons, enterprises usually place strict restrictions on network ingress and egress, and workloads are generally not reachable from the Internet. A bastion host bridges this gap: workloads stay isolated from the Internet, while DevOps personnel still get controlled access to them. Beyond convenient access, a mature bastion architecture also records audit logs, encrypts sessions and logs, and supports fine-grained permission management. A modern cloud environment that spans multiple accounts and regions places higher demands on bastion host architecture design and makes it more challenging.

Amazon Web Services (AWS) provides many resources for building bastion hosts. The “Linux bastion host on AWS” solution provides an Amazon CloudFormation template that deploys elastic bastion hosts in existing or new VPCs. The article “Accessing instances using bastions” discusses common problems and cross-operating-system methods for accessing an instance in a single-account or multi-account landing zone. The “Access a bastion host by using Session Manager and Amazon EC2 Instance Connect” guide shows the detailed steps for securely accessing a bastion host in a single-account environment through EC2 Instance Connect. A recent article, “Bastion host design and automated implementation based on Amazon EC2 and Amazon Systems Manager Session Manager”, summarizes the advantages and features of bastion hosts and presents a proof of concept for database connectivity. These articles are all instructive, but they come with a considerable learning curve and implementation effort, especially when adapting them to a multi-account cloud environment.

Building on the multi-account cloud environment[2] set up by Cloud Foundations, this post walks step by step through an actual implementation to show how to quickly build a stable and resilient bastion host architecture, in a secure and compliant manner, for session management and port forwarding. You can use it as a starting point for a bastion host solution tailored to your business needs.

Overall architecture design

The following figure shows an elastic bastion host architecture built on the Cloud Foundations’ multi-account organizational structure. Overall:

  • The bastion instance is placed in the Security Account and managed by an autoscaling group, which scales with peak and off-peak demand and switches the bastion on and off automatically based on working hours;
  • The bastion host product, covering the security groups, launch templates, autoscaling groups, and their policies, is predefined in the Infrastructure Account;
  • The Cloud Foundations basic landing zone configures the session management preferences:
    • Sessions are encrypted with a customer managed key (CMK) in the Security Account;
    • Session logs are sent to the CloudWatch log group in the same account and to the SSM logs bucket in the Logs Account;
    • Session idle timeout and maximum session duration are set;
    • A non-default username is enabled for logging in to the bastion host;
  • The SSM CMK described above is managed centrally in the Security Account, and the SSM logs bucket is managed centrally in the Logs Account;
  • The Cloud Foundations VPC-sharing network connectivity product deploys network resources in the Network Account:
    • It creates VPCs for security and for the workloads, and shares the subnets with the corresponding accounts;
    • It creates the endpoint VPC using AWS PrivateLink (no Internet exposure).

Deployment steps

The main steps for deploying the network resources are:

  1. Define the network structure as JSON in the network profile of the system application configuration in the Infrastructure Account;
  2. Create a VPC-sharing network connectivity product through the Pipeline Factory in the Infrastructure Account, and launch the product to provision the network pipeline;
  3. Release the network pipeline to deploy the network resources to the Network Account, share subnets with member accounts, and synchronize resource tags in one click.

All other resources, namely the bastion host and the database, are defined and deployed through the Cloud Foundations Product Factory. The main steps are:

  1. (Bastion host) Create the predefined bastion host product through the Product Factory in the Infrastructure Account, and launch the product to provision a bastion host pipeline;
  2. (Bastion host) Release the bastion host pipeline to deploy the bastion host resources in one click, such as security groups, launch templates, and autoscaling groups;
  3. (Database) Define the database product as JSON in the product application configuration of the Infrastructure Account;
  4. (Database) Create a database product through the Product Factory in the Infrastructure Account, and launch the product to provision a database pipeline;
  5. (Database) Release the database pipeline to deploy the database resources in one click, such as security groups, subnet groups, and a database instance.

Main functions

The bastion host architecture described above provides the following main functions:

  • Log in to the bastion host in the private subnet of the Security Account without an SSH key. The connection goes through Session Manager rather than over the Internet. There are three main ways to connect:
    1. Through the AWS command line tool;
    2. Through the EC2 console;
    3. Through the SSM Session Manager console;
  • Configure database port forwarding on the bastion host to connect directly from a local machine outside the cloud to the database instance in the workload account’s private subnet.

Network design

The network uses the VPC-sharing connectivity mode. AWS services such as Systems Manager, Key Management Service, and CloudWatch Logs are reached through Amazon PrivateLink over the AWS private network, without Internet access, to improve security. Interface endpoints are managed centrally in the endpoint VPC to save costs, and S3 is reached through the free gateway endpoint. The transit gateway keeps separate route tables, one associated with the security and workload VPCs and one associated with the hub VPC, which makes it easier to extend the network later.

You may design the network differently, for example by using an Internet gateway to access AWS services instead of interface endpoints. The network definition is as follows:

{
  "vpcs": {
    "hub": {
      "is_endpoint": true,
      "cidr": "10.0.0.0/20",
      "endpoints": ["ssm", "ssmmessages", "ec2messages", "kms", "logs"],
      "subnets": [[[8,0], [8,1]], [], [[4,2], [4,3]]]
    },
    "security": {
      "cidr": "10.0.16.0/20",
      "accounts": ["$.account.security"],
      "gw_endpoints": ["s3"],
      "subnets": [[[8,0], [8,1]], [], [[4,2], [4,3]]]
    },
    "workload": {
      "cidr": "10.0.32.0/20",
      "accounts": ["WORKLOAD_ACCOUNT"],
      "gw_endpoints": ["s3"],
      "subnets": [[[8,0], [8,1]], [], [[4,2], [4,3]], [[4,4], [4,5]]]
    }
  },
  "tgw": {
    "enabled": true,
    "cidr": "10.0.0.0/16",
    "tables": {
      "pre": {
        "associations": ["security", "workload"], "routes": {"*": "hub"}
      },
      "post": {
        "associations": ["hub"], "propagations": ["security", "workload"]
      }
    }
  }
}

The basic structure of the network definition matches the overall architecture diagram (line numbers below count from the first line of the listing above). The first VPC is the endpoint VPC (lines 4 – 7), which centrally manages the interface endpoints. The other two VPCs share their subnets with the security account (lines 9 – 13) and the workload account (lines 15 – 19) respectively. In terms of connectivity, the two are interconnected through the hub VPC (lines 23 – 30). Follow the steps above to deploy the network resources, release the pipeline twice, and confirm that the network resource tags are synchronized to the member accounts. This network definition differs slightly from the specification used in the previous blog post[3], because it follows the Cloud Foundations specification as later optimized and simplified.
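As a quick sanity check on the address plan, the following Python sketch verifies that the three VPC CIDRs from the listing do not overlap and all fall inside the transit gateway range. It also expands a subnet pair such as [8,0] into a concrete CIDR. That expansion rests on an assumption this post does not confirm: that a pair means [additional prefix bits, index of the block], similar to common subnet calculators. The function names are hypothetical and not part of Cloud Foundations.

```python
import ipaddress

# VPC and transit gateway CIDRs copied from the network definition above.
VPC_CIDRS = {
    "hub": "10.0.0.0/20",
    "security": "10.0.16.0/20",
    "workload": "10.0.32.0/20",
}
TGW_CIDR = "10.0.0.0/16"


def check_vpc_cidrs(vpc_cidrs, tgw_cidr):
    """Return a list of problems: VPCs outside the TGW range or overlapping VPCs."""
    problems = []
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()}
    tgw = ipaddress.ip_network(tgw_cidr)
    names = sorted(nets)
    for i, a in enumerate(names):
        if not nets[a].subnet_of(tgw):
            problems.append(f"{a} is outside {tgw}")
        for b in names[i + 1:]:
            if nets[a].overlaps(nets[b]):
                problems.append(f"{a} overlaps {b}")
    return problems


def expand_subnet(vpc_cidr, newbits, netnum):
    """ASSUMPTION: a pair [newbits, netnum] lengthens the VPC prefix by
    `newbits` bits and selects block number `netnum` of that size."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return list(vpc.subnets(prefixlen_diff=newbits))[netnum]


print(check_vpc_cidrs(VPC_CIDRS, TGW_CIDR))
print(expand_subnet(VPC_CIDRS["security"], 4, 2))
```

Under that assumption, the [8, n] pairs would be small /28 subnets and the [4, n] pairs would be /24 subnets inside each /20 VPC. Treat the result as illustrative until it is checked against the subnets that the pipeline actually creates.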

Product resource definition

Cloud Foundations deploys the following resources in advance as part of the basic landing zone:

  • Shared cloud environment resources: the SSM CMK in the Security Account and the encrypted SSM logs bucket in the Logs Account;
  • Account-specific resources: the SSM Session Manager preferences and the encrypted CloudWatch session log group;
  • Predefined bastion host product: the bastion host profile is predefined in the product application of the Infrastructure Account and can be used out of the box.

The following sections describe the predefined bastion host product and the database resources required to complete this post’s proof of concept.

Predefined bastion host

The predefined bastion host product mainly includes the following resources:

  1. (IAM) The instance profile and the bastion login-session role for the bastion host, configured according to the principle of least privilege. In particular, the login username is set through the SSMSessionRunAs tag and must match an operating system username on the bastion host;
  2. (VPC) A bastion host security group that allows outbound access to the interface endpoints and to common databases;
  3. (EC2) The instance launch template, the autoscaling group, and its policies and scheduled actions. Because the bastion hosts are stateless, Spot Instances are specified to reduce costs.

Definition example:

[
  [
    {
      "service": "iam",
      "accounts": ["$.account.security"],
      "roles": {
        "bastion":       { "template": "ssm-ec2" },
        "bastion-user":  { "template": "ssm-session-user" }
      }
    },
    {
      "service": "vpc",
      "accounts": ["$.account.security"],
      "groups": {
        "bastion": {
          "out": [
            { "cidr": "*", "protocol": "icmp" },
            { "cidr": "*", "protocol": "https" },
            { "cidr": "*", "protocol": "mssql" },
            { "cidr": "*", "protocol": "mysql" },
            { "cidr": "*", "protocol": "postgres" },
            { "cidr": "*", "protocol": "oracle" }
          ]
        }
      }
    }
  ],
  [
    {
      "service": "ec2",
      "accounts": ["$.account.security"],
      "templates": {
        "bastion": {
          "role": "$.bastion",
          "vpc": { "groups": "$.bastion" },
          "spot": true,
          "volumes": [{ "size": 20 }],
          "user_data": ["#!/bin/bash", "useradd --create-home ${param_prefix}"]
        }
      }
    }
  ],
  [
    {
      "service": "autoscaling",
      "accounts": ["$.account.security"],
      "groups": {
        "bastion": {
          "size": [1, 1, 2],
          "vpc": { "subnets": "$.private" },
          "ec2_template": "$.bastion",
          "policy": { "type": "target", "metric": "cpu", "value": 50 },
          "schedules": [
            { "size": [1, 1, 2], "cron": "0 8 * * *" },
            { "size": [0, 0, 0], "cron": "0 20 * * *" }
          ]
        }
      }
    }
  ]
]

The above definition builds the autoscaling bastion host in three stages, ordered by dependency (line numbers refer to the listing above):

  1. The first stage builds the mutually independent roles (lines 4 – 8) and security group (lines 12 – 22) in parallel. The roles reuse the standard predefined templates, which keeps the definition simple and secure;
  2. The second stage builds the launch template and configures the operating system login username (lines 30 – 38). The instance profile (line 34) and security group (line 35) refer to the first-stage content;
  3. The third stage builds the autoscaling group (lines 45 – 55). The launch template (line 51) refers to the second-stage content, and the subnets (line 50) refer to the private subnets shared with the Security Account as described above.

You can fine-tune these parameters according to your business requirements, such as the instance type, login username, startup time, and scaling trigger conditions.
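To make the working-hours behavior concrete, here is a minimal Python sketch of how the two scheduled actions in the definition play out over a day. It assumes that "size" means [minimum, desired, maximum] and that both cron expressions fire daily in the same time zone. The post does not spell out either point, and the helper name is hypothetical.

```python
from datetime import time

# Scheduled actions copied from the definition above: (time of day, size).
# ASSUMPTION: "size" is [min, desired, max]; the post does not state this.
SCHEDULES = [
    (time(8, 0), (1, 1, 2)),   # cron "0 8 * * *"
    (time(20, 0), (0, 0, 0)),  # cron "0 20 * * *"
]


def size_at(now, schedules=SCHEDULES):
    """Return the size set by the most recent daily scheduled action."""
    ordered = sorted(schedules)
    # Before the first action of the day, yesterday's last action still applies.
    current = ordered[-1][1]
    for at, size in ordered:
        if now >= at:
            current = size
    return current


for probe in (time(7, 59), time(8, 0), time(19, 59), time(20, 0)):
    print(probe, size_at(probe))
```

Read this way, the group runs one bastion instance (up to two under CPU load) between 08:00 and 20:00 and none outside those hours, which matches the on and off behavior described in the architecture overview.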

Database resources

To test port forwarding through the bastion host, which enables a direct connection from a local network to a database in a private subnet on the cloud, define and deploy the following database resources:

  1. Basic database resources: the monitoring role, security group, and subnet group;
  2. The database instance and its configuration.

Definition example:

[
  [
    {
      "service": "rds",
      "accounts": ["${STAGE}"],
      "groups": { "db": { "subnets": "$.private-main-workload-sn1" }},
      "settings": { "bastion": { "monitor_role": true } }
    },
    {
      "service": "vpc",
      "accounts": ["${STAGE}"],
      "groups": {
        "rds-mysql": { "in": [{ "protocol": "mysql", "cidr": "10.0.0.0/16" }] }
      }
    }
  ],
  [
    {
      "service": "rds",
      "accounts": ["${STAGE}"],
      "instances": {
        "mysql": {
          "engine": { "name": "mysql", "version": "8.0.33" },
          "volume": { "size": 20, "max": 100 },
          "group": "$.db",
          "vpc": { "groups": "$.bastion" },
          "options": { "MARIADB_AUDIT_PLUGIN": {} },
          "parameters": { "innodb_log_file_size": "536870912" }
        }
      }
    }
  ]
]

The above definition builds the database resources in two stages, ordered by dependency (line numbers refer to the listing above):

  1. The first stage builds the mutually independent monitoring role (line 7), DB subnet group (line 6), and DB security group (lines 10 – 13) in parallel. The DB subnet group refers to the private subnets shared with the workload account as described above;
  2. The second stage builds the DB instance (lines 19 – 28). The subnet group (line 25) and security group (line 26) refer to the first-stage content across blocks.

You can also fine-tune the parameters to build other DB instances based on the above definition according to your business requirements, such as the database type and version, instance type, volume capacity, option groups, and parameter groups.

Predefined role definition templates

The bastion host instance profile and the bastion login role defined above each need only one attribute, because they refer to predefined templates. This keeps the definitions simple and convenient. More importantly, the predefined templates are written in advance according to best practices and security compliance requirements, so reusing them also raises the overall security baseline of the system.

Definition example:

{
  "role": {
    "ssm-ec2": {
      "trusts": { "default": ["ec2"] },
      "aws_policies": {
        "default": [ 
           "AmazonSSMManagedInstanceCore", 
           "AWSEC2VssSnapshotPolicy", 
           "CloudWatchAgentServerPolicy"
        ]
      },
      "statements": {
        "default": [
          { "actions": ["s3:GetEncryptionConfiguration"] },
          { "actions": ["s3:PutObject"], "resources": ["$.bucket.ssm/*"]},
          {
            "actions":   ["kms:Decrypt", "kms:GenerateDataKey"],
            "resources": ["$.key.ssm"]
          }
        ]
      }
    },
    "ssm-session-user": {
      "trusts": { "default": ["$.account.management", "$.account.security"] },
      "statements": {
        "default": [{
            "actions": ["ssm:StartSession", "ssm:SendCommand"],
            "resources": [
              "$.arn.ec2:instance/*",
              "$.arn.ssm:document/SSM-SessionManagerRunShell",
              "arn:${PARTITION}:ssm:${REGION}::document/AWS-StartPortForwardingSessionToRemoteHost"
            ], 
            "conditions": {
              "BoolIfExists": { "ssm:SessionDocumentAccessCheck": ["true"] }
            }
          },
          {
            "actions": [
              "ssm:DescribeSessions",
              "ssm:GetConnectionStatus",
              "ssm:DescribeInstanceInformation",
              "ssm:DescribeInstanceProperties",
              "ec2:DescribeInstances"
            ] 
          },
          {
            "actions": ["ssm:TerminateSession", "ssm:ResumeSession"],
            "resources": ["arn:${PARTITION}:ssm:*:*:session/${aws:userid}-*"]
          },
          { "actions": ["kms:GenerateDataKey"], "resources": ["$.key.ssm"] }
      ]},
      "tags": { "default": {"SSMSessionRunAs": "$(PREFIX)"} }
    }
  }
} 

The predefined role templates above define two roles (line numbers refer to the listing above):

  1. An instance profile role (lines 3 – 18), which allows access to the cross-account SSM logs bucket (line 15) and SSM CMK (lines 17 – 18), in addition to the managed policies required for session management;
  2. A bastion login role for session management (lines 23 – 52), which allows starting sessions only with specific session documents on the permitted resources (lines 27 – 35), and allows access to the cross-account SSM CMK (line 50) through a pre-built reference.

Functional testing

Log in to the bastion host from the command line

Before logging in to the bastion host from the command line, export temporary credentials as environment variables and define an AWS CLI profile that uses them to assume the bastion user role, such as:

[ssm-session] 
role_arn=arn:aws-cn:iam::SECURITY_ACCOUNT:role/PREFIX-role-bastion-user
credential_source=Environment
role_session_name=iac

The bastion host instance ID is then dynamically obtained:

export INSTANCE_ID=$(aws ec2 describe-instances --output text --no-paginate \
    --filters "Name=tag:Name,Values=$PREFIX-group-bastion" \
    --query 'Reservations[0].Instances[0].InstanceId' --profile ssm-session)
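Because the autoscaling group terminates and replaces bastion instances, a tag-only lookup can return an instance that is no longer running. The Python sketch below shows one way to pick the first running instance from a response shaped like the output of `aws ec2 describe-instances --output json`. The helper name and the sample instance IDs are made up for illustration.

```python
def first_running_instance_id(response):
    """Return the ID of the first running instance in a describe-instances
    style response, or None if no instance is running."""
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("State", {}).get("Name") == "running":
                return instance.get("InstanceId")
    return None


# Hand-written sample; the instance IDs are hypothetical.
sample = {
    "Reservations": [
        {"Instances": [{"InstanceId": "i-0aaaaaaaaaaaaaaaa",
                        "State": {"Name": "terminated"}}]},
        {"Instances": [{"InstanceId": "i-0bbbbbbbbbbbbbbbb",
                        "State": {"Name": "running"}}]},
    ]
}

print(first_running_instance_id(sample))
```

On the command line, a similar effect can be had by adding the filter `Name=instance-state-name,Values=running` to the `describe-instances` call.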

Then log in:

aws ssm start-session --target "$INSTANCE_ID" \
    --document-name SSM-SessionManagerRunShell --profile ssm-session

Log in to the bastion host from the console

After assuming the session role in the Security Account, i.e. PREFIX-role-bastion-user, you can log in from the EC2 console or the Session Manager console by clicking the Connect button.

Once logged in from the console, you can access other instances within the private network without an SSH key or an Internet gateway. In addition, all the commands you type are encrypted and logged.

The command logs are stored in the /aws/ssm/PREFIX/session log group in the Security Account.

Set port forwarding and connect to a cloud database locally

Use the AWS-StartPortForwardingSessionToRemoteHost document to forward a local port to a database on the cloud. Obtain the database instance address from the RDS console of the workload account and the database password from Secrets Manager, then connect to the MySQL database instance on the cloud through local port 13306:

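# Use the endpoint shown in the RDS console of the workload account.
# Double quotes let the shell expand $PREFIX, which single quotes would not.
DB_HOST="$PREFIX-rds-mysql.****.ap-southeast-1.rds.amazonaws.com"
aws ssm start-session --target "$INSTANCE_ID" \
    --document-name AWS-StartPortForwardingSessionToRemoteHost \
    --parameters "{\"portNumber\":[\"3306\"],\"localPortNumber\":[\"13306\"],\"host\":[\"$DB_HOST\"]}" --profile ssm-session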
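The `--parameters` value is a JSON document, and shell quoting makes it easy to get wrong by hand. As a minimal sketch, the Python helpers below build that JSON and the full argument list for the command above. The function names, the example host `db.example.internal`, and the example instance ID are hypothetical. The document name, parameter keys, port numbers, and profile name come from this post.

```python
import json
import shlex


def port_forward_parameters(host, remote_port=3306, local_port=13306):
    """Build the JSON string for --parameters; every value is a list of strings."""
    return json.dumps({
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
        "host": [host],
    })


def start_session_command(instance_id, host, profile="ssm-session"):
    """Return the AWS CLI argument list for a remote-host port forwarding session."""
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSessionToRemoteHost",
        "--parameters", port_forward_parameters(host),
        "--profile", profile,
    ]


# Hypothetical instance ID and host, printed as a shell-safe command line.
cmd = start_session_command("i-0123456789abcdef0", "db.example.internal")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` would start the session without any shell quoting at all. Once the session is open, a MySQL client pointed at 127.0.0.1 and port 13306 reaches the database through the bastion host.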

Conclusion

This post introduced the basic concept of a bastion host and, based on the multi-account cloud environment built by Cloud Foundations, demonstrated how to deploy the predefined elastic bastion host product to the Security Account, the various ways to log in to the bastion host, and how to use it to forward ports to database instances on the cloud. Most of the content and resources above are pre-built or predefined by Cloud Foundations, so you can use them out of the box with simple operations to build a secure and compliant cloud environment and access methods. You can contact an AWS expert to learn more about conducting your business on the cloud securely and efficiently.

References

  1. Blog post: 借助 Cloud Foundations 一键部署弹性堡垒机安全合规地实现会话管理及端口转发, 2023-09;
  2. Blog post: Quickly build a multi-account operating environment on the cloud with security compliance and a sound architecture, 2022-10;
  3. Blog post: Use Cloud Foundations to achieve overall planning and one-click deployment of two network sharing models of multi-account organization in cloud environments, 2023-02;

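About the authors

Clement Yuan

Consultant in Amazon Web Services Professional Services. Previously worked for many years at Amazon headquarters in Seattle, USA, on the Amazon Relational Database Service (RDS) development team. Has extensive experience in software development and cloud operations. Currently responsible for architecture consulting, solution design, and project implementation covering business continuity and scalability, migration of enterprise applications and databases to the cloud, cloud disaster recovery management, and the Well-Architected Framework.

叶文军

Cloud architecture consultant on the Amazon Web Services Professional Services team. Responsible for cloud architecture design, migration, security, and optimization for enterprise customers, as well as consulting on automated cloud operations, containers, and related areas. Has extensive experience in cloud-native technologies, DevOps, and private cloud deployment and operations.

刘育新

Senior consultant on the Amazon Web Services ProServe team, with long-standing experience in designing cloud adoption solutions for enterprise customers and implementing the related projects.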