AWS DevOps & Developer Productivity Blog
Fine-grained Continuous Delivery With CodePipeline and AWS Step Functions
Automating your software release process is an important step in adopting DevOps best practices. AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline was modeled after the way that the retail website Amazon.com automated software releases, and many early decisions for CodePipeline were based on the lessons learned from operating a web application at that scale.
However, while most cross-cutting best practices apply to most releases, some requirements are specific to a business and are driven by its domain or regulations. CodePipeline attempts to strike a balance between enforcing best practices out of the box and offering enough flexibility to cover as many use cases as possible.
To support use cases that require fine-grained customization, today we are launching a new AWS CodePipeline action type for starting an AWS Step Functions state machine execution. Previously, accomplishing such a workflow required you to create custom integrations that marshaled data between CodePipeline and Step Functions. Now you can start either a Standard or Express Step Functions state machine during the execution of a pipeline.
With this integration, you can do the following:
· Conditionally run an Amazon SageMaker hyper-parameter tuning job
· Write and read values from Amazon DynamoDB, as an atomic transaction, to use in later stages of the pipeline
· Run an Amazon Elastic Container Service (Amazon ECS) task until some arbitrary condition is satisfied, such as performing integration or load testing
Example Application Overview
In the following use case, you’re working on a machine learning application. This application contains both a machine learning model that your research team maintains and an inference engine that your engineering team maintains. When a new version of either the model or the engine is released, you want to release it as quickly as possible if the latency is reduced and the accuracy improves. If the latency becomes too high, you want the engineering team to review the results and decide on the approval status. If the accuracy drops below some threshold, you want the research team to review the results and decide on the approval status.
This example assumes that you already have a pipeline in CodePipeline that uses an AWS CodeCommit repository as its source and runs an AWS CodeBuild project in the build stage.
The following diagram illustrates the components built in this post and how they connect to existing infrastructure.
First, create a Lambda function that uses Amazon Simple Email Service (Amazon SES) to email either the research or engineering team with the results and an opportunity to review them. See the following code:
import json
import os
import boto3
import base64

def lambda_handler(event, context):
    email_contents = """
    <html>
    <body>
    <p><a href="{url_base}/{token}/success">PASS</a></p>
    <p><a href="{url_base}/{token}/fail">FAIL</a></p>
    </body>
    </html>
    """
    callback_base = os.environ['URL']
    token = base64.b64encode(bytes(event["token"], "utf-8")).decode("utf-8")
    formatted_email = email_contents.format(url_base=callback_base, token=token)
    ses_client = boto3.client('ses')
    ses_client.send_email(
        Source='no-reply@example.com',
        Destination={
            'ToAddresses': [event["team_alias"]]
        },
        Message={
            'Subject': {
                'Data': 'PLEASE REVIEW',
                'Charset': 'UTF-8'
            },
            'Body': {
                'Text': {
                    'Data': formatted_email,
                    'Charset': 'UTF-8'
                },
                'Html': {
                    'Data': formatted_email,
                    'Charset': 'UTF-8'
                }
            }
        },
        ReplyToAddresses=[
            'no-reply+delivery@example.com',
        ]
    )
    return {}
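To see how the task token travels through the email links, the following minimal sketch reproduces the encoding step from the function above and the matching decode step. The helper names, the sample token, and the example URL are assumptions made for illustration; a real token is supplied by Step Functions.

```python
import base64


def encode_token(task_token: str) -> str:
    """Base64-encode a task token the same way the Lambda function does."""
    return base64.b64encode(task_token.encode("utf-8")).decode("utf-8")


def decode_token(encoded: str) -> str:
    """Reverse the encoding to recover the original task token."""
    return base64.b64decode(encoded).decode("utf-8")


def build_links(url_base: str, task_token: str) -> dict:
    """Return the PASS and FAIL callback URLs that end up in the email body."""
    encoded = encode_token(task_token)
    return {
        "pass": f"{url_base}/{encoded}/success",
        "fail": f"{url_base}/{encoded}/fail",
    }


# Hypothetical values for illustration only.
links = build_links(
    "https://example.execute-api.us-west-2.amazonaws.com/Prod",
    "sample-task-token",
)
print(links["pass"])
print(links["fail"])
```

Standard Base64 output can contain the + and / characters, which is worth keeping in mind whenever an encoded value is placed in a URL path.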
To set up the Step Functions state machine to orchestrate the approval, use AWS CloudFormation with the following template. The template expects the Lambda function you just created to be saved as app.py in the email_sender directory. See the following code:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  NotifierFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: email_sender/
      Handler: app.lambda_handler
      Runtime: python3.7
      Timeout: 30
      Environment:
        Variables:
          URL: !Sub "https://${TaskTokenApi}.execute-api.${AWS::Region}.amazonaws.com/Prod"
      Policies:
        - Statement:
            - Sid: SendEmail
              Effect: Allow
              Action:
                - ses:SendEmail
              Resource: '*'
  MyStepFunctionsStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      RoleArn: !GetAtt SFnRole.Arn
      DefinitionString: !Sub |
        {
          "Comment": "Routes load-test results to the research or engineering team for manual approval",
          "StartAt": "ChoiceState",
          "States": {
            "ChoiceState": {
              "Type": "Choice",
              "Choices": [
                {
                  "Variable": "$.accuracypct",
                  "NumericLessThan": 96,
                  "Next": "ResearchApproval"
                },
                {
                  "Variable": "$.latencyMs",
                  "NumericGreaterThan": 80,
                  "Next": "EngineeringApproval"
                }
              ],
              "Default": "SuccessState"
            },
            "EngineeringApproval": {
              "Type":"Task",
              "Resource":"arn:aws:states:::lambda:invoke.waitForTaskToken",
              "Parameters":{
                "FunctionName":"${NotifierFunction.Arn}",
                "Payload":{
                  "latency.$":"$.latencyMs",
                  "team_alias":"engineering@example.com",
                  "token.$":"$$.Task.Token"
                }
              },
              "Catch": [ {
                "ErrorEquals": ["HandledError"],
                "Next": "FailState"
              } ],
              "Next": "SuccessState"
            },
            "ResearchApproval": {
              "Type":"Task",
              "Resource":"arn:aws:states:::lambda:invoke.waitForTaskToken",
              "Parameters":{
                "FunctionName":"${NotifierFunction.Arn}",
                "Payload":{
                  "accuracy.$":"$.accuracypct",
                  "team_alias":"research@example.com",
                  "token.$":"$$.Task.Token"
                }
              },
              "Catch": [ {
                "ErrorEquals": ["HandledError"],
                "Next": "FailState"
              } ],
              "Next": "SuccessState"
            },
            "FailState": {
              "Type": "Fail",
              "Cause": "Invalid response.",
              "Error": "Failed Approval"
            },
            "SuccessState": {
              "Type": "Succeed"
            }
          }
        }
  TaskTokenApi:
    Type: AWS::ApiGateway::RestApi
    Properties:
      Description: String
      Name: TokenHandler
  SuccessResource:
    Type: AWS::ApiGateway::Resource
    Properties:
      ParentId: !Ref TokenResource
      PathPart: "success"
      RestApiId: !Ref TaskTokenApi
  FailResource:
    Type: AWS::ApiGateway::Resource
    Properties:
      ParentId: !Ref TokenResource
      PathPart: "fail"
      RestApiId: !Ref TaskTokenApi
  TokenResource:
    Type: AWS::ApiGateway::Resource
    Properties:
      ParentId: !GetAtt TaskTokenApi.RootResourceId
      PathPart: "{token}"
      RestApiId: !Ref TaskTokenApi
  SuccessMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      HttpMethod: GET
      ResourceId: !Ref SuccessResource
      RestApiId: !Ref TaskTokenApi
      AuthorizationType: NONE
      MethodResponses:
        - ResponseParameters:
            method.response.header.Access-Control-Allow-Origin: true
          StatusCode: 200
      Integration:
        IntegrationHttpMethod: POST
        Type: AWS
        Credentials: !GetAtt APIGWRole.Arn
        Uri: !Sub "arn:aws:apigateway:${AWS::Region}:states:action/SendTaskSuccess"
        IntegrationResponses:
          - StatusCode: 200
            ResponseTemplates:
              application/json: |
                {}
          - StatusCode: 400
            ResponseTemplates:
              application/json: |
                {"uhoh": "Spaghetti O's"}
        RequestTemplates:
          application/json: |
            #set($token=$input.params('token'))
            {
              "taskToken": "$util.base64Decode($token)",
              "output": "{}"
            }
        PassthroughBehavior: NEVER
        IntegrationResponses:
          - StatusCode: 200
      OperationName: "TokenResponseSuccess"
  FailMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      HttpMethod: GET
      ResourceId: !Ref FailResource
      RestApiId: !Ref TaskTokenApi
      AuthorizationType: NONE
      MethodResponses:
        - ResponseParameters:
            method.response.header.Access-Control-Allow-Origin: true
          StatusCode: 200
      Integration:
        IntegrationHttpMethod: POST
        Type: AWS
        Credentials: !GetAtt APIGWRole.Arn
        Uri: !Sub "arn:aws:apigateway:${AWS::Region}:states:action/SendTaskFailure"
        IntegrationResponses:
          - StatusCode: 200
            ResponseTemplates:
              application/json: |
                {}
          - StatusCode: 400
            ResponseTemplates:
              application/json: |
                {"uhoh": "Spaghetti O's"}
        RequestTemplates:
          application/json: |
            #set($token=$input.params('token'))
            {
              "cause": "Failed Manual Approval",
              "error": "HandledError",
              "output": "{}",
              "taskToken": "$util.base64Decode($token)"
            }
        PassthroughBehavior: NEVER
        IntegrationResponses:
          - StatusCode: 200
      OperationName: "TokenResponseFail"
  APIDeployment:
    Type: AWS::ApiGateway::Deployment
    DependsOn:
      - FailMethod
      - SuccessMethod
    Properties:
      Description: "Prod Stage"
      RestApiId:
        Ref: TaskTokenApi
      StageName: Prod
  APIGWRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service:
                - "apigateway.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - 'states:SendTaskSuccess'
                  - 'states:SendTaskFailure'
                Resource: '*'
  SFnRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service:
                - "states.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - 'lambda:InvokeFunction'
                Resource: !GetAtt NotifierFunction.Arn
After you create the CloudFormation stack, you have a state machine, an Amazon API Gateway REST API, a Lambda function, and the roles each resource needs.
Your pipeline invokes the state machine with the load test results, which contain the accuracy and latency statistics. The state machine decides which team, if either, to notify of the results. If the results are positive, it returns a success status without notifying either team. If a team needs to be notified, the state machine invokes the Lambda function, passing in the relevant metric, the team’s email address, and a task token, and then pauses until that token is returned. The Lambda function renders an email with Pass and Fail links, and the team chooses one of them to respond to the review. The REST API captures the response and sends it to Step Functions so that the state machine execution can continue.
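The routing decision can be sketched in a few lines of Python. This is only an illustration of the Choice state in the template, not code that runs anywhere in the pipeline; the function and constant names are assumptions, while the thresholds and state names are taken from the template.

```python
ACCURACY_THRESHOLD_PCT = 96  # NumericLessThan rule on $.accuracypct
LATENCY_THRESHOLD_MS = 80    # NumericGreaterThan rule on $.latencyMs


def next_state(results: dict) -> str:
    """Mirror the Choice state: rules are checked in order and the first match wins."""
    if results["accuracypct"] < ACCURACY_THRESHOLD_PCT:
        return "ResearchApproval"
    if results["latencyMs"] > LATENCY_THRESHOLD_MS:
        return "EngineeringApproval"
    return "SuccessState"


print(next_state({"accuracypct": 100, "latencyMs": 225}))  # prints EngineeringApproval
```

Because the accuracy rule comes first, a result that misses both thresholds is routed to the research team only.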
The following diagram illustrates the visual workflow of the approval process within the Step Functions state machine.
After you create your state machine, Lambda function, and REST API, return to the CodePipeline console and add the Step Functions integration to your existing release pipeline. Complete the following steps:
- On the CodePipeline console, choose Pipelines.
- Choose your release pipeline.
- Choose Edit.
- Under the Edit:Build section, choose Add stage.
- Name your stage Release-Approval.
- Choose Save.
  You return to the edit view and can see the new stage at the end of your pipeline.
- In the Edit:Release-Approval section, choose Add action group.
- Add the Step Functions StateMachine invocation Action to the action group. Use the following settings:
  - For Action name, enter CheckForRequiredApprovals.
  - For Action provider, choose AWS Step Functions.
  - For Region, choose the Region where your state machine is located (this post uses US West (Oregon)).
  - For Input artifacts, enter BuildOutput (the name you gave the output artifacts in the build stage).
  - For State machine ARN, choose the state machine you just created.
  - For Input type, choose File path. (This parameter tells CodePipeline to take the contents of a file and use it as the input for the state machine execution.)
  - For Input, enter results.json (where you store the results of your load test in the build stage of the pipeline).
  - For Variable namespace, enter StepFunctions. (This parameter tells CodePipeline to store the state machine ARN and execution ARN for this event in a variable namespace named StepFunctions.)
  - For Output artifacts, enter ApprovalArtifacts. (This parameter tells CodePipeline to store the results of this execution in an artifact called ApprovalArtifacts.)
- Choose Done.
  You return to the edit view of the pipeline.
- Choose Save.
- Choose Release change.
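The console settings above correspond roughly to an action declaration in the pipeline's JSON structure. The following sketch builds that declaration as a Python dictionary. The state machine ARN is a placeholder, the function name is hypothetical, and the field layout reflects our understanding of the CodePipeline pipeline structure rather than output copied from a real pipeline.

```python
import json


def approval_action(state_machine_arn: str) -> dict:
    """Build an action declaration that matches the console settings in the steps above."""
    return {
        "name": "CheckForRequiredApprovals",
        "actionTypeId": {
            "category": "Invoke",
            "owner": "AWS",
            "provider": "StepFunctions",
            "version": "1",
        },
        "region": "us-west-2",  # US West (Oregon), as used in this post
        "namespace": "StepFunctions",
        "inputArtifacts": [{"name": "BuildOutput"}],
        "outputArtifacts": [{"name": "ApprovalArtifacts"}],
        "configuration": {
            "StateMachineArn": state_machine_arn,
            "InputType": "FilePath",  # read the input from a file in the artifact
            "Input": "results.json",
        },
    }


# Placeholder ARN for illustration only.
action = approval_action(
    "arn:aws:states:us-west-2:111122223333:stateMachine:ExampleStateMachine"
)
print(json.dumps(action, indent=2))
```

Keeping the declaration in code like this can make it easier to review the stage alongside the rest of your pipeline definition.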
When the pipeline execution reaches the approval stage, it invokes the Step Functions state machine with the results emitted from your build stage. This post hard-codes the load-test results to force an engineering approval by increasing the latency (latencyMs) above the threshold defined in the CloudFormation template (80 ms). See the following code:
{
  "accuracypct": 100,
  "latencyMs": 225
}
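How results.json gets produced is up to your build stage. As a minimal sketch, assuming a hypothetical load-test step that already has the two metrics in hand, a script in the build could write the file as follows. The function name and the idea of calling it from the build are assumptions, not part of the pipeline described in this post.

```python
import json
from pathlib import Path


def write_results(accuracy_pct: float, latency_ms: float, path: str = "results.json") -> dict:
    """Write load-test metrics under the keys that the state machine's Choice state reads."""
    results = {"accuracypct": accuracy_pct, "latencyMs": latency_ms}
    Path(path).write_text(json.dumps(results, indent=2))
    return results


# Same hard-coded values as the example above.
written = write_results(100, 225)
print(written)
```

The file has to be included in the BuildOutput artifact so that the Step Functions action can find it at the configured file path.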
When the state machine checks the latency and sees that it’s above 80 milliseconds, it invokes the Lambda function with the engineering email address. The engineering team receives a review request email similar to the following screenshot.
If you choose PASS, your browser sends a request to the API Gateway REST API that carries the Step Functions task token for the current execution. The API passes the token to Step Functions with the SendTaskSuccess command. When you return to your pipeline, you can see that the approval was processed and your change is ready for production.
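The two API Gateway methods translate a link click into a Step Functions API request. The sketch below builds comparable request bodies in Python so the mapping is easier to follow. The function name and sample token are hypothetical, and the error and cause values are taken from the request templates in the CloudFormation template.

```python
import base64
import json


def callback_request(encoded_token: str, approved: bool) -> tuple:
    """Return the (action, body) pair that corresponds to a PASS or FAIL link click."""
    task_token = base64.b64decode(encoded_token).decode("utf-8")
    if approved:
        return "SendTaskSuccess", {"taskToken": task_token, "output": "{}"}
    return "SendTaskFailure", {
        "taskToken": task_token,
        "error": "HandledError",  # matched by the Catch blocks in the state machine
        "cause": "Failed Manual Approval",
    }


# Made-up token for illustration; a real one comes from the email link.
encoded = base64.b64encode(b"sample-task-token").decode("utf-8")
action, body = callback_request(encoded, approved=False)
print(action, json.dumps(body))
```

If you wanted to respond from a script instead of the email links, the boto3 stepfunctions client exposes send_task_success and send_task_failure for the same two operations.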
Cleaning Up
When the engineering and research teams devise a solution that no longer mixes performance information from both teams into a single application, you can remove this integration by deleting the CloudFormation stack that you created and deleting the new CodePipeline stage that you added.
Conclusion
For more information about CodePipeline Actions and the Step Functions integration, see Working with Actions in CodePipeline.