AWS Compute Blog
Using dynamic Amazon S3 event handling with Amazon EventBridge
Update Nov 29, 2021 – Amazon S3 can now send event notifications directly to Amazon EventBridge. For more information, read this News Blog post.
A common pattern in serverless applications is to invoke a Lambda function in response to an event from Amazon S3. For example, you could use this pattern for automating document translation, transcribing audio files, or staging data imports. You can configure this integration in many places, including the AWS Management Console, the AWS CLI, or the AWS Serverless Application Model (SAM).
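As a minimal sketch, the SAM version of this direct integration might look like the following (the bucket, function, and handler names here are illustrative, not taken from the example repo):

# Hypothetical SAM fragment: invoke a Lambda function directly from S3 object-created events
UploadBucket:
  Type: AWS::S3::Bucket
DocumentProcessorFunction:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: processor/
    Handler: app.handler
    Runtime: nodejs12.x
    Events:
      FileUpload:
        Type: S3
        Properties:
          Bucket: !Ref UploadBucket # must reference a bucket defined in the same template
          Events: s3:ObjectCreated:*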
If you need to fan out notifications or hold messages in a queue, you can also route S3 events to Amazon SNS or Amazon SQS. These standard notification mechanisms work well for most applications, and are simple to implement. However, for more complex notification patterns, you can use Amazon EventBridge to route events dynamically. This blog post explores advanced use cases and how to implement them in your serverless applications.
To set up the example applications, visit the GitHub repo and follow the instructions in the README.md file. The code uses SAM templates, enabling you to deploy the applications in your own AWS account. This walkthrough creates resources covered in the AWS Free Tier, but you may incur costs if you test with large amounts of data.
Integrating S3 events with Lambda via EventBridge
EventBridge consumes S3 events via AWS CloudTrail. A single trail can log events for one or more S3 buckets, and you can configure which data events are recorded. It’s best practice to store CloudTrail log files in a separate S3 bucket. Once this is configured, EventBridge can receive any event logged in the trail.
The first example in the GitHub repo shows how this can be configured in a SAM template. The application comprises an S3 bucket, a Lambda EventConsumer function, and other required resources. First, the template defines the two buckets:
Resources:
  SourceBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "the-source-bucket" # bucket names must be lowercase and globally unique
  LoggingBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "the-logging-bucket"
Next, an S3 bucket policy grants permissions for CloudTrail to write files to the logging bucket:
BucketPolicy:
  Type: AWS::S3::BucketPolicy
  Properties:
    Bucket:
      Ref: LoggingBucket
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Sid: "AWSCloudTrailAclCheck"
          Effect: "Allow"
          Principal:
            Service: "cloudtrail.amazonaws.com"
          Action: "s3:GetBucketAcl"
          Resource: !Sub 'arn:aws:s3:::${LoggingBucket}'
        - Sid: "AWSCloudTrailWrite"
          Effect: "Allow"
          Principal:
            Service: "cloudtrail.amazonaws.com"
          Action: "s3:PutObject"
          Resource: !Sub 'arn:aws:s3:::${LoggingBucket}/AWSLogs/${AWS::AccountId}/*'
          Condition:
            StringEquals:
              "s3:x-amz-acl": "bucket-owner-full-control"
The template configures the trail and sets the logging bucket. It defines event selectors, which identify the specific events for logging:
myTrail:
  Type: AWS::CloudTrail::Trail
  DependsOn:
    - BucketPolicy
  Properties:
    TrailName: "MyTrailName"
    S3BucketName:
      Ref: LoggingBucket
    IsLogging: true
    IsMultiRegionTrail: false
    EventSelectors:
      - IncludeManagementEvents: false
        DataResources:
          - Type: AWS::S3::Object
            Values:
              # The trailing slash logs data events for all objects in the bucket
              - !Sub 'arn:aws:s3:::${SourceBucket}/'
    IncludeGlobalServiceEvents: false
The SAM template configures a target Lambda function for receiving the events:
EventConsumerFunction:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: eventConsumer/
    Handler: app.handler
    Runtime: nodejs12.x
Finally, it defines a rule that sets the event pattern and targets. It also grants permission to EventBridge to invoke the Lambda function:
EventRule:
  Type: AWS::Events::Rule
  Properties:
    Description: "EventRule"
    State: "ENABLED"
    EventPattern:
      source:
        - "aws.s3"
      detail:
        eventName:
          - "PutObject"
        requestParameters:
          bucketName:
            # Pattern values must be arrays; Ref on a bucket resource returns its name
            - !Ref SourceBucket
    Targets:
      - Arn:
          Fn::GetAtt:
            - "EventConsumerFunction"
            - "Arn"
        Id: "EventConsumerFunctionTarget"

PermissionForEventsToInvokeLambda:
  Type: AWS::Lambda::Permission
  Properties:
    FunctionName:
      Ref: "EventConsumerFunction"
    Action: "lambda:InvokeFunction"
    Principal: "events.amazonaws.com"
    SourceArn:
      Fn::GetAtt:
        - "EventRule"
        - "Arn"
To deploy this application, follow the instructions in the GitHub repo’s README.md file. To test, upload any file to the source bucket. This invokes the Lambda function via the EventBridge event, and logs the event details. Open the CloudWatch Logs console for the deployed Lambda function to view the output.
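The payload the function logs is the CloudTrail record wrapped in an EventBridge envelope. An abridged sketch of its shape, with illustrative values, looks like this:

# Abridged sketch of the event delivered to the function (values are illustrative)
detail-type: "AWS API Call via CloudTrail"
source: "aws.s3"
detail:
  eventSource: "s3.amazonaws.com"
  eventName: "PutObject"
  sourceIPAddress: "203.0.113.10"
  userIdentity:
    principalId: "AIDAEXAMPLE"
  requestParameters:
    bucketName: "the-source-bucket"
    key: "uploads/file.txt"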
The event pattern in this example matches on any PutObject event in the source bucket. You can also match on any attribute, or combination of attributes, in an S3 event. This makes it possible to identify events by source IP address, object size, time range, or principalId (the user causing the event). Because the rule has access to the entire S3 event, you can match events with much more granularity before invoking the target Lambda function.
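For example, a hypothetical pattern like the following would match only PutObject events originating from a specific network range (the CIDR value is illustrative):

# Hypothetical event pattern: match PutObject events from a given source IP range
EventPattern:
  source:
    - "aws.s3"
  detail:
    eventName:
      - "PutObject"
    sourceIPAddress:
      - cidr: "203.0.113.0/24"
    requestParameters:
      bucketName:
        - !Ref SourceBucket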
Consuming events from existing S3 buckets
When deploying S3 and Lambda integrations in SAM templates, you cannot use existing buckets managed outside of the CloudFormation stack. Frequently, it’s useful to deploy serverless applications that integrate with existing S3 buckets. Using the S3-to-EventBridge integration, you can create new applications that receive events from existing buckets.
The second example in the GitHub repo shows how to configure a new application for an existing bucket. This template takes the existing S3 bucket name as a parameter, and generates the CloudTrail trail, EventBridge rule, and required permissions.
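As a sketch, that parameter might be wired in like this, assuming it is named ExistingBucketName (the repo’s template is the authoritative version):

# Hypothetical fragment: accept an existing bucket name as a parameter
Parameters:
  ExistingBucketName:
    Type: String
    Description: Name of an existing S3 bucket to monitor
# The trail's event selector and the rule's event pattern then reference it:
#   Values:
#     - !Sub 'arn:aws:s3:::${ExistingBucketName}/'
#   bucketName:
#     - !Ref ExistingBucketName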
Follow this example’s README.md file to deploy the application. To test, upload any file into the existing S3 bucket you selected. This invokes the eventConsumer logging function deployed in the template.
Invoking a single Lambda function from multiple S3 buckets
Because EventBridge decouples the producers and consumers of events, it is also easier to introduce multiple producers. In the third example, the SAM template creates three buckets that invoke the same EventConsumer Lambda function.
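As a sketch, each bucket might be declared with a shared name parameter and a numeric suffix (illustrative; the repo’s template contains the full definitions):

# Hypothetical sketch of one of the three bucket resources;
# MultiBucketName is a template parameter of type String
Bucket1:
  Type: AWS::S3::Bucket
  Properties:
    BucketName: !Sub '${MultiBucketName}-1'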
The MultiBucketName parameter is used to create the three buckets with a number appended to the name. First, the CloudTrail EventSelector includes the three buckets in the trail:
# The CloudTrail trail
myTrail:
  Type: AWS::CloudTrail::Trail
  DependsOn:
    - BucketPolicy
  Properties:
    TrailName: "myTrail"
    S3BucketName:
      Ref: LoggingBucket
    IsLogging: true
    IsMultiRegionTrail: false
    EventSelectors:
      - IncludeManagementEvents: false
        DataResources:
          - Type: AWS::S3::Object
            Values:
              - !Sub 'arn:aws:s3:::${MultiBucketName}-1/'
              - !Sub 'arn:aws:s3:::${MultiBucketName}-2/'
              - !Sub 'arn:aws:s3:::${MultiBucketName}-3/'
    IncludeGlobalServiceEvents: false
Next, the EventRule includes the three bucket names in the event pattern, so events from any of these buckets can now trigger the rule:
# EventBridge rule - invokes EventConsumerFunction
EventRule:
  Type: AWS::Events::Rule
  Properties:
    Description: "EventRule"
    State: "ENABLED"
    EventPattern:
      source:
        - "aws.s3"
      detail:
        eventName:
          - "PutObject"
        requestParameters:
          bucketName:
            - !Sub '${MultiBucketName}-1'
            - !Sub '${MultiBucketName}-2'
            - !Sub '${MultiBucketName}-3'
It’s also possible to use content-based filtering in event patterns to match dynamically on bucket names. For example, if you have multiple buckets with the prefix myCompanySales, you can create an event pattern to match all of these buckets:
EventPattern:
  source:
    - "aws.s3"
  detail:
    eventName:
      - "PutObject"
    requestParameters:
      bucketName:
        - prefix: "myCompanySales"
This enables your application to consume events from new buckets created after the application is deployed. With content-based filtering, you can create search patterns that allow greater flexibility in matching events.
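For example, a hypothetical pattern could combine a bucket-name prefix with an object key prefix, so that only uploads under a particular key path match (both prefix values are illustrative):

# Hypothetical pattern: match uploads under an 'invoices/' key prefix
# in any bucket whose name starts with myCompanySales
EventPattern:
  source:
    - "aws.s3"
  detail:
    eventName:
      - "PutObject"
    requestParameters:
      bucketName:
        - prefix: "myCompanySales"
      key:
        - prefix: "invoices/"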
Multiple buckets with multiple Lambda functions
In the standard S3 and Lambda integration, each event notification on a bucket must use a distinct prefix and suffix pattern. This means you cannot create two triggers for PutObject events that match the same file type or prefix. When you need to invoke multiple functions for the same or overlapping prefixes or suffixes, the EventBridge integration can handle this.
EventBridge allows up to five targets per rule, so you can specify up to five separate Lambda functions to receive the event. All five functions are invoked in parallel when the event pattern matches. To use this, add the targets in the rule; no change to the event pattern is required.
In the fourth example, the SAM template configures three buckets and three Lambda functions, all subscribing to the same event pattern.
The key change to the template is in the EventRule, where more than one target is now defined:
Targets:
  - Arn:
      Fn::GetAtt:
        - "EventConsumerFunction1"
        - "Arn"
    Id: "EventConsumerFunctionTarget1"
  - Arn:
      Fn::GetAtt:
        - "EventConsumerFunction2"
        - "Arn"
    Id: "EventConsumerFunctionTarget2"
  - Arn:
      Fn::GetAtt:
        - "EventConsumerFunction3"
        - "Arn"
    Id: "EventConsumerFunctionTarget3"
This approach enables more complex routing of S3 events to Lambda targets. It accepts events from multiple S3 buckets with overlapping prefixes and suffixes in object names, and routes those events to multiple Lambda functions simultaneously.
Conclusion
The standard S3 to Lambda integration enables developers to deploy code that responds to bucket- or object-based events. You can also use SNS or SQS as targets for fanning out or buffering messages from S3. Using Amazon EventBridge, you can employ even more sophisticated routing and filtering of events between S3 and Lambda.
In this blog post, I show how to deploy a basic integration using a SAM template with a single bucket and single Lambda function. I cover how to use existing S3 buckets in your new application deployments, and use EventBridge content filtering in rules to dynamically match bucket events.
Finally, I show how EventBridge completely decouples producers and consumers in complex serverless applications, making it easy to route events from multiple S3 buckets to multiple Lambda functions. When combined with attribute matching across the entire S3 event object, this allows much more granularity in identifying events before invoking Lambda functions.
To learn more about using decoupled, event-driven architectures in your serverless applications, visit the Amazon EventBridge Learning Path.