AWS Security Blog

Trimming AWS WAF logs with Amazon Kinesis Firehose transformations

In an earlier post, Enabling serverless security analytics using AWS WAF full logs, Amazon Athena, and Amazon QuickSight, published on March 28, 2019, the authors showed you how to stream WAF logs with Amazon Kinesis Firehose for visualization using QuickSight. This approach used no filtering of the logs so that you could visualize the full data set. However, you are often only interested in seeing specific events. Or you might be looking to minimize log size to save storage costs. In this post, I show you how to apply rules in Amazon Kinesis Firehose to trim down logs. You can then apply the same visualizations you used in the previous solution.

AWS WAF is a web application firewall that supports full logging of all the web requests it inspects. For each request, AWS WAF logs the raw HTTP/S headers along with information on which AWS WAF rules were triggered. Having complete logs is useful for compliance, auditing, forensics, and troubleshooting custom and Managed Rules for AWS WAF. However, for some use cases, you might not want to log all of the requests inspected by AWS WAF. For example, to reduce the volume of logs, you might only want to log the requests blocked by AWS WAF, or you might want to remove certain HTTP header or query string parameter values from your logs. Unblocked requests are often already stored in your CloudFront access logs or web server logs, so logging them again through AWS WAF can produce redundant data. Logging blocked traffic, on the other hand, can help you identify bad actors or find the root cause of false positives.

In this post, I’ll show you how to create an Amazon Kinesis Data Firehose delivery stream that filters out unneeded records, so that you only retain log records for requests that were blocked by AWS WAF. From there, the logs can be stored in Amazon S3 or directed to security information and event management (SIEM) and log analysis tools.

To simplify things, I’ll provide you with a CloudFormation template that will create the resources highlighted in the diagram below:
 

Figure 1: Solution architecture

  1. A Kinesis Data Firehose delivery stream is used to receive log records from AWS WAF.
  2. An IAM role for the Kinesis Data Firehose delivery stream, with permissions needed to invoke Lambda and write to S3.
  3. A Lambda function used to filter out WAF records matching the default action before the records are written to S3.
  4. An IAM role for the Lambda function, with the permissions needed to create CloudWatch logs (for troubleshooting).
  5. An S3 bucket where the WAF logs will be stored.

Prerequisites and assumptions

  • In this post, I assume that the AWS WAF default action is configured to allow requests that don’t explicitly match a blocking WAF rule. So I’ll show you how to omit any records matching the WAF default action.
  • You need to already have an AWS WAF WebACL created. In this example, you’ll use a WebACL generated from the AWS WAF OWASP 10 template. For more information on deploying AWS WAF to a CloudFront or ALB resource, see the Getting Started page.
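
To make the first assumption concrete, here is a minimal sketch of how a record that matched the default action could be recognized. It assumes each AWS WAF log record is a JSON object whose terminatingRuleId field is set to Default_Action when no rule terminated the request. The two sample records are abbreviated and their values are made up for illustration, and isDefaultAction is a hypothetical helper name, not part of the template.

```javascript
'use strict';

// Abbreviated, made-up sample records (real AWS WAF records carry more fields).
const allowedByDefault = {
    terminatingRuleId: 'Default_Action',
    action: 'ALLOW',
    httpRequest: { clientIp: '192.0.2.10', uri: '/index.html' },
};
const blockedByRule = {
    terminatingRuleId: 'example-rule-id', // hypothetical rule ID
    action: 'BLOCK',
    httpRequest: { clientIp: '192.0.2.20', uri: '/admin' },
};

// Hypothetical helper: true when the request fell through to the default action.
function isDefaultAction(logRecord) {
    return logRecord.terminatingRuleId === 'Default_Action';
}

console.log(isDefaultAction(allowedByDefault)); // true
console.log(isDefaultAction(blockedByRule));    // false
```

If your WebACL uses a default action of block instead, the same check would identify blocked rather than allowed requests, so the filtering logic in this post would need to be inverted.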

Step 1: Create a Kinesis Data Firehose delivery stream for AWS WAF logs

In this step, you’ll use the following CloudFormation template to create a Kinesis Data Firehose delivery stream that writes logs to an S3 bucket. The template also creates a Lambda function that omits AWS WAF records matching the default action.

Here’s how to launch the template:

  1. Open CloudFormation in the AWS console.
  2. For WAF deployments on Amazon CloudFront, select region US-EAST-1. Otherwise, create the stack in the same region in which your AWS WAF Web ACL is deployed.
  3. Select the Create Stack button.
  4. In the CloudFormation wizard, select Specify an Amazon S3 template URL and copy and paste the following URL into the text box, then select Next:
    https://s3.amazonaws.com/aws-security-blog-content/public/sample/TrimAWSWAFLogs/KinesisWAFDeliveryStream.yml
  5. On the options page, leave the default values and select Next.
  6. Specify the following and then select Next:
    1. Stack name: (for example, kinesis-waf-logging). Make sure to note your stack name, as you’ll need to provide it later in the walkthrough.
    2. Buffer size: This value specifies the size in MB for which Kinesis will buffer incoming records before processing.
    3. Buffer interval: This value specifies the interval in seconds for which Kinesis will buffer incoming records before processing.

    Note: Kinesis will trigger data delivery based on whichever buffer condition is satisfied first. This CloudFormation template sets the default buffer size to 3MB and the default buffer interval to 900 seconds to match the maximum transformation buffer size and interval, which are also set by this template. To learn more about Kinesis Data Firehose buffer conditions, read this documentation.

     

    Figure 2: Specify the stack name, buffer size, and buffer interval

  7. Select the check box for I acknowledge that AWS CloudFormation might create IAM resources and choose Create.
  8. Wait for the template to finish creating the resources. This will take a few minutes. On the CloudFormation dashboard, the status next to your stack should say CREATE_COMPLETE.
  9. From the AWS Management Console, open Amazon Kinesis and find the Data Firehose delivery stream on the dashboard. Note that the name of the stream will start with aws-waf-logs- and end with the name of the CloudFormation stack. This prefix is required in order to configure AWS WAF to write logs to the Kinesis Data Firehose delivery stream.
  10. From the AWS Management Console, open AWS Lambda and view the Lambda function created from the CloudFormation template. The function name should start with the Stack name from the CloudFormation template. I included the function code generated from the CloudFormation template below so you can see what’s going on.

    Note: Through CloudFormation, the code is deployed without indentation. To format it for readability, I recommend using the code formatter built into Lambda under the edit tab. This code can easily be modified for custom record filtering or transformations.

    
        'use strict';

        exports.handler = (event, context, callback) => {
            /* Process the list of records and drop those containing Default_Action */
            const output = event.records.map((record) => {
                /* Firehose passes each record as base64; Buffer.from replaces the deprecated new Buffer() */
                const entry = Buffer.from(record.data, 'base64').toString('utf8');
                if (!entry.match(/Default_Action/g)){
                    /* Keep records that were terminated by a WAF rule */
                    return {
                        recordId: record.recordId,
                        result: 'Ok',
                        data: record.data,
                    };
                } else {
                    /* Drop records that matched the WAF default action */
                    return {
                        recordId: record.recordId,
                        result: 'Dropped',
                        data: record.data,
                    };
                }
            });

            console.log(`Processing completed. Processed records ${output.length}.`);
            callback(null, { records: output });
        };
        

You now have a Kinesis Data Firehose stream that AWS WAF can use for logging records.
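
As an illustration of the kind of custom filtering or transformation mentioned in the note above, the following is a minimal sketch (not the code deployed by the template) that parses each record as JSON, drops records matching the default action, and masks selected header values before the record is stored. It assumes that terminatingRuleId is set to Default_Action for default-action records and that request headers appear under httpRequest.headers as name/value pairs. The list of headers to redact, the REDACTED placeholder, and the transformRecord helper are choices made for this example.

```javascript
'use strict';

// Header names whose values should not be stored (an assumption for this example).
const REDACTED_HEADERS = new Set(['authorization', 'cookie']);

// Hypothetical helper: transform a single Firehose record.
function transformRecord(record) {
    let entry;
    try {
        entry = JSON.parse(Buffer.from(record.data, 'base64').toString('utf8'));
    } catch (err) {
        // Unparseable input is reported back to Firehose as a failure.
        return { recordId: record.recordId, result: 'ProcessingFailed', data: record.data };
    }
    if (entry.terminatingRuleId === 'Default_Action') {
        return { recordId: record.recordId, result: 'Dropped', data: record.data };
    }
    const headers = (entry.httpRequest && entry.httpRequest.headers) || [];
    for (const header of headers) {
        if (REDACTED_HEADERS.has(String(header.name).toLowerCase())) {
            header.value = 'REDACTED';
        }
    }
    // Re-serialize with a trailing newline so stored records stay one per line.
    const data = Buffer.from(JSON.stringify(entry) + '\n', 'utf8').toString('base64');
    return { recordId: record.recordId, result: 'Ok', data };
}

// In Lambda, this function would be assigned to exports.handler.
const handler = async (event) => ({ records: event.records.map(transformRecord) });
```

Parsing the record costs a little more per invocation than the string match used by the template, but it lets you filter on a specific field rather than on any occurrence of the text Default_Action in the record.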

Cost considerations

This template sets the Kinesis transformation buffer size to 3MB and buffer interval to 900 seconds (the maximum values) in order to reduce the number of Lambda invocations used to process records. On average, an AWS WAF record is approximately 1-1.5KB. With a buffer size of 3MB, Kinesis will use 1 Lambda invocation per 2000-3000 records. Visit the AWS Lambda website to learn more about pricing.
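
The arithmetic behind that estimate can be sketched as follows. The figures come from the paragraph above. Decimal units (1 MB = 1000 KB) are assumed to keep the numbers round, and the sketch also assumes that every buffer fills completely before the 900-second interval elapses. Both helper names are hypothetical.

```javascript
'use strict';

// Transformation buffer size from the template: 3 MB, in decimal KB (assumption).
const BUFFER_SIZE_KB = 3 * 1000;

// Hypothetical helper: how many records fit into one transformation buffer.
function recordsPerInvocation(avgRecordSizeKb) {
    return Math.floor(BUFFER_SIZE_KB / avgRecordSizeKb);
}

// Hypothetical helper: Lambda invocations needed for a given number of records,
// assuming each buffer is full when it is handed to Lambda.
function invocationsFor(totalRecords, avgRecordSizeKb) {
    return Math.ceil(totalRecords / recordsPerInvocation(avgRecordSizeKb));
}

console.log(recordsPerInvocation(1.5));      // 2000
console.log(recordsPerInvocation(1));        // 3000
console.log(invocationsFor(1000000, 1.5));   // 500
```

On low-traffic sites the interval will usually expire before the buffer fills, so the real number of invocations is bounded below by one per 900 seconds of activity rather than by record volume.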

Step 2: Configure AWS WAF Logging

Now that you have an active Amazon Kinesis Firehose delivery stream, you can configure your AWS WAF WebACL to turn on logging.

  1. From the AWS Management Console, open WAF & Shield.
  2. Select the WebACL for which you would like to enable logging.
  3. Select the Logging tab.
  4. Select the Enable Logging button.
  5. Next to Amazon Kinesis Data Firehose, select the stream that was created from the CloudFormation template in Step 1 (for example, aws-waf-logs-kinesis-waf-logging) and select Create.

Congratulations! Your AWS WAF WebACL is now configured to send records of requests inspected by AWS WAF to Kinesis Data Firehose. From there, records that match the default action will be dropped, and the remaining records will be stored in S3 in JSON format.
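
If you want to check this behavior before sending real traffic, one option is to run the filtering rule locally against a synthetic event. The sketch below restates the rule from the Lambda function in Step 1 as a plain function and feeds it two made-up, abbreviated records wrapped the way Kinesis Data Firehose passes them to Lambda (a recordId plus base64-encoded data). The helper names and record values are hypothetical.

```javascript
'use strict';

// Same filtering rule as the Lambda function in Step 1, written as a plain function.
function filterRecords(event) {
    return event.records.map((record) => {
        const entry = Buffer.from(record.data, 'base64').toString('utf8');
        const result = entry.includes('Default_Action') ? 'Dropped' : 'Ok';
        return { recordId: record.recordId, result, data: record.data };
    });
}

// Hypothetical helper: wrap a log entry as a Firehose transformation record.
function toFirehoseRecord(recordId, logEntry) {
    const json = JSON.stringify(logEntry) + '\n';
    return { recordId, data: Buffer.from(json, 'utf8').toString('base64') };
}

// Made-up, abbreviated log entries: one allowed by default, one blocked by a rule.
const event = {
    records: [
        toFirehoseRecord('1', { terminatingRuleId: 'Default_Action', action: 'ALLOW' }),
        toFirehoseRecord('2', { terminatingRuleId: 'example-rule-id', action: 'BLOCK' }),
    ],
};

const results = filterRecords(event);
console.log(results.map((r) => `${r.recordId}: ${r.result}`).join(', '));
// 1: Dropped, 2: Ok
```

Only records returned with a result of Ok are delivered to S3, which is why the sample logs that follow contain blocked requests only.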

Below is a sample of the logs generated from this example. Notice that there are only blocked records in the logs.
 

Figure 3: Sample logs

Conclusion

In this post, I’ve provided you with a CloudFormation template to generate a Kinesis Data Firehose delivery stream that can be used to log requests blocked by AWS WAF, omitting requests matching the default action. Omitting records that match the default action reduces the number of log records that must be reviewed to identify bad actors, tune new WAF rules, and find the root cause of false positives. For unblocked traffic, consider using CloudFront’s access logs with Amazon Athena or CloudWatch Logs Insights to query and analyze the data. To learn more about AWS WAF logs, read our developer guide for AWS WAF.

If you have feedback about this blog post, please submit comments in the Comments section below. If you have issues with AWS WAF, start a thread on the AWS WAF forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Tino Tran

Tino is a Senior Edge Specialized Solutions Architect based out of Florida. His main focus is to help companies deliver online content in a secure, reliable, and fast way using AWS Edge Services. He is an experienced technologist with a background in software engineering, content delivery networks, and security.