AWS Cloud Operations Blog

Optimize Log Collection with Amazon CloudWatch Agent Log Filter Expressions

The Amazon CloudWatch agent is a software package that autonomously and continuously runs on your servers. You can install and configure the CloudWatch agent to collect system and application logs from Amazon Elastic Compute Cloud (EC2), on-premises hosts, and containerized applications. The logs collected by the CloudWatch agent are processed and stored in Amazon CloudWatch, which further helps with the performance and health monitoring of your infrastructure and applications.

Previously, the CloudWatch agent pushed all of the log lines from the configured log file to CloudWatch. Although this is ideal for many use cases, customers wanted a way to filter log lines at the host: to include only specific log lines, to exclude unwanted log lines, or to do both.

In this post, we provide a brief overview of use cases for CloudWatch agent log filter expressions and walk through how to install and begin configuring the agent.

Overview

CloudWatch agent has added support for configurable log filter expressions. This new configuration option is intended for users who want to collect only log events that meet specified criteria.

Using this new functionality of the CloudWatch agent, you can specify “include” and “exclude” regular expressions for each log stream in the agent configuration file. The agent then evaluates each log event against the expressions to determine whether the log event should be sent to CloudWatch. Log events not sent to CloudWatch are discarded by the agent. Log filters help you manage your log ingestion, either by sending only the log events that meet the specified criteria, such as those that contain error codes, or by eliminating verbose log events.
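The evaluation described above can be sketched in a few lines of Python. This is only an illustration of the include/exclude semantics, not the agent's implementation: the function name `passes_filters`, the filter values, and the sample log lines are all made up for this example, and Python's `re` module stands in for the agent's own regular expression engine, whose syntax may differ.

```python
import re

def passes_filters(line, filters):
    """Return True if `line` passes every filter, checked in order.

    An "include" filter requires a match; an "exclude" filter requires
    that there is no match. The first failing filter drops the line.
    """
    for f in filters:
        matched = re.search(f["expression"], line) is not None
        if f["type"] == "include" and not matched:
            return False
        if f["type"] == "exclude" and matched:
            return False
    return True

# Hypothetical filters and log lines, for illustration only.
filters = [
    {"type": "include", "expression": "ERROR"},
    {"type": "exclude", "expression": "healthcheck"},
]
lines = [
    "INFO request served",
    "ERROR database timeout",
    "ERROR healthcheck failed",
]
kept = [line for line in lines if passes_filters(line, filters)]
print(kept)
```

Only the second line is kept: the first does not match the include expression, and the third matches the exclude expression.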

Installing CloudWatch Agent

The agent can be installed on Linux, Windows, and other supported operating systems by downloading the agent package from Amazon Simple Storage Service (Amazon S3), by using AWS Systems Manager or AWS CloudFormation, or by installing it manually using the command line.

With the agent in the Amazon Linux 2 repository, the agent package can be installed on Linux hosts in a single step using the yum package manager:

$ sudo yum install amazon-cloudwatch-agent

On Windows Server systems, installation can be conducted by downloading the agent installer from Amazon S3 and running the installer:

PS C:\> Invoke-WebRequest -Uri https://s3.amazonaws.com/amazoncloudwatch-agent/windows/amd64/latest/amazon-cloudwatch-agent.msi -OutFile amazon-cloudwatch-agent.msi
PS C:\> .\amazon-cloudwatch-agent.msi

Access to AWS resources requires permissions. Before starting the CloudWatch agent, you must grant the permissions that the agent needs to write metrics and logs to CloudWatch. Step-by-step instructions for installing the agent and granting permissions can be found in the Amazon CloudWatch User Guide.

Getting started with configuring the agent

The agent configuration file is a JSON file that specifies the metrics and logs that the agent must collect. You can create it by using the wizard or by creating it yourself from scratch. You could also use the wizard to initially create the configuration file, and then modify it manually.

For this post, create or edit the CloudWatch agent configuration file manually. For simplicity in troubleshooting, we recommend that you name it /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json on a Linux server, and $Env:ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.json on servers running Windows Server.

You can use the following configuration document as the agent configuration file. Furthermore, you can copy it to other servers where you want to run the agent. If you plan to use the SSM Agent to install and configure the CloudWatch agent on the other servers, then after you manually edit the CloudWatch agent configuration file, you can upload it to Systems Manager Parameter Store.

    {
    	"agent": {
    		"metrics_collection_interval": 60,
    		"run_as_user": "root"
    	},
    	"logs": {
    		"logs_collected": {
    			"files": {
    				"collect_list": [{
    						"file_path": "/var/log/audit/audit.log",
    						"log_group_name": "auditlogs",
    						"log_stream_name": "{instance_id}",
    						"filters": [{
    							"type": "include",
    							"expression": "USER_"
    						}]
    					},
    					{
    						"file_path": "/var/log/messages",
    						"log_group_name": "syslogs",
    						"log_stream_name": "{instance_id}",
    						"filters": [{
    								"type": "include",
    								"expression": "systemd:"
    							},
    							{
    								"type": "exclude",
    								"expression": "Message Of The Day"
    							}
    						]
    					},
    					{
    						"file_path": "/var/log/httpd/access_log",
    						"log_group_name": "ApplicationAccessLogs",
    						"log_stream_name": "ErrorAccessLogs.log",
    						"filters": [{
    							"type": "include",
    							"expression": "(\\\"\\s(4|5)\\d{2})"
    						}]
    					}
    				]
     
    			}
    		}
    	}
    }
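Because a typo in a filter only surfaces after the agent is restarted, it can help to sanity-check the file first. The following is a minimal sketch, assuming the configuration layout shown above. The helper name `check_filters` and the sample configuration are hypothetical and not part of the agent or its tooling. It also compiles each expression with Python's `re` module, which is only an approximation, because the agent's regular expression syntax may differ from Python's.

```python
import json
import re

def check_filters(config_text):
    """Return a list of problems found in the `filters` entries of a config."""
    problems = []
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    collect_list = (
        config.get("logs", {})
        .get("logs_collected", {})
        .get("files", {})
        .get("collect_list", [])
    )
    for entry in collect_list:
        path = entry.get("file_path", "<missing file_path>")
        for f in entry.get("filters", []):
            # The post only uses the "include" and "exclude" filter types.
            if f.get("type") not in ("include", "exclude"):
                problems.append(f"{path}: unknown filter type {f.get('type')!r}")
            try:
                re.compile(f.get("expression", ""))
            except re.error as exc:
                problems.append(f"{path}: bad expression: {exc}")
    return problems

# A trimmed-down, hypothetical configuration used only for this example.
SAMPLE_CONFIG = """
{
  "logs": {"logs_collected": {"files": {"collect_list": [
    {
      "file_path": "/var/log/messages",
      "log_group_name": "syslogs",
      "log_stream_name": "{instance_id}",
      "filters": [
        {"type": "include", "expression": "systemd:"},
        {"type": "exclude", "expression": "Message Of The Day"}
      ]
    }
  ]}}}
}
"""

for problem in check_filters(SAMPLE_CONFIG):
    print(problem)
```

An empty result means that every filter has a recognized type and an expression that Python could compile. It does not guarantee that the agent accepts the file.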

Any time that you change the agent configuration file, you must restart the agent to have the changes take effect. To restart the agent, follow the instructions in Start the CloudWatch agent.

On an Amazon EC2 instance running Linux, enter the following command:

$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -s -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json

On an Amazon EC2 instance running Windows Server, enter the following from the PowerShell console:

& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m ec2 -s -c file:$Env:ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.json

Understanding the agent configuration file

The following excerpt of the CloudWatch agent configuration file tells the CloudWatch agent to publish only the user activity log lines from the ‘audit.log’ file.

"collect_list": [ 
  {
    "file_path": "/var/log/audit/audit.log", 
    "log_group_name": "auditlogs", 
    "log_stream_name": "{instance_id}",
    "filters": [
      {
        "type": "include",
        "expression": "USER_"
      }
    ]
  },
  .....
]

The following are a few sample audit logs from an Amazon EC2 instance:

Screenshot of an Amazon EC2 instance terminal, displaying several examples of audit logs

Figure 1: Amazon EC2 instance Audit logs

The ‘auditlogs’ log group is created as a result of the agent configuration. The log group has all of the log lines that contain the string ‘USER_’.

Screenshot of the CloudWatch Console, displaying the ingested audit logs

Figure 2: auditlogs Log Group

The following excerpt of the CloudWatch agent configuration file will signal the CloudWatch agent to discard all of the log lines except the ones that contain the string ‘systemd:’. Moreover, it filters these log lines to exclude the ones that contain the string ‘Message Of The Day’ before publishing them to CloudWatch.

The order of the filters in the configuration file matters for performance. In the following example, the agent drops all of the logs that don’t contain ‘systemd:’ before it starts evaluating the second filter. To cause fewer log entries to be evaluated by more than one filter, put the filter that you expect to rule out more logs first in the configuration file.

"collect_list": [ 
  {
    "file_path": "/var/log/messages", 
    "log_group_name": "syslogs", 
    "log_stream_name": "{instance_id}",
    "filters": [
      {
        "type": "include",
        "expression": "systemd:"
      },
      {
        "type": "exclude",
        "expression": "Message Of The Day"
      }
    ]
  },
  .....
]
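To make the effect of filter order concrete, the following sketch counts how many regular expression evaluations happen for each ordering when a line is dropped at its first failing filter. It is a simplified model under that assumption, not a measurement of the agent. The function name `count_evaluations` and the four sample syslog lines are invented for this illustration.

```python
import re

def count_evaluations(lines, filters):
    """Count regex evaluations when each line stops at its first failing filter."""
    evaluations = 0
    for line in lines:
        for kind, pattern in filters:
            evaluations += 1
            matched = re.search(pattern, line) is not None
            # An include filter fails on no match; an exclude filter fails on a match.
            if matched != (kind == "include"):
                break  # the line is dropped, so later filters are skipped
    return evaluations

# Hypothetical syslog lines, for illustration only.
lines = [
    "Oct 10 12:00:01 host dhclient[2101]: DHCPREQUEST on eth0",
    "Oct 10 12:00:02 host ec2net: [get_meta] Getting token",
    "Oct 10 12:00:03 host systemd: Started Session 1 of user ec2-user.",
    "Oct 10 12:00:04 host systemd: Starting Message Of The Day...",
]

include_first = [("include", "systemd:"), ("exclude", "Message Of The Day")]
exclude_first = [("exclude", "Message Of The Day"), ("include", "systemd:")]

print(count_evaluations(lines, include_first))
print(count_evaluations(lines, exclude_first))
```

With these four lines, the include-first ordering needs 6 evaluations and the exclude-first ordering needs 7, because the include filter rules out two lines immediately while the exclude filter rules out only one. The gap grows with the share of lines that the first filter discards.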

The following are a few sample system logs from an Amazon EC2 instance. This log file contains a mix of log lines from dhclient, systemd, ec2net, and other sources.

Screenshot of an Amazon EC2 instance terminal, displaying several examples of system logs

Figure 3: Amazon EC2 instance System logs

The ‘syslogs’ log group is created as a result of the agent configuration. The log group contains only the log lines that contain the string ‘systemd:’. Of those, the log lines that also contain the string ‘Message Of The Day’ are excluded.

Screenshot of the CloudWatch Console, displaying the ingested system logs

Figure 4: syslogs Log Group

If you want to publish all of the log lines from the system log file (/var/log/messages) but filter out the noise caused by ‘ec2net:’ and ‘dhclient:’ log lines, add the following entry to the collect_list in your agent configuration file:

{
    "file_path": "/var/log/messages", 
    "log_group_name": "syslogs2", 
    "log_stream_name": "{instance_id}",
    "filters": [
       {
        "type": "exclude",
        "expression": "(ec2net:|dhclient:)"
      }
    ]
}
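The alternation in this expression can be tried out locally before you deploy it. The sketch below applies the same pattern with Python's `re` module, which is an approximation of the agent's matching, to a few invented log lines. The helper name `drop_noise` and the sample lines are hypothetical.

```python
import re

# Same expression as in the exclude filter above.
NOISE = re.compile(r"(ec2net:|dhclient:)")

def drop_noise(lines):
    """Keep only the lines that do not match the exclude expression."""
    return [line for line in lines if not NOISE.search(line)]

# Hypothetical syslog lines, for illustration only.
lines = [
    "Oct 10 12:00:01 host dhclient: DHCPACK from 10.0.0.1",
    "Oct 10 12:00:02 host ec2net: [get_meta] Getting token",
    "Oct 10 12:00:03 host systemd: Started Session 1 of user ec2-user.",
    "Oct 10 12:00:04 host dhclient[2101]: DHCPREQUEST on eth0",
]

for line in drop_noise(lines):
    print(line)
```

Note that the last sample line is kept: the expression matches the literal text ‘dhclient:’, so a line tagged ‘dhclient[2101]:’ does not match. If your syslog format adds a process ID in brackets, you might need to adjust the expression accordingly.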

Similarly, if you have HTTP access logs on the instance and want to send to CloudWatch only the log lines for requests that resulted in 4xx and 5xx response codes, use the following excerpt of the CloudWatch agent configuration.

"collect_list": [ 
  {
    "file_path": "/var/log/httpd/access_log", 
    "log_group_name": "ApplicationAccessLogs", 
    "log_stream_name": "ErrorAccessLogs.log",
    "filters": [
      {
        "type": "include",
        "expression": "(\\\"\\s(4|5)\\d{2})"
      }
    ]
  },
  .....
]
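The expression above is hard to read because it is escaped for JSON. The following sketch decodes it the way a JSON parser would and then tries the resulting pattern on a few invented access log lines. The sample lines and the helper name `is_error_request` are hypothetical, and Python's `re` module is used as a stand-in for the agent's regular expression engine.

```python
import json
import re

# The expression exactly as it is written inside the JSON configuration file,
# including the surrounding double quotes.
json_text = r'"(\\\"\\s(4|5)\\d{2})"'

# After JSON decoding, the pattern is: a double quote, one whitespace
# character, a 4 or a 5, and two more digits.
pattern = json.loads(json_text)
print(pattern)

error_status = re.compile(pattern)

def is_error_request(line):
    """Return True if the line has a 4xx or 5xx status right after the quoted request."""
    return error_status.search(line) is not None

# Hypothetical access log lines, for illustration only.
lines = [
    '192.0.2.10 - - [10/Oct/2022:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.11 - - [10/Oct/2022:13:55:40 +0000] "GET /missing HTTP/1.1" 404 196',
    '192.0.2.12 - - [10/Oct/2022:13:55:41 +0000] "POST /api HTTP/1.1" 503 0',
]

for line in lines:
    print(is_error_request(line), line)
```

Anchoring on the closing quote of the request keeps a path such as /404 or a response size such as 500 from being mistaken for a status code.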

These are sample HTTP access logs from an Amazon EC2 instance. The access logs contain requests that have various HTTP response codes.

Screenshot of an Amazon EC2 instance terminal, displaying several examples of HTTP access logs

Figure 5: Amazon EC2 instance Access logs

The CloudWatch log group will only receive log lines that have 4xx or 5xx response codes.

Screenshot of the CloudWatch Console, displaying the ingested HTTP access logs

Figure 6: ApplicationAccessLogs Log Group

Conclusion

This brief introduction to the CloudWatch agent’s configurable log filter expressions provides a starting point for more advanced and customized configurations. A complete description of the feature is available in the Amazon CloudWatch User Guide.

Utilizing this feature, you can streamline your system and application logs published to CloudWatch, thereby enhancing your ability to effectively monitor the health and performance of your servers and applications running on them.

Reference documentation

https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent-New-Instances-CloudFormation.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/troubleshooting-CloudWatch-Agent.html

About the authors:

Puneeth Ranjan Komaragiri

Puneeth Komaragiri is a Senior Technical Account Manager at AWS. He started his journey as a Cloud Support Engineer in the Networking team, where he worked on various AWS Networking and Monitoring services including VPC, ELB, CloudWatch, and Route53. Prior to joining AWS, he worked as a Network Engineer at a service provider. Puneeth holds a Master’s degree in Telecommunications and a Bachelor’s degree in Electrical Engineering, as well as several certifications including AWS Solutions Architect Professional, AWS Advanced Networking Specialty, AWS Database Specialty, AWS Solutions Architect Associate, AWS Developer Associate, and AWS SysOps Administrator Associate.

Andrew Huynh

Andrew Huynh is a Software Development Engineer at AWS who is focused on raising the bar for the CloudWatch agent experience, and is a maintainer of the CloudWatch agent GitHub repo. Andrew’s favorite part of his day-to-day as an engineer is performing code reviews for his peers. Prior to joining AWS, Andrew worked as a software engineer for different financial services. Outside of work, you’ll find Andrew cooking, playing/recording/writing music, or at the dog park with his corgi.