Networking & Content Delivery

Secure and accelerate Drupal CMS with Amazon CloudFront, AWS WAF, and Edge Functions

In this post, you’ll learn how to secure and accelerate the delivery of Drupal-based websites using Amazon CloudFront, AWS Web Application Firewall (AWS WAF), and Amazon CloudFront Functions.

CloudFront is a content delivery network (CDN) service that improves the security and speeds up the delivery of content served through it. This holds for both static cacheable content and dynamic content, because optimizations are applied at different stages of the request/response cycle. AWS WAF is a managed web application firewall service that integrates with CloudFront, Application Load Balancer (ALB), Amazon API Gateway, and AWS AppSync. It provides mechanisms to protect your application from Layer 7 attacks (the application layer of the OSI model).

A content management system (CMS) helps content owners create and maintain articles and sections on a website without having specialized knowledge of how to present it in a browser. As part of the CMS, there’s an admin section where content updates are made, and a publicly available website for viewers to consume the content.

When using Drupal, a popular CMS, you can set up separate domain names to access the section where you make content updates (we’ll call it the “admin domain name”) and for the public website. Or, you can use the same domain name for both. In this post, you’ll learn some strategies to improve security while delivering content using CloudFront and AWS WAF in these two scenarios.

Scenario 1: Separate domain names for admin users and viewers of content

In this setup, the admin and public website have different domain names. Here, you want to restrict anonymous viewers from accessing URL patterns related to content update operations, for example: /node, /admin, /core, /batch, etc.

To do this, create two independent CloudFront and WAF configurations with respective firewall rules to secure the Drupal backend.

Figure 1: Separate domain names for admin users and viewers of content

For viewers, you’ll define custom rules in AWS WAF to block all of the path patterns pertaining to the admin sections of the CMS. CloudFront will be configured to cache and serve content securely over HTTPS using the closest edge location, leading to faster page downloads. As traffic scales up, CloudFront serves more requests from cached content at the edge locations. This lowers origin requests and optimizes your backend infrastructure.

For admin users, you’ll define custom rules in a second AWS WAF configuration that only allow requests carrying a special session cookie set by Drupal for authorized users. In the second CloudFront distribution used for admin access, caching is disabled and CloudFront works to accelerate the dynamic content update API calls using techniques like reuse of persistent connections across requests, optimized TCP congestion windows, and TLS session resumption.

In both cases, CloudFront applies last mile optimizations, including content compression and TCP Bottleneck Bandwidth and Round-trip propagation time (BBR) to further lower the page load times for viewers.

If your origin is deployed within AWS, then traffic from the edge location to your AWS origin stays within the AWS network, leading to a more consistent content delivery experience for your end users.

This is the recommended setup, as it allows for maintaining fine-grained access and cache controls over the different sets of users.

Scenario 2: Same domain name for admin users and viewers of content

In this approach, you use the same domain name for managing content updates and for serving your viewers. To secure and accelerate your Drupal backend, you’ll set up a single CloudFront and AWS WAF configuration.

Figure 2: Same domain name for admin users and viewers of content

For admin users to be able to manage content updates, you’ll define custom AWS WAF rules to check whether authorized users (indicated by a special session cookie set by Drupal) are connecting to restricted URL path patterns like /node, /admin, etc. Therefore, anonymous viewers trying to access those links are blocked. The AWS WAF rule must be applied to all admin URL patterns except for /user/login, which is used for login purposes and is allowed for all users.
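The decision this rule makes can be modeled in a few lines of JavaScript. This is only a sketch of the logic, not AWS WAF configuration syntax: the names `ADMIN_PATTERNS` and `shouldBlock` and the sample cookie name are assumptions made for illustration, and the path patterns mirror the regex set listed later in this post.

```javascript
// Illustrative model of the scenario 2 rule; this is not AWS WAF syntax.
// ADMIN_PATTERNS, shouldBlock, and the sample cookie names are hypothetical.
var ADMIN_PATTERNS = [/^\/user\//, /^\/admin\//, /^\/node\//, /^\/batch/, /^\/core\//];
var LOGIN_PATH = "/user/login";
var COOKIE_PREFIX = "SESS"; // default session cookie prefix used in this post

// True when any cookie name starts with the session cookie prefix.
function hasSessionCookie(cookieNames) {
  return cookieNames.some(function (name) {
    return name.indexOf(COOKIE_PREFIX) === 0;
  });
}

// True when the request should be blocked.
function shouldBlock(path, cookieNames) {
  if (path === LOGIN_PATH) {
    return false; // the login page stays open to all users
  }
  var isAdminPath = ADMIN_PATTERNS.some(function (re) {
    return re.test(path);
  });
  // Only anonymous requests to admin paths are blocked.
  return isAdminPath && !hasSessionCookie(cookieNames);
}

console.log(shouldBlock("/admin/content", []));             // true
console.log(shouldBlock("/admin/content", ["SESSabc123"])); // false
console.log(shouldBlock("/user/login", []));                // false
console.log(shouldBlock("/articles/hello", []));            // false
```

Checking the login path first keeps the exception explicit, so adding more admin patterns later cannot accidentally lock users out of the login page.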

In the CloudFront distribution, create separate cache behaviors for all admin URL path patterns, configured to proxy content update API requests to the origin and to allow the HTTP POST method. The default cache behavior is set to cache and serve website content pages (article and section pages) using the HTTP GET method.
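As a rough illustration of how a request is routed to one of these cache behaviors, the sketch below picks the first behavior whose path pattern matches and falls back to the default. The behavior list, the property names, and the helpers `patternToRegExp` and `selectBehavior` are hypothetical; the actual path patterns are whatever the CloudFormation template defines.

```javascript
// Hypothetical behavior list for illustration only.
var ALL_METHODS = ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"];
var behaviors = [
  { pattern: "/admin/*", cache: false, methods: ALL_METHODS },
  { pattern: "/node/*", cache: false, methods: ALL_METHODS },
  { pattern: "*", cache: true, methods: ["GET", "HEAD"] } // default behavior
];

// Convert a CloudFront-style path pattern into a RegExp:
// "*" matches any run of characters, "?" matches exactly one character.
function patternToRegExp(pattern) {
  var escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// Behaviors are evaluated in order; the first match wins.
function selectBehavior(path) {
  for (var i = 0; i < behaviors.length; i++) {
    if (patternToRegExp(behaviors[i].pattern).test(path)) {
      return behaviors[i];
    }
  }
  return null;
}

console.log(selectBehavior("/admin/content").pattern);  // "/admin/*"
console.log(selectBehavior("/articles/hello").pattern); // "*"
```

In this sketch a request for `/admin` (no trailing slash) falls through to the default behavior, so a real configuration would need a pattern that covers it as well.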

While admin users edit content, make sure that the most recent updates are immediately visible without having to invalidate caches or wait for the page time-to-live (TTL) to expire. To do this, we use a simple CloudFront Function to bypass the cache for logged-in admin users.

Using a CloudFront function to bypass the cache

It’s relatively easy to identify a pre-defined set of admin URL path patterns and bypass caches for them. However, content URLs (article and section pages) are dynamically generated, and admin users must review updates to them immediately.

To do this, a CloudFront Function checks for the existence of the special session cookie set by Drupal for logged-in users. If it’s present, the function assigns a random string value to a whitelisted query parameter. This query parameter is unique, generated during deployment, and included in the CloudFront function and cache policy definitions. This helps us bypass the cache for all of the pages for authenticated users. Simultaneously, it lets us serve content from cache for anonymous viewers, for whom this query parameter isn’t set. The function is triggered for all ‘Viewer Request’ events.

CloudFront Function code

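// Prefix of the Drupal session cookie name (passed in during deployment).
var cookieName = "SESS";
// Name of the whitelisted query parameter (generated during deployment).
var randomString = "SESS-randomstr";

function handler(event) {
  var request = event.request;
  if (loggedInUser(request)) {
    // Logged-in user: a random value makes the cache key unique,
    // so the request is always forwarded to the origin.
    request.querystring[randomString] = {value: "ran-" + Math.random()};
  } else {
    console.log("not logged in user");
  }
  return request;
}

// True when any request cookie name starts with the session cookie prefix.
function loggedInUser(request) {
  return Object.keys(request.cookies).some(key => key.startsWith(cookieName));
}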

This function defines two variables that store the Drupal session cookie name (passed during deployment) and the generated query parameter (this is based on the cookie name and AWS Stack ID). This query parameter is whitelisted in the cache policy as well, so it’s part of the cache key definition in CloudFront.

The function checks if the request is coming in from logged-in users by looking for the cookie name. If present, it sets a random value to the whitelisted query parameter. This makes sure that all requests are proxied back to the origin for admin users, who can review content updates instantaneously. For anonymous viewers, a cached version is served.
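To see the effect on the cache key, the function can be exercised locally with hand-built events. The sketch below repeats the function logic so that it runs on its own under Node.js; `makeEvent`, `cacheKey`, the sample URI, and the sample cookie name are hypothetical helpers and values, and the toy cache key (URI plus the whitelisted query parameter) is a simplification of what CloudFront actually uses.

```javascript
var cookieName = "SESS";
var randomString = "SESS-randomstr";

function loggedInUser(request) {
  return Object.keys(request.cookies).some(key => key.startsWith(cookieName));
}

function handler(event) {
  var request = event.request;
  if (loggedInUser(request)) {
    request.querystring[randomString] = {value: "ran-" + Math.random()};
  }
  return request;
}

// Hypothetical helper: build a minimal viewer-request event.
function makeEvent(cookies) {
  return {
    request: {
      method: "GET",
      uri: "/articles/hello",
      querystring: {},
      headers: {},
      cookies: cookies
    }
  };
}

// Toy cache key: the URI plus the whitelisted query parameter only.
function cacheKey(request) {
  var param = request.querystring[randomString];
  return request.uri + "?" + randomString + "=" + (param ? param.value : "");
}

var anonA = cacheKey(handler(makeEvent({})));
var anonB = cacheKey(handler(makeEvent({})));
var adminA = cacheKey(handler(makeEvent({SESSabc123: {value: "token"}})));
var adminB = cacheKey(handler(makeEvent({SESSabc123: {value: "token"}})));

console.log("anonymous keys match:", anonA === anonB);
console.log("admin keys match:", adminA === adminB);
```

Anonymous requests for the same page map to the same key and can share a cached copy, while each admin request gets a fresh random value and therefore its own key.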

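This scenario adds complexity and cost compared with scenario 1: the distribution needs separate cache behaviors for the admin URL patterns, and the CloudFront Function runs on every viewer request handled by the default behavior.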

Now that we’ve covered two possible configurations, in the remainder of this post we’ll walk through the setup of a reference solution to try it.

Deployment

Before getting started, you’ll need a Drupal-powered backend listening on a publicly available endpoint. If you don’t have a setup, then you can use the following getting started guide to set it up.

Once you have the Drupal backend available, clone the GitHub repo here and deploy the AWS CloudFormation template available under ‘templates/drupal-cf.yml’. Note that the deployment needs to be in the ‘us-east-1’ AWS Region.

CloudFormation template input parameters

The CloudFormation template defines the AWS resources and provides parameters to configure your setup matching either of the scenarios discussed earlier. When you start the deployment, enter a ‘Stack Name’, and then you’ll be presented with a set of configuration parameters. Let’s understand what they imply.

Figure 3: CloudFormation template input parameters

Drupal Backend Configuration

  • Drupal backend Endpoint – this is the publicly available domain name pointing to your current Drupal installation. Note that this domain name must be different from the one serving your viewers.
  • Backend listening on https or http? – select ‘https’ if your origin is listening for TLS connections.
  • Drupal Cookie Name – leave this at the default value of ‘SESS’ and update it post deployment. Refer to the section ‘Note on Drupal Session Cookie’ for more details.

Drupal Frontend Configuration

  • DomainName (optional) – specify the domain name used by viewers to connect to your website. This domain name is mapped to the CloudFront distribution to serve the viewer traffic. If this is left blank, you can still serve the traffic with the default CloudFront domain name created, and the format will be ‘dxxxxx.cloudfront.net’.
  • AdminConfig – select ‘yes’ if you want separate configurations (scenario 1) created to cater to viewers and admin users. If you select ‘no’, then a single CloudFront distribution and AWS WAF configuration (scenario 2) is created.
  • AdminDomainName (optional) – specify the second domain name used by admin users to manage content updates to the Drupal CMS. This domain name will be mapped to a second CloudFront distribution if created (depends on value of AdminConfig parameter). If this is left blank, then you can still serve with default CloudFront domain name.

Note that when one or both of ‘DomainName’ and ‘AdminDomainName’ are specified as the alternate domain name to your CloudFront distribution, you must attach a trusted TLS certificate that validates your authorization to use the domain name. This deployment creates the necessary certificates using AWS Certificate Manager for these domains and uses DNS-based validation to authorize ownership of the domain.

Amazon Route 53 DNS Configuration

  • HostedZoneId – used to create the ‘Domain Name’ and ‘Admin Domain Name’ records in your Amazon Route 53 hosted zone. This field is required if either or both domain names are specified.

Once you specify these parameters, deploy the CloudFormation template. The AWS resources that are created depend on the input parameters you specified.

If ‘Enable Admin configuration’ is set to ‘yes’ (caters to scenario 1), you’ll have two CloudFront distributions, and each of these distributions will be associated with its own AWS WAF configuration. The corresponding distribution’s ‘description’ field specifies if it’s configured for viewers or for admin users.

The CloudFront distribution with the description ‘StackName-viewer distribution for Drupal content delivery’ is used to serve traffic to your end viewers. This distribution has a single default behavior pointing to your ‘Drupal backend Endpoint’ as shown in the following.

Figure 4: CloudFront Cache Behavior

This cache behavior is designed to optimize for caching content (using the ‘Managed-CachingOptimized’ cache policy). Depending on the ‘Cache-Control’ headers sent from the origin, the content is cached with a minimum time to live (TTL) of 1 sec, a maximum TTL of 31536000 sec (365 days), and a default TTL of 86400 sec (1 day) when no cache-control header is specified. To learn more about how CloudFront honors cache-control headers, refer to this documentation.
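As a rough sketch of how these three TTL values interact with the origin’s ‘Cache-Control’ header, the function below clamps `max-age` between the minimum and maximum TTL and falls back to the default TTL when no header is present. The name `effectiveTtl` is hypothetical, and the model is deliberately simplified: it ignores `s-maxage` and `Expires`, and it assumes that `no-cache`, `no-store`, and `private` result in the minimum TTL because the minimum is greater than zero.

```javascript
// TTL settings quoted above for the Managed-CachingOptimized policy, in seconds.
var MIN_TTL = 1;
var DEFAULT_TTL = 86400;
var MAX_TTL = 31536000;

// Simplified model: how long (in seconds) an object would stay cached
// for a given Cache-Control header value from the origin.
function effectiveTtl(cacheControl) {
  if (!cacheControl) {
    return DEFAULT_TTL; // no header: the default TTL applies
  }
  // Assumption of this sketch: with a minimum TTL above zero,
  // these directives result in caching for the minimum TTL.
  if (/\b(no-cache|no-store|private)\b/i.test(cacheControl)) {
    return MIN_TTL;
  }
  var match = /\bmax-age=(\d+)/i.exec(cacheControl);
  if (!match) {
    return DEFAULT_TTL;
  }
  var maxAge = parseInt(match[1], 10);
  // Clamp max-age into the [MIN_TTL, MAX_TTL] range.
  return Math.min(Math.max(maxAge, MIN_TTL), MAX_TTL);
}

console.log(effectiveTtl(undefined));                // 86400
console.log(effectiveTtl("max-age=600"));            // 600
console.log(effectiveTtl("max-age=0"));              // 1
console.log(effectiveTtl("public, max-age=99999999")); // 31536000
```

The practical takeaway is that the ‘Cache-Control’ headers Drupal sends largely decide how long pages stay at the edge, within the bounds set by the policy.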

The associated AWS WAF configuration with the name ‘StackName-ViewerWAFWebACL’ blocks all ‘admin’ access URLs using a regex match rule. A regular expression rule set with the name ‘StackName-DrupalAdminRegex’ is created and defines the path patterns for ‘admin’ URLs as shown. Update this regex rule group to include additional URL patterns that must be restricted.

^(\/user\/|\/admin\/)
^(\/node\/|\/batch|\/core\/)
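Before adding patterns to the regex set, it can help to check locally which paths the existing expressions match. The snippet below is a minimal sketch that uses the two expressions above as JavaScript regular expressions; `isAdminPath` and the sample paths are hypothetical, and AWS WAF’s regex engine may differ from JavaScript’s for more complex patterns.

```javascript
// The two expressions from the regex pattern set shown above.
var adminPatterns = [
  /^(\/user\/|\/admin\/)/,
  /^(\/node\/|\/batch|\/core\/)/
];

// True when the URI path matches any of the admin patterns.
function isAdminPath(path) {
  return adminPatterns.some(function (re) {
    return re.test(path);
  });
}

var samples = [
  "/admin/content",
  "/user/login",
  "/node/1/edit",
  "/batch",
  "/core/install.php",
  "/admin",
  "/articles/hello",
  "/"
];

samples.forEach(function (path) {
  console.log(path + " -> " + (isAdminPath(path) ? "blocked" : "allowed"));
});
```

Note that `/admin` without a trailing slash is not matched by these expressions, which is the kind of gap worth testing for when you extend the set.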

A second CloudFront distribution with the description ‘StackName-Admin distribution for Drupal content updates’ is used by admin users to manage content updates. This distribution has a single default behavior with an origin that maps to the ‘Drupal backend Endpoint’ and is set to not cache content (using the ‘Managed-CacheDisabled’ cache policy).

Figure 5: CloudFront Cache Behavior

This configuration passes all viewer headers, query strings, and cookies (using the ‘Managed-AllViewer’ origin request policy) so that content updates are reflected immediately to admin users.

The associated AWS WAF configuration with the name ‘StackName-AdminWAFWebACL’ blocks all anonymous access (when the Drupal session cookie isn’t set) to admin URLs except the login page.

Additionally, if you’ve specified domain names using the ‘Domain Name’ and ‘Admin Domain Name’ fields, then they’re associated with the corresponding CloudFront distributions along with the ACM certificates. Route 53 record sets are also created, allowing you to serve traffic over HTTPS using the alternate domain names.

If ‘Enable Admin configuration’ is set to ‘no’ (caters to scenario 2), then the CloudFormation template deploys a single CloudFront distribution and AWS WAF configuration. In this configuration, the CloudFront distribution has explicit cache behaviors for the admin URL patterns, with caching disabled, and the default cache behavior caters to all article pages. The default behavior is also associated with a ‘Viewer-Request’ CloudFront function, which randomizes a query parameter for logged-in users to make sure that updates to content are immediately visible, as explained earlier in the section ‘Using a CloudFront function to bypass the cache’.

Note on Drupal Session Cookie

In the initial deployment, we left the ‘Drupal Cookie Name’ input parameter at its default value of ‘SESS’, which is the default prefix Drupal uses for session cookies. We left it unchanged during the deployment because Drupal sets a unique cookie name for each host domain that it serves, and the actual name is only available once we start serving through CloudFront using a custom domain name.

Therefore, once the deployment is complete, we access the admin sections of the CMS and use the developer tools of the browser to identify the actual cookie name that starts with ‘SESS’. We’ll use this value to redeploy and update the same CloudFormation stack, this time just updating the ‘Drupal Cookie Name’ field with the new value. This will update the AWS WAF configurations, cache policy and CloudFront function code to use the new values.

Conclusion

To summarize, you learned two configurations to secure and accelerate Drupal-based CMS platforms using CloudFront, AWS WAF, and CloudFront Functions: separate domain names for admin users and viewers, and a single shared domain name. We hope this post helps you choose the one that fits your deployment.

Jaiganesh Girinathan

Jaiganesh Girinathan is a Senior Edge Specialist Solutions Architect focused on content delivery networks and edge computing capabilities with AWS. He has worked with several media customers globally over the last two decades, helping organizations modernize & scale their platforms. He is passionate about building solutions to address key customer needs. Outside of work, you can usually find Jaiganesh star gazing!