Networking & Content Delivery
Using AWS WAF intelligent threat mitigations with cross-origin API access
AWS WAF offers advanced features for filtering undesired web application traffic, such as Bot Control and Fraud Control. These intelligent threat mitigations include techniques such as client-side interrogations using JavaScript challenges or CAPTCHA, as well as client-side behavioral analysis.
Implementing these techniques on a web page with same-origin access is straightforward. When cross-origin access is needed, for example when APIs are exposed on a different domain than the web page, configuring AWS WAF’s intelligent threat mitigations requires additional steps. In this post, you learn about these additional steps, using a Single Page Application (SPA) example.
First, you deploy an SPA with same-origin access that displays the country and IP information of the viewer using an API. The SPA is built using AWS serverless components (such as Amazon CloudFront, Amazon Simple Storage Service (Amazon S3), Amazon API Gateway, and AWS Lambda), and the API is protected by AWS WAF Bot Control. When Bot Control detects a suspicious signal on the client side, it forces the viewer to solve a CAPTCHA before forwarding the API call to API Gateway.
Next, you change the SPA to host the API on a different domain. You learn about the changes required to make Bot Control protect the API with cross-origin access.
A primer on the web concepts used
A Single Page Application (SPA) is a web application that operates within a single web page. In an SPA, the HTML, CSS, and JavaScript files are loaded once during the initial page load. Subsequent interactions and data retrieval are handled through asynchronous requests to the server, commonly using techniques such as AJAX or the Fetch API. The retrieved data is then dynamically rendered and inserted into the existing page structure without navigating to a new URL.
A same-origin request is an HTTP request made from the web page to an API hosted on the same domain as the web page itself. A cross-origin request is made from a web page on one domain to an API hosted on a different domain. The same-origin policy in browsers restricts JavaScript requests to the same domain, protocol, and port. Browsers require CORS techniques, such as the API adding the Access-Control-Allow-Origin header to specify allowed domains for simple requests, to enable cross-origin communication. If the request is not considered simple according to CORS, such as when it carries a non-common header or posts JSON data, then a preflight request using the HTTP OPTIONS verb must be sent in advance to the API to check that it permits the actual request.
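To make the simple-versus-preflight distinction concrete, here is a minimal sketch of the check a browser applies. It is a simplification under stated assumptions: it looks only at the method, the header names, and the Content-Type value, and the function name needsPreflight is hypothetical, not part of any SDK.

```javascript
// Simplified CORS classification: would this request trigger a preflight?
// Assumption: only method, header names, and Content-Type value are checked;
// the full CORS rules include further restrictions not modeled here.
const SAFE_METHODS = new Set(["GET", "HEAD", "POST"]);
const SAFE_HEADERS = new Set(["accept", "accept-language", "content-language", "content-type"]);
const SAFE_CONTENT_TYPES = new Set([
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
]);

function needsPreflight(method, headers = {}) {
  if (!SAFE_METHODS.has(method.toUpperCase())) return true;
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    // Any header outside the safelist (for example X-Aws-Waf-Token) forces a preflight.
    if (!SAFE_HEADERS.has(lower)) return true;
    if (lower === "content-type") {
      const mime = String(value).split(";")[0].trim().toLowerCase();
      if (!SAFE_CONTENT_TYPES.has(mime)) return true;
    }
  }
  return false;
}
```

Under this sketch, a plain GET needs no preflight, while a GET carrying a custom header or a POST with a JSON body does.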
Scenario 1: SPA example with same-origin access
AWS Cloud Development Kit (AWS CDK) is an open-source software development framework used to define cloud infrastructure in code and provision it through AWS CloudFormation. Follow these steps in your command line to deploy the SPA with AWS CDK, using the account information configured in your AWS CLI. Set the configured AWS Region in the CLI to us-east-1.
git clone https://github.com/aws-samples/aws-waf-bot-control-api-protection-with-captcha.git
cd aws-waf-bot-control-api-protection-with-captcha
npm install
npm run build
cdk bootstrap
cdk deploy
The AWS CDK deployment completes within minutes and creates the following resources:
The entry point of the SPA is a CloudFront distribution, with a domain name (YOURDISTRIBUTION.cloudfront.net) printed on the AWS CDK output for testing the SPA. The distribution has two cache behaviors pointing to two origins:
- The first cache behavior has caching disabled. With it, CloudFront proxies all requests matching the /api/* path pattern to the regional endpoint of an API Gateway. This triggers a Lambda function for every API call. The Lambda function reads the viewer location and IP address sent by CloudFront in special headers, and sends them back as a JSON response.
- The second behavior, with caching enabled, is the default. It routes the rest of the requests to an S3 bucket where the index.html file is stored. The file is uploaded with a 5-second cache TTL to force CloudFront to regularly check for the latest version of the HTML file.
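The Lambda function described in the first behavior could look roughly like the following sketch. It is not the sample's actual code. It assumes an API Gateway proxy integration, and it assumes the special headers are CloudFront-Viewer-Country and CloudFront-Viewer-Address (the latter formatted as ip:port); the helper names are hypothetical.

```javascript
// Case-insensitive header lookup, since header casing can vary by integration.
function getHeader(headers, name) {
  const source = headers || {};
  const key = Object.keys(source).find(k => k.toLowerCase() === name.toLowerCase());
  return key ? source[key] : undefined;
}

// Builds the JSON response from the viewer headers added by CloudFront.
function buildResponse(headers) {
  const country = getHeader(headers, "CloudFront-Viewer-Country") || "unknown";
  const address = getHeader(headers, "CloudFront-Viewer-Address") || "";
  // Drop the trailing ":port"; using lastIndexOf keeps IPv6 colons intact.
  const portSep = address.lastIndexOf(":");
  const ip = portSep > 0 ? address.slice(0, portSep) : (address || "unknown");
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ country, ip }),
  };
}

// Hypothetical Lambda entry point for the proxy integration.
async function handler(event) {
  return buildResponse(event.headers);
}
```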
An AWS WAF WebACL is associated with the CloudFront distribution. The WebACL includes the Bot Control managed rule group, configured with the Targeted inspection level to benefit from the intelligent threat mitigation features, with the Bot Control SDK included in the SPA HTML. To optimize the cost of this rule group, it’s scoped down to /api/* API requests. Note that in this sample WebACL, only rules related to intelligent threat mitigations are configured. For a production scenario, we recommend adding other rules, such as rate limiting and IP reputation, to protect your API from different threats.
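As an illustration of the scoped-down rule, the following object mirrors the shape of an AWS WAF rule in JSON form. It is a sketch, not the rule deployed by the sample: the rule name, priority, and metric name are made up, and the scope-down is assumed to be a prefix match on the URI path.

```javascript
// Hypothetical Bot Control rule with Targeted inspection, scoped to /api/ requests.
const botControlRule = {
  Name: "BotControlForApi",          // illustrative name
  Priority: 1,                       // illustrative priority
  Statement: {
    ManagedRuleGroupStatement: {
      VendorName: "AWS",
      Name: "AWSManagedRulesBotControlRuleSet",
      ManagedRuleGroupConfigs: [
        { AWSManagedRulesBotControlRuleSet: { InspectionLevel: "TARGETED" } },
      ],
      // Only requests whose path starts with /api/ are evaluated (and billed).
      ScopeDownStatement: {
        ByteMatchStatement: {
          SearchString: "/api/",
          FieldToMatch: { UriPath: {} },
          TextTransformations: [{ Priority: 0, Type: "NONE" }],
          PositionalConstraint: "STARTS_WITH",
        },
      },
    },
  },
  OverrideAction: { None: {} },
  VisibilityConfig: {
    SampledRequestsEnabled: true,
    CloudWatchMetricsEnabled: true,
    MetricName: "BotControlForApi",  // illustrative metric name
  },
};
```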
With legitimate viewers, the request flow to the SPA is the following:
- First, you navigate to the page, which downloads the index.html file from the S3 bucket.
- When the page loads, the Bot Control SDK downloads a JavaScript challenge (i.e., proof of work) to verify the browser and collect suspicious attributes (e.g., use of a browser automation framework). It then acquires a session token and places it as a cookie on the SPA domain to track the session behavior.
- Then you click on the button “Lookup IP”, which sends an HTTP request to the API using the AwsWafIntegration fetch wrapper of the Bot Control SDK. For additional information, please refer to “How to use integration fetch wrapper”. The HTTP request is sent with the cookie containing the session token. The library makes sure that the token is acquired before sending the HTTP request. Otherwise, the request can be blocked by the Block-Requests-With-Missing-Or-Rejected-Token-Label rule configured in the WebACL.
- When the API request is received by AWS WAF, it triggers the Bot Control rule, which either passes the request to the API Gateway, or responds with a 405 HTTP error code to trigger client-side CAPTCHA verification if a suspicious signal is detected.
- If a CAPTCHA verification is needed, then the HTML code detects the 405 response code and renders a CAPTCHA on the page using the Bot Control SDK. For additional information, please refer to “How to render CAPTCHA puzzle”. This is done using the CAPTCHA API, which requires an API Key allowing the SPA domain name. The API Key is automatically created and inserted in the SPA HTML during the AWS CDK deployment.
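// Call the API through the Bot Control SDK fetch wrapper.
AwsWafIntegration.fetch(apiURL).then(response => {
  if (response.status === 405) {
    // AWS WAF asks for a CAPTCHA: render the puzzle on the page.
    showMyCaptcha();
  } else {
    // The request reached the API: render the JSON payload.
    response.json().then(myJson => {
      renderResponse(myJson);
    });
  }
});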
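The CAPTCHA step called from the snippet above could be sketched as follows. The post does not show the body of showMyCaptcha, so this is an assumption about its shape: it relies on the renderCaptcha function of the AwsWafCaptcha object from the CAPTCHA JavaScript API, and the function name renderCaptchaAndRetry, its parameters, and the container handling are hypothetical. The SDK object is passed in as an argument so the function can be exercised outside a browser.

```javascript
// Renders a CAPTCHA puzzle in `container` and retries the API call once solved.
// captchaSdk is expected to be the AwsWafCaptcha global in the browser.
function renderCaptchaAndRetry(captchaSdk, container, apiKey, retryRequest) {
  captchaSdk.renderCaptcha(container, {
    apiKey,
    // Invoked by the SDK after the viewer solves the puzzle.
    onSuccess: () => {
      container.innerHTML = "";   // remove the solved puzzle from the page
      retryRequest();             // replay the API call that received the 405
    },
    // Invoked by the SDK when the puzzle cannot be completed.
    onError: (error) => {
      const reason = error && error.message ? error.message : String(error);
      container.textContent = `CAPTCHA failed: ${reason}`;
    },
  });
}
```

In the page, showMyCaptcha could then be a thin wrapper that passes AwsWafCaptcha, a container element, the API key inserted during deployment, and the function that sends the lookup request.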
Scenario 2: SPA example with cross-origin access
Update the architecture to a cross-origin API access scenario by running the following command:
cdk deploy -c CROSS_DOMAIN_ENABLED=true
The architecture is updated within minutes as described in the following diagram:
In this architecture, the API is served on a separate CloudFront distribution, with a different domain name that you can find in the AWS CDK output. If you access this API directly in your browser, then you are blocked by AWS WAF, thanks to the Block-Requests-With-Missing-Or-Rejected-Token-Label configured rule.
You can run the same tests as in the previous section to verify that Bot Control protects the API in a cross-origin API access scenario. To make this work, AWS CDK modified the configurations of the AWS WAF WebACL, CloudFront, and API Gateway.
First, the configured Token Domain List of the WebACL is modified to include the domain name of the SPA. By default, AWS WAF only accepts tokens with a domain setting that exactly matches the host domain of the resource that’s associated with the web ACL. This was the case in the same-origin API access scenario. In the new architecture, the WebACL is associated with the API CloudFront distribution. Therefore, you must explicitly allow the domain name of the SPA CloudFront distribution in the Token Domain List of the WebACL. For additional information, please refer to “Token domains and domain lists”. Note that the CAPTCHA API key is not changed because it was generated with the initial SPA domain name.
Second, the AwsWafIntegration library detects the cross-origin fetch and automatically sends the token in the special header X-Aws-Waf-Token. This is because cookies cannot be sent across domains. Because the token is now carried in a non-common HTTP header, the browser sends a preflight request, so the API is configured with the appropriate CORS settings to answer preflight requests using HTTP OPTIONS. The changes are made in multiple places of the API:
- The API CloudFront distribution is updated to allow the HTTP OPTIONS method
- The API CloudFront distribution is configured with a response headers policy to add the following CORS headers:
  - Access-Control-Allow-Origin: <the domain name of the SPA>
  - Access-Control-Allow-Methods: GET, OPTIONS
  - Access-Control-Allow-Headers: X-Aws-Waf-Token
  - Access-Control-Max-Age: 600
  - Access-Control-Allow-Credentials: false
- The API Gateway is configured to respond to the CORS preflight OPTIONS
- The AWS WAF WebACL is configured to exclude the CORS preflight OPTIONS from being inspected by the Bot Control rule group and the Block-Requests-With-Missing-Or-Rejected-Token-Label rule.
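The CORS behavior listed above can be sketched as plain functions. This is an illustration only: in the sample, the headers are added by CloudFront and the preflight is answered by API Gateway, not by custom code. The function names and the request shape are hypothetical, and the origin is assumed to be written with its scheme, as browsers send it in the Origin header.

```javascript
// CORS response headers matching the values listed above.
// Access-Control-Allow-Credentials is left out, which browsers treat as not allowed.
function buildCorsHeaders(spaOrigin) {
  return {
    "Access-Control-Allow-Origin": spaOrigin,
    "Access-Control-Allow-Methods": "GET, OPTIONS",
    "Access-Control-Allow-Headers": "X-Aws-Waf-Token",
    "Access-Control-Max-Age": "600", // browsers may cache the preflight result for 10 minutes
  };
}

// Answers a preflight only when it is an OPTIONS request from the SPA origin.
function respondToPreflight(request, spaOrigin) {
  const origin = request.headers["origin"];
  if (request.method !== "OPTIONS" || origin !== spaOrigin) {
    return { statusCode: 403, headers: {} };
  }
  return { statusCode: 204, headers: buildCorsHeaders(spaOrigin) };
}
```

Because the preflight carries no token, it has to bypass the token-checking rules, which is why the WebACL excludes OPTIONS requests from inspection.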
To test the cross-origin access scenario, open the “SPAURL” value from the AWS CDK output that you got after updating the architecture. Similar to scenario 1, to test the CAPTCHA workflow, change the user agent of your browser to a random value, which Bot Control treats as a suspicious signal. To understand the request flow, open the developer tools in your browser, go to the Network tab, and access the application.
Cleaning up resources
To remove the AWS CDK created resources, run the following command:
cdk destroy
Conclusion
In this post, you learned how to implement the AWS WAF Bot Control rule group together with CAPTCHA API in a single-page application (SPA) to protect an API in both same-origin and cross-origin access scenarios.
To learn more about intelligent threat mitigations using AWS WAF, visit this documentation page.