AWS Security Blog
Federated access to Amazon Athena using AWS IAM Identity Center
Managing Amazon Athena through identity federation allows you to manage authentication and authorization procedures centrally. Athena is a serverless, interactive analytics service that provides a simplified and flexible way to analyze petabytes of data.
In this blog post, we show you how you can use the Athena JDBC driver (which includes a browser Security Assertion Markup Language (SAML) plugin) to connect to Athena from third-party SQL client tools, which helps you quickly implement identity federation capabilities and multi-factor authentication (MFA). This enables automation and enforcement of data access policies across your organization.
You can use AWS IAM Identity Center to federate access to users to AWS accounts. IAM Identity Center integrates with AWS Organizations to manage access to the AWS accounts under your organization. In this post, you will learn how to configure the Athena driver to use the AWS configuration profile credentials. This will allow you to resolve credentials from IAM Identity Center and use the MFA capability of your federation identity provider (IdP).In this post, you will learn how you can integrate the Athena browser-based SAML plugin to add single sign-on (SSO) and MFA capability with your federation identity provider (IdP).
Prerequisites
To implement this solution, you must have the follow prerequisites:
- An AWS account.
- Install or update to the latest version of the AWS Command Line Interface (AWS CLI).
- Configure IAM Identity Center authentication using the AWS CLI.
- IAM Identity Center enabled. See Enabling AWS IAM Identity Center.
- Access to SQL client tools (such as SQL Workbench/J, Pycharm, and so on) that support JDBC connections.
- An Amazon Simple Storage Service (Amazon S3) bucket to store Athena query results.
- Knowledge of using AWS Lake Formation and enabling Lake Formation to manage permissions to a set of tables.
- A Lake Formation administrator role. See Lake Formation personas and IAM permissions reference for information on creating a data lake administrator.
- Tables and databases are populated in your AWS Glue Data Catalog.
- Create two Athena workgroups (for example, sensitive and non-sensitive).
Note: Lake Formation only supports a single role in the SAML assertion. Multiple roles cannot be used.
Solution overview
To implement the solution, complete the steps below as shown in Figure 1:
- An IAM Identity Center delegated administrator creates two custom permission sets within Identity Center.
- An IAM Identity Center delegated administrator assign permission sets to AWS accounts and users and groups. The user has permissions to single sign-on roles that are provisioned in the data lake account. The role created by Identity Center has a name that begins with
AWSReservedSSO
. - A Lake Formation administrator grants single sign-on roles permissions to the corresponding database and tables.
The solution workflow consists of the following high-level steps as shown in Figure 1:
- The user configures IAM Identity Center authentication using the AWS CLI.
- The AWS CLI redirects the user to the AWS access portal URL. The user enters workforce identity credentials (username and password). Then chooses Sign in.
- The AWS access portal verifies the user’s identity. IAM Identity Center redirects the request to the Identity Center authentication service to validate the user’s credentials.
- If MFA is enabled for the user, then they are prompted to authenticate their MFA device.
- The user enters or approves the MFA details. The user’s MFA is successfully completed.
- The user selects the AWS account to use from the displayed list. Then select the IAM single sign-on role to use from the displayed list.
- The user tests the SQL client connection and then uses the client to run a SQL query.
- The client makes a call to Athena to retrieve the table and associated metadata from the Data Catalog.
- Athena requests access to the data from Lake Formation. Lake Formation invokes the AWS Security Token Service (AWS STS).
- Lake Formation invokes AWS STS.
- Lake Formation obtains temporary AWS credentials with the permissions of the defined IAM role (sensitive or non-sensitive) associated with the data lake location.
- Lake Formation returns temporary credentials to Athena.
- Athena uses the temporary credentials to retrieve data objects from Amazon S3.
- The Athena engine successfully runs the query and returns the results to the client.
Solution walkthrough
The walkthrough includes five sections that will guide you through the process of creating permission sets, assigning permission sets to AWS Accounts, managing permission sets access using Lake Formation, and setting up third-party SQL clients such as SQL Workbench to connect to your data store and query your data through Athena.
Step 1: Federate onboarding
Federating onboarding is done within the IAM Identity Center account. As part of federated onboarding, you need to create IAM Identity Center users and groups. Groups are a collection of people who have the same security rights and permissions. You can create groups and add users to the groups. Create one IAM Identity Center group for sensitive data and another for non-sensitive data to provide distinct access to different classes of data sets. You can assign access to IAM Identity Center permission sets to a user or group.
To federate onboarding:
- Open the AWS Management Console using the IAM Identity Center account and go to IAM Identity Center.
- Choose Groups.
- Choose Create group.
- Enter a Group name and Description .
- Choose Create group.
To add a user as a member of a group:
- Open the IAM Identity Center console.
- Choose Groups.
- Select the group name that you want to update.
- On the group details page, under Users in this group, choose Add users to group.
- On the Add users to group page, under Other users, locate the users you want to add as members and select the check box next to each of them.
- Choose Add users to group.
Step 2: Create permission sets
For this step, create two permission sets (sensitive-iam-role
and non-sensitive-iam-role
). These permission sets can be assigned to users or groups in IAM Identity Center, granting them specific access to AWS account resources.
To create custom permission sets:
- In the IAM Identity Center administrator account, under Multi-Account permissions, choose Permission sets.
- Choose Create permission set.
- On the Select permission set type page, under Permission set type, choose Custom permission set.
- Choose Next.
- On the Specify policies and permission boundary page, expand Inline policy to add custom JSON-formatted policy text.
- Insert the following policy and update the S3 bucket name (
<s3-bucket-name>
), AWS Region (<region>
) account ID (<account-id>
), CloudWatch alarm name (<AlarmName>
), Athena workgroup name (sensitive or non-sensitive) (<WorkGroupName>
), KMS key alias name (<KMS-key-alias-name>
), and organization ID (<aws-PrincipalOrgID>
). - Update the custom policy to add the corresponding Athena workgroup ARN for the sensitive and non-sensitive IAM roles.
Note: See the documentation for information about AWS global condition context keys.
- Choose Next.
- On the Specify permission set details page, enter a name to identify this permission set in IAM Identity Center. The name that you specify for this permission set appears in the AWS access portal as an available role. Users sign in to the AWS access portal, choose an AWS account, and then choose the role.
- Choose Next.
- On the Review and create page, review the selections that you made, and then choose Create.
Step 3: Assign permission sets to AWS accounts
You can add and remove permissions sets for an IAM user or group by attaching and detaching permission sets. Permission sets define what actions an identity can perform on which AWS resources.
To assign permission sets to AWS accounts:
- In the IAM Identity Center administrator account, under Multi-account permissions, choose AWS accounts.
- On the AWS accounts page, select one or more AWS accounts that you want to assign single sign-on access to.
- Choose Assign users or groups.
- On the Assign users and groups to “<AWS account name>”, for Selected users and groups, choose the users that you want to create the permission set for. Choose Next.
- Select permission sets: On the Assign permission sets to “AWS-account-name” page, select one or more permission sets.
- On the Review and submit assignments to AWS-account-name page, for Review and submit, choose Submit.
Step 4. Grant permissions to IAM (single sign-on) roles
A data lake administrator has the broad ability to grant a principal (including themselves) permissions on Data Catalog resources. This includes the ability to manage access controls and permissions for the data lake. When you grant Lake Formation permissions on a specific Data Catalog table, you can also include data filtering specifications. This allows you to further restrict access to certain data within the table, limiting what users can see in their query results based on those filtering rules.
To grant permissions to IAM roles:
In the Lake Formation console, under Permissions in the navigation pane, select Data Lake permissions, and then choose Grant.
To grant Database permissions to IAM roles:
- Under Principals, select the IAM role name (for example, Sensitive-IAM-Role).
- Under Named Data Catalog resources, go to Databases and select a database (for example, demo).
- Under Database permissions, select Describe and then choose Grant.
To grant tables permissions to IAM roles:
- Repeat steps 1 and 2 of the preceding procedure.
- Under Tables – optional, select a table name (for example, demo2).
- Select the desired Table Permissions (for example, select and describe), and then choose Grant.
- Repeat steps 1 through 4 to grant access for the respective database and tables for the non-sensitive IAM role.
Step 5: Client-side setup using JDBC
You can use a JDBC connection to connect Athena and SQL client applications (for example, PyCharm or SQL Workbench) to enable analytics and reporting on the data that Athena returns from Amazon S3 databases. To use the Athena JDBC driver, you must specify the driver class from the JAR file. Additionally, you must pass in some parameters to change the authentication mechanism so the athena-sts-auth libraries are used:
- S3 output location – Where in S3 the Athena service can write its output. For example,
s3://path/to/query/bucket/
. - The IAM Identity Center administrator can configure the session duration for the AWS access portal. The session duration can be set from a minimum of 15 minutes to a maximum of 90 days.
To set up PyCharm
- Install Athena JDBC 3.x driver from Athena JDBC 3.x driver.
- In the left navigation pane, select JDBC 3.x and then Getting started. Select Uber jar to download a .jar file, which contains the driver and its dependencies.
- Open PyCharm and create a new project.
- Enter a Name for your project
- Select the desired project Location
- Choose Create
- Configure Data Source and drivers. Select Data Source, and then choose the plus sign or New to configure new data sources and drivers.
- Configure the Athena driver by selecting the Drivers tab, and then choose the plus sign to add a new driver.
- Under Driver Files, upload the custom JAR file that you downloaded in the Step 1. Select the Athena class dropdown. Enter the driver’s name (for example
Athena JDBC Driver
). Then choose Apply. - Configure a new data source. Choose the plus sign and select your driver’s name from the driver dropdown.
- Enter the data source name (for example,
Athena Demo
). For the authentication method, select User & Password. Then choose Apply. - Select the SSH/SSL tab and select Use SSL. Verify that the Use truststore options for IDE, JAVA, and system are all selected. Then choose Apply.
- Select the Options tab and then select Single Session Mode. Then choose Apply.
- Select the General tab and enter the JDBC and single sign-on URL. The following is a sample JDBC URL based on the SAML application:
- Choose Apply.
- Choose Test Connection. If the profile has expired, refresh the single sign-on session by running
aws sso login --profile <profile-name>
with the corresponding profile.
- After the connection is successful, select the Schemas tab and select All databases and All schemas.
- Run a sample test query:
SELECT <table-names> FROM <database-name> limit 10;
- Verify that the credentials and permissions are working as expected.
To set up SQL Workbench
- Open SQL Workbench.
- Configure an Athena driver by selecting File and then Manage Drivers.
- Enter the
Athena JDBC Driver
as the name and set the library to browse the path for the location where you downloaded the driver. Enteramazonaws.athena.jdbc.AthenaDriver
as the Classname. - Enter the following URL, replacing
<name-of-the-WorkGroup>
with your workgroup name. - Choose OK.
- Run a test query, replacing
<table-names>
and<database-name>
with your table and database names: - Verify that the credentials and permissions are working as expected.
Conclusion
In this post, we covered how to use JDBC drivers to connect to Athena from third-party SQL client tools. You were able to set this up without creating IAM users or any type of long-lived credentials that would need to be stored on your developers’ workstations. You learned how to configure IAM Identity Center users and groups, create permission sets, and assign permission sets to AWS Accounts. You also learned how to grant permissions to single sign-on roles using Lake Formation to create distinct access to different classes of data sets and connect to Athena through an SQL client tool (such as PyCharm). This setup can also work with other supported identity sources such as IAM Identity Center, self-managed or on-premises Active Directory, or an external IdP.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.