AWS Big Data Blog

Setting up trust between ADFS and AWS and using Active Directory credentials to connect to Amazon Athena with ODBC driver

Amazon Athena is a serverless, interactive query service that lets you analyze your raw and processed datasets in Amazon Simple Storage Service (Amazon S3) using standard SQL. The JDBC and ODBC drivers that Athena provides let you integrate your data analytics tools (such as Microsoft Power BI, Tableau, or SQL Workbench) with Athena and gain insights about your data in minutes.

Before November 2018, you had to connect to Athena with ODBC or JDBC drivers using your IAM user or role credentials. However, with the November 20, 2018 release of support for Microsoft Active Directory Federation Services (ADFS 3.0) and Security Assertion Markup Language (SAML 2.0) in the Athena ODBC/JDBC driver, you can now connect to Athena directly using your Microsoft Active Directory (AD) credentials.

Microsoft ADFS 3.0, a component of Windows Server, supports SAML 2.0 and is integrated with AWS Identity and Access Management (IAM). This integration allows Active Directory (AD) users to federate to AWS using corporate directory credentials, such as a user name and password from Microsoft Active Directory.

This post walks you through configuring ADFS 3.0 on a Windows Server 2012 R2 Amazon Elastic Compute Cloud (Amazon EC2) instance and setting up trust between the ADFS 3.0 IdP and AWS through SAML 2.0. The post then demonstrates how to install the Athena ODBC driver on a Linux (RHEL) EC2 instance and configure it to use ADFS for authentication.

Solution overview

The following architecture diagram shows how an AD user in your organization is authenticated in the Athena ODBC/JDBC driver:

The process includes the following steps:

  1. A user in your organization uses a client application with the JDBC or ODBC driver to request authentication from your organization’s IdP. The IdP is ADFS 3.0.
  2. The IdP authenticates the user against AD, which is your organization’s Identity Store.
  3. The IdP constructs a SAML assertion with information about the user and sends the assertion to the client application via the JDBC or ODBC driver.
  4. The JDBC or ODBC driver calls the AWS Security Token Service AssumeRoleWithSAML API operation, passing it the following parameters:
    • The ARN of the SAML provider
    • The ARN of the role to assume
    • The SAML assertion from the IdP
  5. The API response to the client application via the JDBC or ODBC driver includes temporary security credentials.
  6. The client application uses the temporary security credentials to call Athena API operations on behalf of the user.
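
Steps 4 and 5 can be sketched in Python as follows. This is only an illustration of the API call, not the driver's actual implementation: the helper names are hypothetical, and the second function assumes the third-party boto3 package and network access to AWS STS.

```python
from typing import Dict


def build_assume_role_with_saml_params(
    provider_arn: str, role_arn: str, saml_assertion_b64: str
) -> Dict[str, str]:
    """Collect the three inputs that step 4 passes to AssumeRoleWithSAML."""
    return {
        "PrincipalArn": provider_arn,         # ARN of the SAML provider in IAM
        "RoleArn": role_arn,                  # ARN of the role to assume
        "SAMLAssertion": saml_assertion_b64,  # base64-encoded assertion from the IdP
    }


def fetch_temporary_credentials(params: Dict[str, str]) -> dict:
    """Call AWS STS and return the temporary security credentials (step 5).

    Assumption: boto3 is installed; it is imported lazily so that the
    helper above can be used without it.
    """
    import boto3

    sts = boto3.client("sts")
    response = sts.assume_role_with_saml(**params)
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

The returned credentials are what the client application uses in step 6 to call Athena.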

This post walks you through configuring ADFS-AWS trust through SAML and using that trust to federate AD users in the Athena JDBC/ODBC driver.

To implement this solution, you complete the following steps:

  • Configure ADFS (3.0) on a Windows Server 2012 R2 Amazon EC2 instance
  • Set up trust between AWS and ADFS (3.0) through SAML 2.0 rules
  • Install the Athena ODBC driver 1.0.5 on an RHEL EC2 instance and configure it to use ADFS

Prerequisites

For this walkthrough, you need to have the following prerequisites:

  • An understanding of the concepts of Active Directory. The steps for configuring Active Directory on a Windows instance are outside the scope of this post.
  • An understanding of IAM roles and concepts.
  • DNS and networking set up between your Active Directory server and the instance on which the Athena ODBC or JDBC driver is installed. This post sets up Active Directory (which runs ADFS) on a Windows Amazon EC2 instance and the ODBC driver on another EC2 instance in the same VPC and subnet. For your use case, you need to provide equivalent connectivity between the ADFS server and the ODBC/JDBC instance.

Configuring ADFS (3.0) on a Windows Server 2012 R2 Amazon EC2 instance

In the following steps, you install ADFS 3.0 on a Windows Server 2012 R2 Amazon EC2 instance. As per the prerequisites, you already installed Active Directory on a Windows Server 2012 R2 EC2 instance. For this post, the domain name is arunad.local. For instructions on setting up an Active Directory domain controller on an EC2 instance, see Building Your First Domain Controller on 2012 R2 on the Microsoft TechNet website.

Installing prerequisites for ADFS 3.0

To configure ADFS 3.0 on a Windows domain controller, you must have the following:

  • An SSL certificate – For this post, you can create a self-signed certificate by installing Internet Information Services (IIS)
  • A service account – Create an Active Directory user that belongs to the Domain Admins group

To create the self-signed SSL certificate, complete the following steps:

  1. Install IIS on your Windows Server 2012 R2 Amazon EC2 instance. For instructions, see How to install and configure IIS on Windows Server 2012 R2 on The Solving website. For this post, you can skip Step 2 in the preceding instructions.
  2. After you install IIS, create a self-signed certificate. For instructions, see How to Create a Self Signed SSL Certificate with Windows Server on the Sophos Community website. For this post, you can skip the step about binding the self-signed certificate in the preceding instructions.

To configure a service account in your domain controller, you create a user in your Active Directory with the name ADFSSVC and add the user to the Domain Admins group.

  1. Open Server Manager.
  2. Choose Tools.
  3. Choose Active Directory Users and Computers.
  4. Expand your domain (arunad.local).
  5. Choose Users (right-click).
  6. Choose New.
  7. Choose User.
  8. Create a user with the name ADFSSVC.
  9. Set the password to never expire.

You can now add the user ADFSSVC to the Domain Admins group.

  1. Choose Users.
  2. Choose ADFSSVC (right-click) and choose Add to group.
  3. In the search bar, enter domain.
  4. Choose Check Names.
  5. Choose Domain Admins.
  6. Choose OK.

You receive a message that the user is added to the group, but you should still verify it.

  1. Choose ADFSSVC (right-click) and choose Properties.
  2. On the Member Of tab, check that Domain Admins is listed.

Installing and configuring ADFS 3.0

Now that you have installed the prerequisites for ADFS 3.0, you can install and configure ADFS 3.0 on the Windows Server 2012 R2 EC2 instance.

  1. Open Server Manager.
  2. Choose Roles and Features.
  3. Select Role-based or feature-based installation.
  4. Choose Next until you reach the Select server roles page.
  5. For Roles, select Active Directory Federation Services.
  6. Choose Next until you reach the Confirm installation selections page.
  7. Choose Install.
  8. Choose Configure the Federation Service for this server.
  9. Select Create the first federation server in a federation server farm.
  10. Choose Next.
  11. Choose Next until you reach the Specify Service Properties page.
  12. For SSL Certificate, choose the self-signed certificate you created earlier.
  13. For Federation Service Display Name, enter ArunADFS.
  14. Choose Next.
  15. On the Specify Service Account page, select Use an existing domain user account or group Managed Service Account.
  16. Choose Select.
  17. In the text box, enter ADFSSVC.
  18. Choose Check names.
  19. When the name is populated, choose OK.
  20. Enter your password and choose Next.
  21. Select Create a database on this server using Windows Internal Database.
  22. Choose Next.
  23. Choose Next until you reach the Pre-requisite Checks page.
  24. Choose Configure.

When the server is successfully configured, you may see the following warning message: 

An error occurred during an attempt to set the SPN for the specified service account. Set the SPN for the service account manually. For more information about setting the SPN of the service account manually, see the AD FS Deployment Guide. Error message: The SPN required for this Federation Service is already set on another Active Directory account. Choose a different Federation Service name and try again.

To fix the problem, run the following command by opening PowerShell as an administrator:

setspn -a host/localhost adfssvc

The following code shows the output.

The ADFS 3.0 configuration is now complete.

  1. To download your ADFS server’s federation XML file, open a browser on your Windows Server and enter the following address: https://<yourservername>/FederationMetadata/2007-06/FederationMetadata.xml.

This file is required to set up trust between ADFS and AWS.

Alternatively, you can download the ADFS server’s federation XML file by running the following command as administrator in PowerShell 3.0+:

wget https://<your-server-name>/FederationMetadata/2007-06/FederationMetadata.xml -OutFile FederationMetadata.xml

To find your server name, choose Start (the Windows icon), right-click My Computer, choose Properties, and check the Full computer name field.

Copy the downloaded XML file to your local machine so you can use it when creating the SAML provider in IAM in the next step.
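
Before uploading the file to IAM, you may want to confirm that it really is SAML 2.0 metadata. The following minimal Python sketch does that with the standard library; the function name read_entity_id is hypothetical and not part of any AWS or ADFS tooling.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the SAML 2.0 metadata specification.
SAML_MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def read_entity_id(metadata_xml: bytes) -> str:
    """Return the entityID of a SAML 2.0 metadata document.

    Raises ValueError if the root element is not an EntityDescriptor
    or if the entityID attribute is missing.
    """
    root = ET.fromstring(metadata_xml)
    if root.tag != f"{{{SAML_MD_NS}}}EntityDescriptor":
        raise ValueError(f"unexpected root element: {root.tag}")
    entity_id = root.get("entityID")
    if not entity_id:
        raise ValueError("entityID attribute is missing")
    return entity_id
```

For example, you could call read_entity_id(open("FederationMetadata.xml", "rb").read()) on the file you downloaded and check that the result identifies your ADFS server.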

Establishing trust between Windows AD (using ADFS IdP) and AWS via SAML 2.0

Now that you have configured ADFS 3.0 on a Windows Server, you can establish the trust between AWS and the IdP (ADFS) via SAML assertion claim rules. By establishing this trust, users in your AD can federate into AWS using an IAM role and access AWS resources such as Athena or the AWS Glue Data Catalog.

Setting up this trust requires configuration in both AWS and Active Directory. For AWS, you set up IAM roles and establish a SAML provider. For Active Directory, you write the SAML assertion and claim rules.

Setting up your SAML provider in IAM

To set up your SAML provider, complete the following steps:

  1. On the IAM console, choose Identity providers.
  2. Choose Create provider.
  3. For Provider Type, choose SAML.
  4. For Provider Name, enter MytestADFS.
  5. For Metadata Document, choose the XML file you downloaded earlier.

  1. Create a new role in IAM and choose the trusted entity as SAML 2.0 federation.
  2. For SAML provider, choose the provider you created earlier (MytestADFS).
  3. For Attribute, select SAML:aud.

  1. Add the necessary IAM permissions to this role and create the IAM role.

For this post, attach the AmazonAthenaFullAccess managed policy and name the role ArunADFSTest.

The role name you give in this step is crucial because any users and groups you create in Active Directory as part of establishing trust in the following steps are based on this name.

Configuring the SAML assertion rules in ADFS 3.0 IdP

In this step, you configure the SAML assertion rules in your ADFS so that users can federate into AWS using the IAM role you created.

Determining how you create and delineate your AD groups and IAM roles in AWS is crucial in how you secure access to your account and manage resources. One approach for creating the AD groups that uniquely identify the IAM role mapping is by selecting a common group naming convention.

For this post, create a group in Active Directory with the naming convention AWS-<AccountID>-<IAMRolename>; for example, AWS-123456789012-ArunADFSTest.

This naming convention is crucial in the next steps because you write SAML assertion claim rules that fetch all the AD groups of your AD user that contain the string AWS-<AccountID>- and extract the last portion of the group name to map it to an IAM role in AWS. For example, if the AD user that you’re authenticating to AWS is part of the AD group AWS-123456789012-ArunADFSTest, then the claim rules, which you write later, find all groups of the AD user that match the string AWS-123456789012-, extract the last portion of the group name (ArunADFSTest), and send it as the role name to the AWS SAML endpoint in the format arn:aws:iam::<AccountID>:role/ArunADFSTest.
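
The mapping from group name to role ARN can be illustrated with a short Python sketch. The real mapping is done by the ADFS claim rules you write later, not by Python; the function names here are hypothetical and only mirror the naming convention described above, assuming a 12-digit account ID.

```python
import re
from typing import List, Optional

# AWS-<AccountID>-<IAMRolename>, where the account ID is 12 digits.
GROUP_PATTERN = re.compile(r"^AWS-(\d{12})-(.+)$")


def role_arn_from_group(group_name: str) -> Optional[str]:
    """Map one AD group name to an IAM role ARN, or None if it does not match."""
    match = GROUP_PATTERN.match(group_name)
    if match is None:
        return None
    account_id, role_name = match.groups()
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def role_arns_for_user(group_names: List[str]) -> List[str]:
    """Keep only the groups that follow the convention and map them to ARNs."""
    arns = (role_arn_from_group(name) for name in group_names)
    return [arn for arn in arns if arn is not None]
```

Groups that do not follow the convention, such as Domain Users, are ignored, which is why the exact group name matters.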

  1. In Server Manager, under Tools, choose Active Directory Users and Computers.
  2. Choose your domain (right-click) and choose New.
  3. Choose Group.

The following screenshot shows creating an AD group with name AWS-123456789012-ArunADFSTest:

  1. After you create the group in AD with name AWS-123456789012-ArunADFSTest, create a new user in that group. For this post, name the user myldapuser1.

  1. Make sure the E-mail field of the user contains a syntactically valid email address, because this field is passed to AWS as RoleSessionName when the SAML token is constructed.

  1. After you create the user, add the user to the AD group AWS-123456789012-ArunADFSTest.

Now that you’ve created the AD groups, AD users, and IAM roles, you create the relying party trust in ADFS and write the claim rules. The ADFS IdP needs to construct the following values in the SAML assertion and send the values to AWS for authentication:

    • NameID
    • RoleSessionName
    • Roles (which contains your SAML IdP in AWS and role name)

For instructions on setting up the relying party trust and claim rules in ADFS, see AWS Federated Authentication with Active Directory Federation Services (AD FS). For this walkthrough, you can start at the Active Directory Federation Services Configuration section.

  1. For Display name, enter My Amazon Portal.

After the configuration, your claim rules look similar to the following screenshots.

The following screenshot shows the rules for NameID.

The following screenshot shows the rules for RoleSessionName.

The following screenshot shows the rules for Get AD Groups.

The following screenshot shows the rules for Roles. The SAML provider in IAM was created with the name MytestADFS (ARN arn:aws:iam::123456789012:saml-provider/MytestADFS), so you need to use that same value here (MytestADFS) in the claim script. If you used a different name, replace it with the name of your SAML provider.

After you create these four rules, your ADFS relying party trust setup is complete.

Verifying your IdP

To verify that you set up your IdP successfully, complete the following steps:

  1. Navigate to the following URL in the browser on your ADFS server Windows instance (use your Windows Server hostname): https://<windows-hostname>/adfs/ls/IdpInitiatedSignOn.aspx.

  1. Select your ADFS display name (My Amazon Portal).

  1. Enter the AD credentials for the user myldapuser1 you created earlier.

Because the domain name for this demo is arunad.local, specify the user name as ARUNAD\myldapuser1 or as myldapuser1@arunad.local.

If everything is successful, you should be able to sign in to the AWS Management Console.

If you encounter any errors, review the preceding steps. For more information about common errors with SAML, see Troubleshooting SAML 2.0 Federation with AWS. For additional information about troubleshooting, see How to View a SAML Response in Your Browser for Troubleshooting.

When the Athena ODBC/JDBC driver initiates a connection to your ADFS server, it uses the same federation URL (https://<windows-hostname>/adfs/ls/IdpInitiatedSignOn.aspx) to retrieve the SAML assertion.
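
As a rough sketch, the sign-on URL can be assembled from the ADFS host name and HTTPS port like this. The function name is hypothetical, and the loginToRp=urn:amazon:webservices query parameter is taken from the driver's connection log shown later in this post, not from driver documentation.

```python
from urllib.parse import urlencode


def idp_signon_url(
    host: str, port: int = 443, relying_party: str = "urn:amazon:webservices"
) -> str:
    """Build the ADFS IdP-initiated sign-on URL for the AWS relying party."""
    # Keep ':' unescaped so the URN reads the same as in the driver log.
    query = urlencode({"loginToRp": relying_party}, safe=":")
    return f"https://{host}:{port}/adfs/ls/IdpInitiatedSignOn.aspx?{query}"
```

Opening the resulting URL in a browser on a machine that can reach the ADFS server is a quick way to check that the host name and port you plan to give the driver are correct.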

Installing the Athena ODBC driver 1.0.5 on an Amazon EC2 Linux instance and configuring it to use ADFS

Now that you have configured ADFS 3.0 and set up trust with AWS, the last step is to configure the Athena ODBC driver to use ADFS as its authentication mechanism.

As per the prerequisites, you launched the Active Directory Windows EC2 instance and the Athena ODBC driver Linux (RHEL) EC2 instance in the same VPC and subnet and allowed all traffic between both instances. In your own environment, make sure your ADFS server can communicate with the machine that has the Athena JDBC/ODBC driver installed.

For this post, the domain name servers in the DHCP option set of VPC (in which you launch the Linux EC2) are modified as follows: 172.31.X.X, AmazonProvidedDNS, where 172.31.X.X is the IP address of the EC2 Windows instance on which ADFS is configured with the domain name arunad.local.

Setting up the environment on your EC2 instance

To set up your environment, complete the following steps:

  1. Launch a Linux EC2 instance from a Red Hat Enterprise Linux (RHEL) AMI (for example, RHEL-7.6_HVM_GA-20181017-x86_64-0-Hourly2-GP2).
  2. SSH into the instance and enter the following commands:
sudo yum install telnet
sudo yum install nc
sudo yum install gcc
sudo yum install wget
sudo yum install vim
  1. Verify connectivity between your ADFS server and RHEL EC2 instance with a networking tool of your choice.

This post uses the ping utility. The following code shows the output:

  1. Install the OpenLDAP client and test whether you can search for a user in AD from this Linux instance (replace the values with your user and domain name):
sudo yum install openldap-clients
ldapsearch -h arunad.local -p 389 -D "CN=myldapuser1,CN=Users,DC=arunad,DC=local" -x -W -b "DC=arunad,DC=local"

If these commands are successful, the RHEL EC2 instance can communicate with the AD server and retrieve the credentials.
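
If you prefer a scripted check over ping or telnet, a minimal Python sketch such as the following tests whether a TCP port is reachable. The function name is hypothetical; ports 389 (LDAP) and 443 (HTTPS for ADFS) are the ones used elsewhere in this post.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_connect("arunad.local", 389) and can_connect("win-qikm653mpj9.arunad.local", 443) should both return True from the RHEL instance in this setup. A TCP check only shows that the port is open; it does not validate credentials or certificates.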

Installing and configuring unixODBC Driver Manager 2.3.4

The Athena ODBC driver on Linux requires you to have one of the following ODBC Driver Managers installed to set up the connection:

  • iODBC 3.52.9, 3.52.10, 3.52.11, or 3.52.12
  • unixODBC 2.3.2, 2.3.3, or 2.3.4

  1. Install the unixODBC driver manager 2.3.4 on your RHEL EC2 instance. For instructions, see unixODBC-2.3.4 on the Beyond Linux From Scratch website.

The commands needed for installation on your EC2 instance should look similar to the following code:

wget ftp://ftp.unixodbc.org/pub/unixODBC/unixODBC-2.3.4.tar.gz
tar -zxvf unixODBC-2.3.4.tar.gz
cd unixODBC-2.3.4
./configure --prefix=/usr --sysconfdir=/etc/unixODBC && make
  1. Install the ODBC Driver Manager by switching to the root user. See the following code:
sudo -su root

make install &&

find doc -name "Makefile*" -delete                &&
chmod 644 doc/{lst,ProgrammerManual/Tutorial}/*   &&

install -v -m755 -d /usr/share/doc/unixODBC-2.3.4 &&
cp      -v -R doc/* /usr/share/doc/unixODBC-2.3.4

  1. After the ODBC Driver Manager is installed, make sure the following files are present, which indicate a successful installation of the ODBC driver manager on an RHEL EC2 instance:
    • /etc/unixODBC/odbcinst.ini
    • /etc/unixODBC/odbc.ini
  1. The ODBC Driver Manager library files created as part of the installation are present in the path /usr/lib. Set the shared library path to point to your ODBC Driver Manager libraries by entering the following command as a non-root user:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib

  1. Verify the ODBC Driver Manager environment configuration is loaded properly by entering the code odbcinst -j.

Installing and configuring the Simba Athena ODBC driver (64-bit)

Now that the ODBC driver manager is configured, the final step is to install the Athena ODBC driver 1.0.5 on this Linux instance and configure it to use ADFS as the authentication mechanism.

  1. To install the Athena ODBC driver on this EC2 instance, enter the following code:
wget https://s3.amazonaws.com/athena-downloads/drivers/ODBC/SimbaAthenaODBC_1.0.5/Linux/simbaathena-1.0.5.1006-1.x86_64.rpm

sudo yum --nogpgcheck localinstall simbaathena-1.0.5.1006-1.x86_64.rpm

The example ODBC configuration file of the downloaded driver is included in the path /opt/simba/athenaodbc/Setup/odbc.ini.

  1. To configure the ODBC driver to use ADFS as an authentication mechanism, log in as ec2-user and enter the following code:

vim .odbc.ini

  1. Insert the following configuration directives:
[ODBC]
Trace=no

[ODBC Data Sources]
Simba Athena 64-bit=Simba Athena ODBC Driver 64-bit


[Simba Athena 64-bit]
Description=Simba Athena ODBC Driver (64-bit) DSN
Driver=/opt/simba/athenaodbc/lib/64/libathenaodbc_sb64.so


# Connection configurations should be set here.
AwsRegion=us-west-2
Schema=default
S3OutputLocation=s3://aws-athena-query-results-123456780912-us-west-2
AuthenticationType=ADFS
UID=ARUNAD\myldapuser1
PWD=XXXXXXXX
IdP_Host=win-qikm653mpj9.arunad.local
IdP_Port=443
SSL_Insecure=true

Replace the values for IdP_Host and IdP_Port to point to your ADFS server. For this post, these values are win-qikm653mpj9.arunad.local and 443. Similarly, replace UID and PWD with the AD user name and password you created earlier. This post uses ARUNAD\myldapuser1. Also, replace the AwsRegion and S3OutputLocation values according to your environment. For production workloads, make sure that you set SSL_Insecure to false so the driver verifies the server certificate.

You can retrieve the HTTPS port number of your ADFS server by entering the following command in PowerShell on the Windows AD server instance:

Get-AdfsProperties

  1. To enable DEBUG level logging on your Athena ODBC driver, edit the file /opt/simba/athenaodbc/lib/64/simba.athenaodbc.ini and set the following values:
[Driver]
## - Note that this default DriverManagerEncoding of UTF-32 is for iODBC.
ErrorMessagesPath=/opt/simba/athenaodbc/ErrorMessages
LogLevel=5
LogPath=/home/ec2-user/odbclogs/
SwapFilePath=/tmp
  1. Now that the Athena ODBC driver is configured, you can test it by entering the following code:

isql -v "Simba Athena 64-bit"

In the preceding code, Simba Athena 64-bit is the name of the DSN that you defined in /home/ec2-user/.odbc.ini.

If you’re connected, it means the Athena ODBC driver successfully authenticated your user against ADFS and connected to Athena.
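
The same DSN can also be used from application code. The following Python sketch assumes the third-party pyodbc package, which is not covered in this post, is installed and built against the same unixODBC driver manager. The function names are hypothetical.

```python
from typing import Any, List


def dsn_connection_string(dsn: str) -> str:
    """Build an ODBC connection string that refers to a DSN from .odbc.ini."""
    return f"DSN={dsn}"


def run_test_query(dsn: str, sql: str = "SELECT 1") -> List[Any]:
    """Connect through the DSN, run one query, and return all rows.

    Assumption: pyodbc is installed; it is imported lazily so the
    helper above works without it.
    """
    import pyodbc

    # autocommit=True because Athena queries are not run inside transactions.
    connection = pyodbc.connect(dsn_connection_string(dsn), autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        connection.close()
```

Calling run_test_query("Simba Athena 64-bit") exercises the same ADFS authentication path as the isql command, because the credentials and IdP settings come from the DSN.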

You can also check the connection log to verify the connection URI used by your driver and values returned by ADFS to the Athena ODBC driver.

Entries from the connection log look similar to the following:

May 09 01:40:58.761 DEBUG 50743104 IAMAdfsCredentialsProvider::FormBasedAuthentication: verifySSL: false
May 09 01:40:58.761 DEBUG 50743104 IAMAdfsCredentialsProvider::FormBasedAuthentication: Using URI: https://win-qikm653mpj9.arunad.local:443/adfs/ls/IdpInitiatedSignOn.aspx?loginToRp=urn:amazon:webservices
May 09 01:40:58.821 DEBUG 50743104 IAMSamlPluginCredentialsProvider::GetAWSCredentialsWithSaml: Using RoleArn: arn:aws:iam::143280751103:role/ArunADFSTest, PrincipalArn: arn:aws:iam::143280751103:saml-provider/MytestADFS
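
To confirm which role and SAML provider the driver resolved, you can pull the ARNs out of the log. The following Python sketch assumes the log line format shown above, which may differ between driver versions; the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the "Using RoleArn: ..., PrincipalArn: ..." entry in the driver log.
ARN_LINE = re.compile(
    r"Using RoleArn: (?P<role>arn:aws:iam::\d{12}:role/\S+?), "
    r"PrincipalArn: (?P<principal>arn:aws:iam::\d{12}:saml-provider/\S+)"
)


def extract_arns(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (role ARN, SAML provider ARN) from the log, or None if absent."""
    match = ARN_LINE.search(log_text)
    if match is None:
        return None
    return match.group("role"), match.group("principal")
```

If the function returns None, or the role ARN is not the one you expect, revisit the AD group name and the Roles claim rule.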

Conclusion

This post demonstrated how to configure ADFS 3.0 on your Active Directory and use it as an IdP to federate into AWS using SAML. This post also showed how you can integrate your Athena ODBC driver to ADFS and use your Active Directory credentials directly to connect to Athena. Integrating your Active Directory with the Athena ODBC driver gives you the flexibility to access Athena from BI tools you’re already familiar with and analyze the data in Amazon S3 using SQL, without needing to create separate IAM users.

If your organization has single sign-on (SSO) into AWS enabled with the Okta service provider, you can use the latest version of the Athena JDBC driver, version 2.0.9, to use Okta as the authentication mechanism. For more information, see using OKTA in Athena JDBC.

If you have any questions or feedback, please leave a comment.

 


About the Author

Alapati Arun is a Cloud Support Engineer with AWS based out of Dallas. He focuses on supporting customers in using big data technologies. He enjoys travel and watching movies.