AWS Big Data Blog
Federate access to your Amazon Redshift cluster with Active Directory Federation Services (AD FS): Part 2
In the first post of this series, Federating access to your Amazon Redshift cluster with Active Directory: Part 1, you set up Microsoft Active Directory Federation Services (AD FS) and Security Assertion Markup Language (SAML) based authentication and tested the SAML federation using a web browser.
In Part 2, you learn to set up an Amazon Redshift cluster and use federated authentication with AD FS to connect from a JDBC SQL client tool. You follow a detailed step-by-step process of using Amazon Redshift database authentication to simplify the credential management of database users and reuse what you may already have on premises. You can also integrate Amazon Redshift group-level privileges with federation.
You can use your corporate Active Directory (AD) and the SAML 2.0 AD FS identity provider to enable your users to easily access Amazon Redshift clusters using corporate user names without managing database users and passwords.
Overview of solution
This post walks you through setting up an Amazon Redshift cluster and federating database user authentication with AWS Identity and Access Management (IAM) and Amazon Redshift. You use the user you set up in your AD in Part 1 (Bob
) to authenticate using AD FS and control access to database objects based on the group the user is assigned to.
You also learn how to set up SQL Client (SQL Workbench/J) and an Amazon Redshift JDBC driver to connect to a database cluster using the database user authentication method and the AD FS federated authentication method.
The following diagram illustrates the architecture of this solution.
The workflow contains the following steps:
- The client (SQL Workbench/J) makes a request to the Identity Provider (IdP).
- The IdP authenticates the user and sends back a SAML assertion.
- The client calls an AWS SAML endpoint to
AssumeRoleWithSAML
. - AWS authenticates and returns temporary security credentials.
- The client uses temporary credentials to connect to the Amazon Redshift cluster endpoint.
Walkthrough
The walkthrough includes the following steps:
- Set up your Amazon Redshift cluster.
- Download, set up, configure, and test different connection methods to Amazon Redshift using SQL Workbench/J and an Amazon Redshift JDBC driver.
- Configure your Amazon Redshift connection profile with SQL Workbench/J.
Prerequisites
For this walkthrough, you should meet the following prerequisites:
- Complete all the steps from Federating access to your Amazon Redshift cluster with Active Directory: Part 1:
- Set up domain controller and AD FS
- Configure federation on AWS with the AD FS
- Configure AWS as the relying party with AD FS
- Test AWS SAML federation using your web browser
- Copy the username and passwords for the AD users to use in this post
- Copy the ARN for the two SAML roles you created to use in this post
- Copy the AD domain name to use in this post
- Have a basic understanding of SQL with any database and experience using a SQL client, such as SQL Workbench/J
- Check with your AWS administrator to ensure you have access to the AWS Management Console with permissions to use Amazon Redshift and IAM
Setting up your Amazon Redshift cluster
To create your Amazon Redshift cluster, complete the following steps:
- On the console, open Amazon Redshift.
- Choose Create cluster.
- For Node type¸ choose dc2.large.
- For Nodes, enter
2
.
- For Cluster identifier, enter
adfs-redshift-cluster
. - For Database port, enter
5439
. - Enter a master user name and password of your choice.
- Keep everything else at their defaults and choose Create cluster.
Your cluster status initially shows the various Modifying
stages.
When the cluster creation is complete, its status shows as Available
.
- Choose the cluster.
- On the Properties tab, review the endpoint information and database configuration.
- Make sure that for this walkthrough, Publicly accessible is set to Yes.
For more information about VPC accessibility, see Managing clusters in a VPC.
For production setup of the Amazon Redshift cluster, you shouldn’t set the cluster to be publicly accessible; instead you make it private. For more information, see How can I access a private Amazon Redshift cluster from my local machine?
Configuring your cluster security group
You can create a new security group for your cluster or use the default security group (located in the Network and security section on the Properties tab).
- For the security group, choose Edit inbound settings.
- Add a rule for Amazon Redshift for the IP range on your computer and the public IP for the Windows 2016 domain controller, which you set up in Part 1.
- Choose Save.
Configuring and testing connection methods using SQL Workbench/J and Amazon Redshift JDBC driver
Before you get started, download the latest Amazon Redshift JDBC driver with AWS SDK. You use this for JDBC driver-based authentication in SQL Workbench/J.
- Download and install SQL Workbench/J. For instructions, see Connect to your cluster by using SQL Workbench/J.
- Open SQL Workbench/J and choose Manage Drivers.
- Choose Amazon Redshift.
- For Library, browse to the JDBC driver you downloaded.
- Press Ok.
Configuring your Amazon Redshift connection profile with SQL Workbench/J
To set up federated access, you take a two-step approach: connecting with the superuser and connecting using federated authentication.
Connecting using the superuser
For this walkthrough, you create database objects, groups, and users and assign proper privileges to the groups on the database objects they are allowed to access.
- Choose New Connection.
- Construct a JDBC connection URL for the database standard user:
jdbc:redshift://<<redshiftclusterendpoint>>.redshift.amazonaws.com:5439/dev?ssl=true&UID=username&PWD=password
You can find the value for redshiftclusterendpoint on the Amazon Redshift console, under Properties, Endpoint. The username and password are the values you provided when you created the cluster.
- Choose Test.
You should receive a successful connection message, as in the following screenshot.
- Connect as
awsuser
(a superuser). - From SQL Workbench/J, enter the following commands to set up the following environment:
- Create two database groups:
- Create two schemas:
- Create two tables in each schema:
- Insert sample data into the two tables:
- Validate data is available in the tables:
You can now set up appropriate privileges for the sales
and marketing
groups. Groups are collections of users who are all granted privileges associated with the group. You can use groups to assign privileges by job function. For example, you can create different groups for sales, administration, and support and give the users in each group the appropriate access to the data they require for their work. You can grant or revoke privileges at the group level, and those changes apply to all members of the group, except for superusers.
- Enter the following SQL queries to grant access to all tables in the
sales
schema to thesales
group and access to all tables in themarketing
schema to themarketing
group:
Connect using federated authentication
You can pass the URL information in the connection profile two different ways:
- Specify the Amazon Redshift cluster endpoint with the port (for this use case, the driver doesn’t need to run the
DescribeClusters
API because everything is already available):
jdbc:redshift:iam//<<your redshift cluster endpoint>>/dev
- Specify the cluster ID and Region (for this use case, the driver calls the
DescribeClusters
API andGet-Cluster-Credentials
as additional steps):
jdbc:redshift:iam://<<cluster id:region>>/dev
The rest of the parameters are specified in the Extended Properties configuration.
For this walkthrough, you use the second method.
- Create a new connection profile in SQL Workbench/J.
- Choose Extended Properties.
- Provide the following values:
- idp_port –
443
- plugin_name –
com.amazon.redshift.plugin.AdfsCredentialsProvider
- ssl_insecure –
true
- idp_host – The host name for the AD domain, for example,
windows2016.adfsredshift.com
- preferred_role – The ARN of the SAML role you created in Part 1, for example,
arn:aws:iam::<AWSAccountID>:role/ADFZ-Production
(replace<AWSAccountID>
with your AWS Account) - AutoCreate –
true
Setting AutoCreate= True
creates the AD user Bob
in the Amazon Redshift database the first time without password settings. The authentication is temporary using the SAML configuration you completed in Part 1. You can also set the AutoCreate
property with IdP configuration. For more information, see JDBC and ODBC Options for Creating Database User Credentials.
- Choose Test.
You should see a successful connection, and can start using SQL Workbench/J to query Amazon Redshift cluster as user Bob. The message indicates Connection to endpoint successful. For instructions on addressing connection errors, see Troubleshooting connection issues in Amazon Redshift.
- With this connection profile, run the following query to test SAML authentication and query privilege control:
The following screenshot shows the output.
The output shows that the user bob@adfsredshift.com
was authenticated using AD FS. The user also joined the marketing group as enforced by the AD FS DbGroups
claim rule and the policy associated with the ADFZ-Production
role, which the user assumes during this session.
- To run the query against the
marketing
schema, enter the following code:
The following screenshot shows the output.
The output shows that AD user Bob
is part of the AD group RSDB-marketing
, which was mapped to the DB group marketing
. This database group had select access to the schema marketing
and all tables in that schema. Therefore, the user could successfully query the table through SQL Workbench/J.
- To run a query against the
sales
schema, enter the following code:
The following screenshot shows the output.
The output shows that Bob
is only part of the AD group RSDB-marketing
. Due to the way the claim rule is set up, Bob doesn’t have access to the database group sales
, and therefore the query returns with a permission denied error.
Cleaning up
To avoid incurring future charges, you can delete the resources by deleting the AWS CloudFormation stack. This cleans up all the resources from your AWS account that you set up in Part 1.
To delete the Amazon Redshift cluster you created, select the cluster and choose Delete from the Actions drop-down menu.
You’re prompted to take a final snapshot if you require the cluster for future testing or setup. Storing the snapshot incurs additional charges.
Conclusion
In the first of this two-part series, you simulated an on-premises AD and AD FS setup using a Windows 2016 Amazon Elastic Compute Cloud (Amazon EC2) instance, and tested authentication using a browser.
In this post, you set up an Amazon Redshift cluster and SQL Workbench/J client with an Amazon Redshift JDBC driver. You connected to the database cluster using a database user with administrator privileges to set up a test database environment, and used a federated user authentication with AD FS. In addition, you validated the configuration by running several queries to see how to control access to Amazon Redshift database objects by using groups and assigning users to specific groups using AD FS seamlessly.
About the Authors
Rajesh Francis is a Data Warehouse Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and works with customers to build scalable Analytic solutions.
Vaibhav Agrawal is an Analytics Specialist Solutions Architect at AWS. Throughout his career, he has focused on helping customers design and build well-architected analytics and decision support platforms.