Front-End Web & Mobile
Creating serverless GraphQL APIs from RDS databases with AWS AppSync and PostGraphile
August 21, 2024: We recommend using Amplify Gen 2 to connect your app to existing MySQL and PostgreSQL database.
GraphQL is a query language for APIs that provides an understandable description of the data in your API, and that allows clients to ask for data in the shape that they need it. GraphQL helps developers implement applications faster, by giving developers the ability to query multiple databases, microservices, and APIs with a single GraphQL API endpoint. As a serverless service that scales on demand, AWS AppSync makes it easy for developers to create production-ready GraphQL APIs that can connect to many different types of datasources in their AWS account, such as PostgreSQL databases.
PostgreSQL is one of the most popular SQL databases and is used to power many applications. Developers often want to expose their data to their applications, without making their database publicly available. The Backend For Frontend (BFF) pattern is often used to deploy application-specific APIs that allow clients to access the data that they need. GraphQL APIs can be used to make the right data available to apps in the necessary format. In a previous post, we had shown how to create an AWS AppSync API that uses an existing SQL database as a datasource by leveraging AWS Lambda functions and Amazon RDS Proxy. However, generating a schema from an existing database is a complicated process that is hard to solve manually.
In this post, we present a solution that leverages PostGraphile to automatically generate an AppSync compliant schema from PostgreSQL tables, and uses Lambda functions to resolve GraphQL queries against a PostgreSQL database in Amazon Relational Database Service (Amazon RDS). The solution is serverless, and it can be deployed in a few clicks. It uses the AWS Cloud Development Kit (AWS CDK), doesn’t require writing any code, supports subscriptions, and works with any PostgreSQL database, such as Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL.
Solution overview
Visit the GitHub repo to deploy the solution.
The diagram above provides an overview of the deployed solution. It consists of one AWS AppSync API and two Lambda functions: a provider
function, and a resolver
function. Both functions use enhanced VPC networking to connect to the RDS Proxy. The provider
function is responsible for generating the GraphQL schema, updating the AppSync API, and generating a Lambda layer with the cached schema that is attached to the resolver
function. The resolver function is a generic function that resolves GraphQL operations from AppSync. The resolver
function behavior depends on the cached schema in its attached Lambda layer.
Both functions leverage PostGraphile to interact with the database. PostGraphile is an open source tool that can automatically detect tables, columns, indexes, relationships, views, types, functions, comments, and more. It can generate a schema from a PostgreSQL database and resolve GraphQL queries.
Here is how the solution works:
- Start by deploying the CDK-based solution. The solution creates an AWS AppSync API with a datasource that uses the
resolver
Lambda function, and an AppSync function that uses that datasource. - Once the solution is deployed, a user runs the
provider
function to analyze the RDS PostgreSQL database and generate the GraphQL schema. - The
provider
function retrieves schema information from RDS database using PostGraphile. - The provider updates the Layer function attached to the resolver Lambda function and updates the AWS AppSync API. It updates the schema, and properly sets up the queries, mutations, and subscriptions. Note that a user can repeat Step 1 at any time (e.g., after a database schema change) to update the AWS AppSync API definition.
- The AWS AppSync API is now ready to process requests. A GraphQL request is made.
- AWS AppSync authorizes the request using the configured Authorization Mode (API KEY, Cognito User Pool, etc.).
- AWS AppSync resolves the request by calling the attached Direct Lambda Resolver. The identity of the user is included in the request to the resolver Lambda function.
- The Lambda function resolves the query using the PostGraphile schema and RDS database.
Getting started
To get started, we need the following to enable connections to our database:
- an RDS Postgres database
- an RDS Proxy associated with our RDS Postgres database
- we use AWS Identity and Access Management (IAM) authentication for databases
- and we securely store credentials in AWS Secrets Manager.
- at least one private subnet with an NAT that our Lambda function ENIs will be deployed into, and a VPC Security Group
- this security group must be an allowed source of traffic for our RDS Proxy security group
We also must know the following information about our Postgres database:
- database to connect to
- schema(s) of interest (containing our tables and functions)
- username/role to use to execute queries. This role should have the scoped-down privileges required to access the schema(s). See this AWS post for more details on security best practices for Amazon RDS for PostgreSQL. The
provider
uses thepostgres
role for configuration. Theresolver
uses ourprovided
username/role to run queries.
We can deploy the solution by visiting this GitHub repo. The repo also provides a helper CDK app that we can use to deploy a sample VPC configured with a sample RDS PostgreSQL database, RDS Proxy, and sample data.
To deploy the solution, we run the following commands after updating with our account values.
Then, from the appsync-with-postgraphile directory, we run the update command to update our AWS AppSync API with the latest database schema definition.
npm run update
Interacting with the API
To showcase the functionality, let’s use a deployed solution generated from the sample RDS schema provided in the GitHub repo. The schema defines a person
table and a post
table (partially shown as follows). A person can own multiple posts.
Once deployed, we can interact with the API directly from the AWS AppSync Console. The generated API lets us interact with the different data types. With PostGraphile, relationships between tables are automatically discovered. For example, we can query for people, and we can get the list of posts associated with each person (through the reference on author_username
) in a single query. The resolver
Lambda functions uses the PostGraphile library to execute an optimized SQL query against our database to resolve the GraphQL query.
Leveraging row-level security
PostgreSQL has strong row-level security policies that we can utilize with PostGraphile and AWS AppSync. In the solution, when a GraphQL query is made, the identity
object from AWS AppSync’s $context variable is passed to PostGraphile, which is then made available through current_setting(...)
within PostgreSQL. Then, the identity values can be leveraged by row level policies.
We can leverage this in our solution by changing the default authorization mode of the API from API_KEY
to AMAZON_COGNITO_USER_POOLS
, and specifying the Cognito User Pool to use for authorization. We can do this by visiting the Settings page of our API in the AppSync console. Our demo person table is preloaded with a person with the username “johndoe”. We create a “johndoe” user in our Cognito User Pool and log in as that user on the AppSync console Queries page. Now, when we make a request, the following information is made available to PostgreSQL current session:
In our schema, the following policy restricts updates to rows in the person
table.
Now, when we try to update a person using the updatePerson
mutation, the mutation is only allowed if the requester’s username (appsync.identity_username
) matches the username value of the row that is being edited.
We can also leverage identity to implement custom functions. For example, the current_person
function uses the appsync.identity_username
setting to retrieve the information in the person table about the user making the request.
The current_person
function is available as a query in the schema as the currentPerson
query field.
Subscriptions and Pub/Sub features
One of the more challenging GraphQL features to implement is subscriptions. AWS AppSync provides support for subscriptions out of the box, so there’s no specific implementation required to get Pub/Sub features from the API. Each generated subscriptions supports a filter argument that clients can use to filter subscriptions using AppSync’s Enhanced Subscription Filtering.
The following is an example of filtering subscriptions where the value of the about field begins with “about”.
With Enhanced Subscription Filtering, we can also add filtering on the server side. For example, we can filter subscriptions to restrict access based on the identity of the user making the request by updating the onUpdatePerson
response mapping template.
Conclusion
AWS AppSync is a service that allows developers to build scalable, serverless GraphQL APIs. AWS AppSync offers automatic Pub/Sub functionality with subscriptions that we can use to implement real-time features in our applications. AWS AppSync’s strong authorization and security feature allows developers to easily authorize requests and implement rules to restrict data access to the right identities. AWS AppSync’s integrated Query editor in the AWS AppSync Console makes it simple to interact and test APIs. By combining AWS AppSync with PostGraphile, developers can unlock the data that lives in their RDS PostgreSQL databases and make it accessible to their application without having to write schemas manually or implement custom business logic. This solution removes the heavy-duty work that is often needed to create custom APIs for databases. Furthermore, it allows developers to get started with a serverless AWS AppSync GraphQL API in a manner of minutes. Get started today by visiting the solution in the GitHub repo.