Networking & Content Delivery

How Glovo migrated their self-managed VPN solution to AWS Client VPN

In this blog post, Glovo shares how they migrated their ~4000 TLS virtual private network (VPN) users from a self-managed OpenVPN solution running on Amazon Elastic Compute Cloud (Amazon EC2) to AWS Client VPN, integrating with OneLogin for authentication and authorization.

Amazon Web Services (AWS) Client VPN is a managed client-based VPN service that enables you to securely access your AWS resources and resources in your on-premises network. It’s an elastic service that automatically scales up or down based on demand. It also allows organizations to leverage their existing identity stores, for example OneLogin (used by Glovo), through Security Assertion Markup Language (SAML) 2.0 integration.

Glovo is a world leader in the delivery sector. Originally born in Barcelona in 2015, it’s primarily engaged in the home delivery of online food orders. Today, the app has a presence in 25 countries and more than 1,300 cities, with more than 150,000 partner restaurants and establishments. In addition to the best restaurants, it offers users all kinds of other establishments, including supermarkets, electrical, health and beauty, and gift stores, among others.

Glovo’s initial self-managed VPN solution

Previously, Glovo used self-managed OpenVPN on Amazon EC2 instances fronted by a public-facing Network Load Balancer (NLB) that exposed the service to their end users. An Amazon EC2 Auto Scaling group (ASG) handled scale in/out activities by tracking relevant system metrics, such as memory, disk, and CPU, and by checking the health of the VPN process running on each Amazon EC2 instance.

Challenges with the current self-managed solution

Although the architecture was scalable and resilient by leveraging at least two Availability Zones (AZs), it still required Glovo to self-manage the infrastructure, adding operational overhead and cost. Glovo was responsible for maintaining the VPN Amazon Machine Image (AMI) with any relevant software patches and/or operating system upgrades plus managing any new AMI version deployment.

Another significant challenge for Glovo was related to managing authentication and authorization for their VPN users:

  • Authentication: As their current VPN software lacked integrations with external identity stores, Glovo leveraged an internally-built Token Service for granting temporary user passwords with every login. VPN users needed to manually generate a new password with every login attempt. The VPN software also needed internal connectivity with the Token Service, which was achieved by using AWS Transit Gateway. (The details behind the internal Token Service architecture are beyond the scope of this blog.)
  • Authorization: In order to grant a user access to specific internal networks, a custom/home-built script was developed for effectively mapping VPN users to specific networks by leveraging Linux iptables. This script was executed as part of the Amazon EC2 VPN installation and needed subsequent updates with any new VPN user profile creation/modification.
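
To make the authorization approach above more concrete, here is a minimal sketch of how a user-to-network mapping could be turned into iptables rules. The original script is not shown in this post, so everything here is an assumption for illustration: the function name `build_forward_rules`, the idea that each VPN user has a fixed tunnel IP address, and the example addresses.

```python
import ipaddress


def build_forward_rules(user_ip, allowed_networks):
    """Return iptables commands that let one VPN client IP reach only its allowed networks.

    Hypothetical helper: assumes each VPN user is assigned a fixed tunnel IP.
    """
    src = ipaddress.ip_address(user_ip)  # raises ValueError on a malformed address
    rules = []
    for cidr in allowed_networks:
        net = ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
        rules.append(f"iptables -A FORWARD -s {src} -d {net} -j ACCEPT")
    # Anything not explicitly allowed for this client is dropped.
    rules.append(f"iptables -A FORWARD -s {src} -j DROP")
    return rules


# Example with made-up addresses.
for rule in build_forward_rules("10.8.0.10", ["10.20.0.0/16", "10.30.4.0/24"]):
    print(rule)
```

A script along these lines has to be re-run whenever a user profile changes, which illustrates the maintenance burden described above.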

The following diagram (Figure 1) depicts Glovo’s initial VPN architecture and highlights the points described earlier, including the authentication and authorization flows:

Figure 1: Glovo’s initial self-managed VPN architecture

Designing for growth with AWS Client VPN

The new VPN architecture design tenets were defined based on the previous list of requirements and challenges:

  • Infrastructure management and scalability: Client VPN is a regional service, and Glovo deployed Client VPN endpoints associated with at least two AWS AZs. Because it is an AWS managed service, there is no infrastructure to provision, maintain, or scale. The service is also able to scale beyond ~4000 VPN users.
  • Authentication: The Token Service built at Glovo was not originally designed for VPN use and Glovo would rather manage VPN identities through OneLogin by using single sign-on (SSO), as they do with other applications. This also removes any manual login task from VPN users as they now use their unique OneLogin credentials.
  • Authorization: Glovo is now using the OneLogin user role (referred to as the “memberOf” attribute within the SAML 2.0 assertion) to define user-to-network access. This attribute is then mapped to Client VPN Authorization Rules by using “Access Group ID”.
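
The mapping from a role to Client VPN authorization rules can be sketched as follows. This is an illustration only, not Glovo’s configuration (which is managed with Terraform). The group names, CIDRs, and endpoint ID are made up, and `build_authorization_rules` is a hypothetical helper. The dictionary keys mirror the parameter names of the boto3 EC2 client call `authorize_client_vpn_ingress`, to which each dictionary could be passed as keyword arguments.

```python
def build_authorization_rules(endpoint_id, group_networks):
    """Build one authorization-rule request per (access group, CIDR) pair.

    group_networks maps an Access Group ID (the value expected in the SAML
    "memberOf" attribute) to the list of CIDRs that group may reach.
    """
    requests = []
    for group_id, cidrs in sorted(group_networks.items()):
        for cidr in cidrs:
            requests.append(
                {
                    "ClientVpnEndpointId": endpoint_id,
                    "TargetNetworkCidr": cidr,
                    "AccessGroupId": group_id,
                    "Description": f"{group_id} -> {cidr}",
                }
            )
    return requests


# Hypothetical groups and networks.
rules = build_authorization_rules(
    "cvpn-endpoint-0123456789abcdef0",
    {"Finance": ["10.40.0.0/24"], "Engineering": ["10.20.0.0/16", "10.30.0.0/16"]},
)
for rule in rules:
    print(rule["AccessGroupId"], rule["TargetNetworkCidr"])
```

Note that the number of rules grows with the number of group and CIDR pairs.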

The following diagram (Figure 2) depicts Glovo’s proposed VPN architecture:

Figure 2: Glovo’s new architecture based on AWS Client VPN

In addition to covering the design tenets, the Glovo team also noted other enhancements provided by Client VPN:

  • Split-tunneling: Instead of sending all the VPN user traffic (even Internet traffic) through the VPN, now only intended VPC traffic is routed to AWS.
  • Agility and automation: Glovo leverages Terraform for managing Client VPN components (client endpoints, authorization rules). Glovo teams are now empowered to submit requests for additional endpoint/rule configurations. These requests are reviewed by relevant Security teams at Glovo and deployed once approved.
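
The effect of split-tunneling can be illustrated with a small routing check: only destinations that fall inside a route pushed by the VPN endpoint go through the tunnel, and everything else uses the client’s local internet connection. The sketch below is a simplified model of that decision, not the VPN client’s actual implementation. The function name and the CIDRs are assumptions.

```python
import ipaddress


def goes_through_vpn(destination, vpn_routes):
    """Return True if the destination IP matches any route pushed by the VPN endpoint."""
    dest = ipaddress.ip_address(destination)
    return any(dest in ipaddress.ip_network(route) for route in vpn_routes)


# Hypothetical VPC CIDRs advertised to the client when split-tunnel is enabled.
vpc_routes = ["10.20.0.0/16", "10.30.0.0/16"]

print(goes_through_vpn("10.20.1.5", vpc_routes))      # internal VPC address
print(goes_through_vpn("93.184.216.34", vpc_routes))  # public internet address
```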

Another architecture decision made by Glovo was about using separate Client VPN endpoints for different VPN user groups. This configuration provides some benefits:

  • VPN user group segmentation as separate authorization rules are created for each VPN Endpoint
  • Reducing blast radius in case an undesired authorization rule or VPC endpoint configuration is deployed
  • Facilitating VPN user group traceability. Each Client VPN endpoint source-NATs VPN traffic to its own elastic network interfaces (ENIs) in the subnet. Traffic from each endpoint is therefore traceable beyond the VPC, whereas before all the VPN user groups were source-NATed to the same Amazon EC2 instance ENI. For additional visibility into individual VPN users within a group, Glovo is leveraging connection logs with each endpoint.

Configuration steps

Now that we have described Glovo’s challenges and requirements, the following section covers the configuration steps in more detail.

AWS Client VPN and OneLogin integration via SAML 2.0

You must establish trust between the AWS service (Client VPN) and your Identity Provider (IdP), which is OneLogin in Glovo’s implementation. This is done by exporting application-specific metadata from the IdP and uploading it into your AWS account. Refer to Authenticate AWS Client VPN users with SAML and How to integrate AWS Client VPN with Azure Active Directory for more details as applicable.

For OneLogin, Glovo followed these steps:

  1. In the OneLogin administrator console, in the Applications tab select Add App. In the Find Applications window, locate the AWS ClientVPN.
Figure 3: AWS Client VPN app selection in OneLogin

  2. Provide a Name and Description for the application and click.
Figure 4: New AWS Client VPN local configuration in OneLogin

  3. Back in the Applications tab, select the newly created application. Select the SSO tab from the left side menu and ensure the Sign on method is SAML2.0. Download the metadata file by selecting More Actions and SAML Metadata on the top right corner.
Figure 5: AWS Client VPN app configuration in OneLogin

Note that Issuer URL, SAML 2.0 Endpoint and SLO Endpoint should all be populated by OneLogin – they refer to the configuration endpoints used by OneLogin and AWS to exchange information. (These fields have been obfuscated in this blog.)

  4. The Client VPN requires a unique identity provider definition in AWS. Open the AWS Identity and Access Management (IAM) console and select Identity Providers. Fill in the relevant information while uploading the metadata file from OneLogin.
Figure 6: Configuring a new IAM Identity provider in the AWS Console
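
Before uploading the metadata file, it can be useful to confirm that the Issuer URL, SAML 2.0 Endpoint, and SLO Endpoint mentioned in the steps above are actually present. The following is a minimal sketch that reads those values from a SAML 2.0 IdP metadata document with Python’s standard library. The sample XML and its `idp.example.com` URLs are invented placeholders, not OneLogin’s real output, and `summarize_idp_metadata` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the SAML 2.0 metadata specification.
MD_NS = {"md": "urn:oasis:names:tc:SAML:2.0:metadata"}

# Invented sample; a real file would be the one downloaded from the IdP.
SAMPLE = """<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata"
    entityID="https://idp.example.com/saml/metadata/123">
  <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <SingleLogoutService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
        Location="https://idp.example.com/slo/123"/>
    <SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://idp.example.com/sso/123"/>
  </IDPSSODescriptor>
</EntityDescriptor>"""


def summarize_idp_metadata(xml_text):
    """Extract the issuer (entityID) and the SSO/SLO endpoint URLs from IdP metadata."""
    root = ET.fromstring(xml_text)  # the root element is expected to be EntityDescriptor
    sso = root.findall("md:IDPSSODescriptor/md:SingleSignOnService", MD_NS)
    slo = root.findall("md:IDPSSODescriptor/md:SingleLogoutService", MD_NS)
    return {
        "issuer": root.get("entityID"),
        "sso_endpoints": [el.get("Location") for el in sso],
        "slo_endpoints": [el.get("Location") for el in slo],
    }


print(summarize_idp_metadata(SAMPLE))
```

An empty `sso_endpoints` list or a missing issuer would suggest the wrong file was exported.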

AWS Client VPN Self-Service Portal and OneLogin integration through SAML 2.0

If you enabled the self-service portal for your Client VPN endpoint, you can provide your clients with a self-service portal URL. Clients can access the portal in a web browser and use their user-based credentials to log in. In the portal, clients can download the Client VPN endpoint configuration file (there is one configuration file per Client VPN endpoint) and the latest version of the AWS provided client. Note that for SAML/SSO, you must use AWS provided client v1.2.0 or later.

For configuring this portal, both a new IAM Identity Provider definition and a separate OneLogin SAML configuration are needed.

Starting with the OneLogin configuration:

  1. In the OneLogin administrator console, in the Applications tab select Add App. In the Find Applications window, locate the AWS ClientVPN Self-Service Portal.
Figure 7: AWS Client VPN Self-Service app selection in OneLogin

  2. Provide a Name and Description for the application and click.
Figure 8: New AWS Client VPN Self-Service local configuration in OneLogin

  3. Back in the Applications tab, select the newly created application. Select the Configuration tab from the left side menu and enter the Client VPN Self-Service SAML endpoint https://self-service.clientvpn.amazonaws.com/api/auth/sso/saml as the Assertion Consumer Service (ACS) URL. Save the changes by selecting Save.
    Note that this is the endpoint where the Service Provider (Client VPN self-service) expects to receive SAML assertions after a user successfully authenticates.
Figure 9: AWS Client VPN Self-Service endpoint configuration

  4. In the SSO tab, ensure the Sign on method is SAML2.0 and download the metadata file by selecting More Actions and SAML Metadata on the top right corner.
Figure 10: AWS Client VPN Self-Service app configuration in OneLogin

  5. Back in the IAM console, define a new identity provider and attach the new metadata file.
Figure 11: Configuring a new IAM Identity provider in the AWS Console
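
With both IAM identity providers defined, a Client VPN endpoint can reference them for federated authentication. The sketch below only assembles the request parameters and makes no AWS calls. The keys follow the parameter names of the boto3 EC2 client call `create_client_vpn_endpoint`. The ARNs, CIDR, and log group name are made-up placeholders, `build_endpoint_request` is a hypothetical helper, and Glovo’s own deployment is done with Terraform rather than with this code.

```python
def build_endpoint_request(client_cidr, server_cert_arn, saml_arn,
                           self_service_saml_arn, log_group):
    """Assemble keyword arguments for creating a SAML-federated Client VPN endpoint."""
    return {
        "ClientCidrBlock": client_cidr,
        "ServerCertificateArn": server_cert_arn,
        "AuthenticationOptions": [
            {
                "Type": "federated-authentication",
                "FederatedAuthentication": {
                    # IAM identity provider used when connecting to the VPN.
                    "SAMLProviderArn": saml_arn,
                    # Separate IAM identity provider for the self-service portal.
                    "SelfServiceSAMLProviderArn": self_service_saml_arn,
                },
            }
        ],
        "ConnectionLogOptions": {"Enabled": True, "CloudwatchLogGroup": log_group},
        "SplitTunnel": True,
        "SelfServicePortal": "enabled",
    }


# All values below are placeholders.
request = build_endpoint_request(
    client_cidr="172.16.0.0/22",
    server_cert_arn="arn:aws:acm:eu-west-1:123456789012:certificate/example",
    saml_arn="arn:aws:iam::123456789012:saml-provider/onelogin-client-vpn",
    self_service_saml_arn="arn:aws:iam::123456789012:saml-provider/onelogin-self-service",
    log_group="/example/client-vpn/connections",
)
print(request["AuthenticationOptions"][0]["Type"])
```

Keeping the two provider ARNs as separate parameters reflects the two separate OneLogin applications configured above.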

Once configured, when the user opens the AWS provided client on their device and initiates a connection to the Client VPN endpoint, they will be federated through OneLogin as described earlier. VPN users entering their respective VPN endpoint IDs in the self-service portal are authenticated through OneLogin. (Refer to the Authentication workflow in the Client VPN documentation.)

Migration Steps and Lessons Learned

The following steps were taken, at a high level, by the Glovo Edge Team to facilitate the migration between both VPN implementations, which coexisted for some time in order to eliminate any customer impact:

1) Once all the Client VPN infrastructure was in place, the Glovo Edge Team validated the following for every VPN Endpoint:

  • SAML integrations for both Client VPN and self-service portal access
  • Authorization Rules applied and working as intended
  • Terraform deployments for Client VPN configuration

2) Internal onboarding documentation for new VPN users was prepared, making sure every configuration item was covered. Aspects beyond configuration, for example how to request access to a new remote network for a department, were also covered.

3) After sending internal communications, technical departments (roughly 500 people) were onboarded into the new VPN service. Onboarding technical users first helped with overall troubleshooting and with fixing configuration inconsistencies caused by ad-hoc user configurations. Glovo also shared a lesson learned while working with SAML configurations:

  • SAML assertions and special characters. A small percentage of user departments at Glovo were mistakenly created with special characters, such as blank spaces at the end, which were not visible in the OneLogin user interface. This caused issues with the SAML assertion and the Client VPN authorization rules. Glovo used tools like SAML Tracer to help troubleshoot these issues.

4) A few weeks later, the remaining employees were onboarded into the service. During this time, both self-managed and Client VPN implementations coexisted.

5) An official cutover date was announced. However, the Glovo Edge Team didn’t decommission the legacy service on the cutover date and instead defined an internal roadmap for removing access to the legacy self-managed VPN by specific department and date. Each week after the cutover date, one or more departments were migrated off the old VPN service onto Client VPN.

6) After all the departments were migrated off the legacy VPN, the migration was considered completed and all the legacy infrastructure was decommissioned.
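
The trailing-space issue described in step 3 is hard to spot in a user interface but simple to detect programmatically. The following is a minimal sketch, assuming the user attributes have been exported into a dictionary. The export format, the attribute name `department`, and the sample users are all hypothetical.

```python
def find_whitespace_issues(user_attributes):
    """Return (user, attribute, repr(value)) for values with leading or trailing whitespace."""
    issues = []
    for user, attrs in user_attributes.items():
        for name, value in attrs.items():
            if value != value.strip():
                # repr() makes the otherwise invisible whitespace visible in the report.
                issues.append((user, name, repr(value)))
    return issues


# Hypothetical export of directory attributes.
users = {
    "alice": {"department": "Finance "},
    "bob": {"department": "Engineering"},
}
for user, attribute, shown in find_whitespace_issues(users):
    print(f"{user}: {attribute} = {shown}")
```

Because authorization rules match the group value exactly, a value such as `'Finance '` would not match a rule defined for `Finance`.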

Final Outcome and Next Steps

In this post, we have covered how Glovo effectively migrated ~4000 VPN users from their self-managed VPN solution to Client VPN to access AWS resources.

Looking ahead, Glovo is already considering two potential enhancements for their architecture:

  • Consolidating the number of Client VPN endpoints to avoid running into service quotas, such as the number of authorization rules defined for each VPN endpoint. VPC CIDR allocation and service provisioning were not optimal at Glovo in the early days, which resulted in too many authorization rules being needed for each VPN endpoint. Glovo is looking at refactoring their network, which will reduce the number of authorization rules needed and, in turn, the number of VPN endpoints.
  • In April 2023, AWS announced the general availability of AWS Verified Access, which enables customers to provide VPN-less, secure access to their corporate applications. Glovo is looking at this service for certain use cases in which some users just require HTTP connectivity to an internal application hosted on AWS, as could be the case for certain Finance or Human Resources employees.
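
The first enhancement relies on the fact that contiguous CIDR blocks can be summarized into fewer, larger ones, and each summarized block needs only one authorization rule per group. The sketch below shows the idea using Python’s standard `ipaddress` module. The CIDRs are invented and do not represent Glovo’s network, and `consolidate_cidrs` is a hypothetical helper.

```python
import ipaddress


def consolidate_cidrs(cidrs):
    """Collapse adjacent or overlapping CIDR blocks into the smallest equivalent list."""
    networks = [ipaddress.ip_network(c) for c in cidrs]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


# Four rules' worth of target networks...
before = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/23", "10.1.0.0/16"]
# ...can be expressed with fewer entries when the ranges are contiguous.
after = consolidate_cidrs(before)
print(len(before), "->", len(after), after)
```

Summarization only helps when allocations are contiguous, which is why a network refactor is a prerequisite for reducing the rule count.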

About the Authors

Carla Urrea Blazquez

Carla Urrea Blazquez is a Software Engineer at Glovo. She is part of the Glovo Edge Team, which designs and operates all public connectivity to Glovo’s AWS workloads, including employee access through VPN.

Miguel Rodriguez Vazquez

Miguel Rodriguez Vazquez is a Sr. Solutions Architect at Amazon Web Services (AWS). Miguel joined AWS in 2015 and has 10+ years of experience in networking and overall architecture. As a Solutions Architect, he helps customers architect their solutions in AWS.