AWS for Industries
AstraZeneca’s Drug Design Program Built using AWS wins Innovation Award
At the 20th anniversary of the Bio-IT World Conference, held in Boston in May 2022, AstraZeneca received the Bio-IT World Innovative Practice Award for its novel Augmented Drug Design Platform, built on AWS.
Biopharma R&D productivity and success rates have been an industry-wide challenge, with the drug discovery process often compared to finding a needle in a haystack. Companies worldwide are leveraging technological innovations and strategic initiatives to translate science into molecules more quickly, more cost-effectively, and with greater chances of success. AstraZeneca’s win comes in light of its efforts to significantly speed up new drug development by transforming drug design using technology. This has been made possible by AstraZeneca’s relentless focus on unifying R&D data to make it FAIR (findable, accessible, interoperable, and reusable), with forward-thinking approaches to empower data science and AI downstream. The result? A re-imagined drug discovery process across therapy areas.
“In the twenty months since the program’s inception, we’ve deployed these technologies against 70% of our small molecule projects. It has significantly impacted several areas including molecular ideation, library design decisions, synthetic route planning, patent research and writing, and has already contributed to accelerations in the discovery pipeline,” said Anna Berg Åsberg, Vice President R&D IT, AstraZeneca.
Breaking data silos
Transforming a global pharmaceutical company into a data-driven organization isn’t a simple task. “The challenge for AstraZeneca’s R&D,” Åsberg explains, “was to break down multi-modal and multi-source silos of chemical data, and generate actionable insights from these disjointed data pools.”
AstraZeneca uses AWS Database Migration Service (AWS DMS) to bring these silos together into a unified data hub with a hybrid architecture. As the foundational building block of AstraZeneca’s solution, the data hub extracts, ingests, standardizes, transforms, and centralizes data from numerous researchers, labs, on-premises data centers, and third parties for further analytics downstream. The solution uses Amazon Simple Storage Service (Amazon S3), an object storage service offering industry-leading scalability, data availability, security, and performance. It also uses Amazon Aurora PostgreSQL, a fully managed relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. The hub also gives researchers streamlined access with the throughput, scalability, and availability they need.
Figure 1: Architecture supporting a centralized data hub of chemistry data (from 2021 re:Invent presentation)
Overcoming capacity constraints
Developing computational approaches to narrow down the search for drug-target interactions, and identifying potential molecules with minimum error rates, holds the key to successful drug discovery. However, the availability of computing resources to handle massive datasets in real time has often limited computational scientists.
AstraZeneca uses Amazon Elastic Compute Cloud (Amazon EC2) to deliver a secure, reliable, high-performance, and cost-effective compute infrastructure. For scientists handling enormous datasets and complex algorithms, this far exceeds what is possible with an on-premises solution. Amazon Elastic Kubernetes Service (Amazon EKS), a managed container service that runs and scales the microservice applications, further accelerates time to insights. The fully managed capabilities of AWS mean that AstraZeneca can focus on delivering scientific outcomes instead of standard maintenance tasks. In addition, its use of AWS Graviton processors has resulted in up to a 20 percent performance improvement for Amazon Aurora and up to 35 percent in cost savings.
The ability to search and query databases for similar molecules is another important tool for scientists. AstraZeneca uses Amazon OpenSearch Service to allow scientists to perform similarity and substructure searches in seconds across billions of molecules. This is enabled by automatically creating molecular fingerprints when data is ingested into the hub; the fingerprints are then used for rapid searching. Traditionally, this capability might have been delivered using a relational database. Using Amazon OpenSearch Service has refined the process for AstraZeneca, providing rapid search and reducing support overhead.
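To make the fingerprint idea concrete, here is a minimal in-memory sketch of similarity search using the Tanimoto coefficient, a common similarity measure for bit-vector fingerprints. It is not AstraZeneca's implementation and does not use Amazon OpenSearch Service. The names `toy_fingerprint`, `FingerprintIndex`, and `FP_BITS` are hypothetical, and the fingerprint itself is a toy stand-in (hashed character n-grams of a SMILES string) rather than a chemically meaningful fingerprint from a cheminformatics toolkit.

```python
import zlib

FP_BITS = 1024  # assumed fingerprint width; the article does not state one


def toy_fingerprint(smiles: str, n: int = 3) -> int:
    """Toy stand-in for a molecular fingerprint (hypothetical helper).

    Hashes each character n-gram of the SMILES string to one bit of a
    fixed-width bit vector, stored as a Python int.
    """
    bits = 0
    for i in range(max(len(smiles) - n + 1, 1)):
        gram = smiles[i:i + n]
        # crc32 is deterministic across runs, unlike the built-in str hash
        bits |= 1 << (zlib.crc32(gram.encode("utf-8")) % FP_BITS)
    return bits


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: shared on-bits divided by total on-bits."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


class FingerprintIndex:
    """In-memory index that fingerprints molecules at ingestion time."""

    def __init__(self) -> None:
        self._fingerprints: dict[str, int] = {}

    def ingest(self, mol_id: str, smiles: str) -> None:
        # The fingerprint is computed once, on ingestion, and reused by searches.
        self._fingerprints[mol_id] = toy_fingerprint(smiles)

    def search(self, query_smiles: str, top_k: int = 3) -> list[tuple[str, float]]:
        query = toy_fingerprint(query_smiles)
        scored = [(tanimoto(query, fp), mol_id)
                  for mol_id, fp in self._fingerprints.items()]
        # Highest similarity first; ties broken by molecule id for determinism
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(mol_id, score) for score, mol_id in scored[:top_k]]


index = FingerprintIndex()
index.ingest("aspirin", "CC(=O)OC1=CC=CC=C1C(=O)O")
index.ingest("benzene", "c1ccccc1")
index.ingest("ethanol", "CCO")
results = index.search("CC(=O)OC1=CC=CC=C1C(=O)O", top_k=2)
```

The linear scan here would not reach billions of molecules; the point is only that precomputing fingerprints at ingestion turns each search into cheap bit operations, which a dedicated search service can then index and distribute.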
Figure 2: Architecture supporting the performant and scalable search capability across billions of molecules
Tailored solutions powered by machine learning
The data science being done is not just for data science’s sake; it enables a chain of events and analytics further downstream. Comprehensive data unification and improved compute capabilities give AstraZeneca the ability to unlock insights and make predictions in ways that were previously impossible.
Using AWS solutions has helped AstraZeneca systematically incorporate artificial intelligence (AI) and machine learning (ML) so that chemists can extract insights, accelerate drug discovery pipelines, and reduce time to market. “We have built sophisticated computational methods to predict what molecules we can make next and how to make them,” Åsberg says. “Traditionally, this process would take years. Now, chemists use AI to decide the best way to make the molecule in the shortest time.” In fact, 70 percent of AstraZeneca’s small molecule projects now use the company’s AI tools. “The chemists are trusting it, and they’re working with us to improve it every day,” says Åsberg.
The architecture below has been developed by AstraZeneca as a centralized platform for hosting and serving ML models for inference. Because the platform is used by multiple scientists at the same time, the main design criterion was scale, so that it could serve hundreds of models with inference runs against millions of molecules. As with the creation of molecular fingerprints, whenever molecules are added to or updated in the data hub, a number of ADME (absorption, distribution, metabolism, excretion) and toxicity predictions are made to check the potential impact of the molecule in the human body.
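The ingestion-triggered inference pattern described above can be sketched in a few lines. This is an illustration only, under stated assumptions: `ModelRegistry`, `DataHub`, and `upsert` are hypothetical names, and the two registered "models" are placeholder functions that return arbitrary scores, not real ADME or toxicity predictors. A production platform would dispatch to hosted model endpoints rather than call local functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A predictor maps a molecule (here a SMILES string) to a numeric score.
Predictor = Callable[[str], float]


@dataclass
class ModelRegistry:
    """Hypothetical registry holding many models behind one interface."""
    models: Dict[str, Predictor] = field(default_factory=dict)

    def register(self, name: str, predictor: Predictor) -> None:
        self.models[name] = predictor

    def predict_all(self, smiles: str) -> Dict[str, float]:
        # Run every registered model against the molecule.
        return {name: model(smiles) for name, model in self.models.items()}


@dataclass
class DataHub:
    """Hypothetical hub that re-runs inference on every add or update."""
    registry: ModelRegistry
    molecules: Dict[str, str] = field(default_factory=dict)
    predictions: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def upsert(self, mol_id: str, smiles: str) -> None:
        self.molecules[mol_id] = smiles
        # Predictions are refreshed whenever a molecule is added or changed.
        self.predictions[mol_id] = self.registry.predict_all(smiles)


registry = ModelRegistry()
# Placeholder scoring functions, NOT real ADME/toxicity models.
registry.register("placeholder_absorption", lambda s: len(s) / 100.0)
registry.register("placeholder_toxicity", lambda s: s.count("N") / 10.0)

hub = DataHub(registry)
hub.upsert("mol-1", "CCO")
hub.upsert("mol-1", "CCN")  # an update triggers fresh predictions
```

Keeping the models behind a single registry interface means new models can be added without changing the ingestion path, which is one plausible way to approach the "hundreds of models" requirement.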
Figure 3: Architecture supporting model hosting and inference
However, Åsberg cautions that building an AI solution is only half of the battle; the bigger challenge is to ensure people trust it, use it, and see the value. AstraZeneca’s close-knit association with AWS is helping embed AI throughout the entire process from discovery to delivery, not only making it easy to use but also navigating the change journey internally by earning scientists’ trust. Åsberg said of AWS, “They understand how we unite to support our research and development and brought the right service to us and the architectural guidance that helped us accomplish these incredible things.”
Global scale, individual needs
At AWS, we often talk about the art of what’s possible. This award is an ode to the outstanding work life science companies are doing to apply technology and strategic initiatives to transform the way they innovate across R&D, with the ultimate aim of bringing innovative medicines to patients faster. At AstraZeneca, we see this on full display. What stands out is the effort being made to shift processes and data culture to empower data science and AI, while innovating using comprehensive tools and pioneering mechanisms for drug discovery.
Read the recap of AWS’s participation at the Bio-IT World Conference. To go more in-depth about AstraZeneca’s Augmented Drug Design Platform and how it was built using AWS services, watch this re:Invent 2021 presentation.
Learn more about AWS for health.