WeRide Speeds Autonomous Driving Machine Learning Model Training from Weeks to 12 Hours on AWS

2020

WeRide, founded in 2017, is an intelligent mobility company driven by artificial intelligence and centered on driverless technology, aiming to build a Level 4 (L4) fully autonomous driving system for the Chinese market and provide convenient and reliable new mobility services for the public.

Headquartered in Guangzhou, with R&D departments in Beijing, Shanghai, Anqing, and Silicon Valley, WeRide employs 300 people worldwide, 70 percent of whom are R&D engineers. Since the Renault-Nissan-Mitsubishi alliance became the strategic lead investor in its Series A funding round in 2019, WeRide has had a head start among L4 autonomous driving startups in competing for investment from global carmakers. At present, WeRide operates a driverless fleet of over 100 vehicles. As of July 2020, its total autonomous driving mileage had exceeded 2.6 million kilometers, and it has set other notable records in China, including the first road test in torrential rain, a successful drive through a 1.5-kilometer tunnel without GPS signals, and the fully open operation of Robotaxi. Driven by its goal of building a stable and reliable driverless system, WeRide is committed to gradually piloting fully autonomous driving over the next 3 years and steadily broadening the operation of Robotaxi.


"The abundant functions and professional enterprise support services of AWS help us quickly build the industry-leading automatic driving training system, and shorten the model training time from 1-2 weeks to 12 hours."

Eric Huo
Data Team Director, WeRide

Challenges

As a well-known mobility brand with an L4 autonomous driving system, WeRide has industry-leading full-stack software and hardware solutions, including a high-precision mapping and localization system, sensors, planning and control functions, simulation capabilities, data, and its unique L4 autonomous driving vehicles. Every day, staff in WeRide's R&D and Operation departments collect large volumes of road test data for training and simulating autonomous driving models. With the rapid expansion of those departments, the sharp rise in the number and mileage of autonomous vehicles, and the growing accumulation of test data, processing terabytes of sensor data every day requires more flexible and scalable storage capacity and computing power to complete model training and manage the data lake.

Rapid growth places high demands on WeRide's IT system in three main ways. First, the system must be highly scalable, so that server capacity, storage space, and computing power can be provisioned quickly enough to support swift product deployment. Second, the stability, reliability, and safety of the system must be maintained as the system scales. Third, WeRide must lower the baseline O&M expense of running the system.

"Since WeRide is a startup company limited in staff and IT resources, we hope to devote the human and material resources into the research and development of core technologies as much as possible," said Eric Huo, data team director at WeRide. To cope with these challenges and promote continuous business innovation, WeRide decided to use Amazon Web Services (AWS) to deploy a data processing platform and AI platform and build a safe, reliable, and manageable back-end processing system that can be rapidly expanded with the abundant functions and services of AWS.

Why AWS?

WeRide selected AWS over other cloud service providers mainly for its strong reputation in the industry, excellent global operation experience, extensive cloud service technology stack, and deep experience in professional enterprise services.

AWS offers comprehensive, fully featured services that meet WeRide's technical needs. For example, as an innovator in the autonomous driving sector, WeRide collects data from millions of kilometers of drive tests and uses AI/ML to process, analyze, and label the data, which requires significant computational power and vast amounts of storage. Amazon Simple Storage Service (Amazon S3) and Amazon S3 Glacier provide virtually unlimited storage capacity, which can be expanded at any time as the business requires. Amazon EC2 P3 instances provide high-performance computing power in the cloud, supporting up to eight NVIDIA V100 Tensor Core GPUs and up to 100 Gbps of network throughput for machine learning applications. P3 instances achieve mixed-precision performance of up to 1 petaflop, which significantly accelerates ML training and enables data scientists and machine learning engineers to iterate faster, train more models, and improve autonomous driving accuracy.
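As a rough check on the "up to 1 petaflop" figure, the arithmetic can be sketched as follows. The per-GPU value of 125 teraflops is an assumption taken from NVIDIA's published peak Tensor Core figure for the V100; it does not come from this case study, and sustained training throughput is typically well below peak.

```python
# Back-of-the-envelope aggregate peak throughput for one P3 instance.
# ASSUMPTION: 125 TFLOPS peak mixed-precision per V100 (NVIDIA's
# published peak figure, not stated in this case study).
V100_PEAK_TENSOR_TFLOPS = 125
GPUS_PER_INSTANCE = 8  # "up to eight NVIDIA V100 Tensor Core GPUs"


def aggregate_petaflops(gpus: int, tflops_per_gpu: float) -> float:
    """Return the combined peak throughput in petaflops."""
    return gpus * tflops_per_gpu / 1000  # 1 petaflop = 1000 teraflops


peak = aggregate_petaflops(GPUS_PER_INSTANCE, V100_PEAK_TENSOR_TFLOPS)
print(f"Peak mixed-precision throughput: {peak} petaflops")
```

This is a theoretical upper bound only; it says nothing about the throughput WeRide actually observed.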

"We used to build a high-performance machine learning system by ourselves before, which not only required a lot of staff but also made it difficult to achieve the scalability and flexibility needed to meet the requirement of rapid business development. With Amazon EC2 P3 instances, we can quickly build a distributed machine learning cluster to gain sufficient computing power and greatly shorten the training time of automatic driving model. It takes 1-2 weeks on average for the industry to complete a training model at present, while we only need 12 hours on AWS," says Huo. In addition, when the team needs to upload terabytes of data to the AWS Cloud, AWS Snowball can complete the task, and, when more computing power is required for simulation and demonstration, the team can leverage Amazon EC2 Spot Instances to meet the system requirements while saving on cost.
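The training-time claim can be turned into a speedup factor with simple arithmetic. The sketch below uses only the figures quoted above (1-2 weeks versus 12 hours) and assumes both refer to continuous wall-clock time; the function name is illustrative.

```python
# Speedup implied by going from 1-2 weeks of training to 12 hours.
# ASSUMPTION: both figures are continuous wall-clock time.
HOURS_PER_WEEK = 7 * 24


def speedup(baseline_weeks: float, new_hours: float) -> float:
    """Return how many times faster the new training run is."""
    return baseline_weeks * HOURS_PER_WEEK / new_hours


low = speedup(1, 12)   # baseline of one week
high = speedup(2, 12)  # baseline of two weeks
print(f"Implied speedup: {low:.0f}x to {high:.0f}x")
```

Under these assumptions, the quoted figures imply a speedup of roughly 14 to 28 times.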

In addition, the AWS Enterprise Support team has extensive experience in enterprise service and offers customers different levels of technical support at different stages, according to their specific requirements. "Although members in our technical team have considerable knowledge and experience in the AWS Cloud, services from AWS Enterprise Support still help us greatly," says Huo. AWS Enterprise Support not only provides WeRide with technical support to promptly resolve problems encountered in its applications but also helps WeRide with architecture design, machine learning cluster building, and cost control. It also shares AWS best practices from its experience in the field of autonomous driving.

Benefits

In early 2019, WeRide began to deploy data processing and machine learning platforms on AWS. Specifically, Amazon EC2 P3 instances are used to quickly build a distributed machine learning cluster, with terabytes of data captured by sensor-laden vehicles transmitted to the AWS Cloud through AWS Snowball for model training; Amazon EMR, Amazon Relational Database Service (Amazon RDS), Amazon DynamoDB, Amazon Aurora, and other services are used to build a data lake and complete various data analysis and processing tasks; and Amazon CloudWatch, AWS CloudTrail, and other services are used for system operation and maintenance management. As a result, WeRide has successfully built a safe, stable, and flexible machine learning system. At present, the AWS services used by WeRide mainly include Amazon EC2, Amazon S3, AWS Snowball, Amazon Elastic Container Registry (ECR), Amazon EMR, Amazon RDS, Amazon DynamoDB, Amazon Aurora, Amazon Elastic Block Store (EBS), Amazon CloudWatch, AWS CloudTrail, AWS Direct Connect, and AWS Enterprise Support services.

Building on AWS has benefited WeRide in three main ways. First, the deployment efficiency of the WeRide business system has greatly improved, and the time for new business deployment and response has been shortened to weeks, so that the technical development team's demands for computing and storage can be met more quickly. Second, the usage and O&M costs of IT resources have fallen: the total cost of ownership (TCO) of the system was reduced by one-third, and O&M efficiency improved by 50 percent. Third, the overall security and reliability of the system have improved. The deployment of AWS across numerous Regions and Availability Zones ensures the overall availability and data persistence of the system, and the abundant security management functions of AWS enable WeRide to conveniently build its security management system.

At present, the machine learning system that WeRide has deployed on AWS is on par with the best in the industry, and model training that usually takes 1-2 weeks can be completed in only 12 hours on WeRide's new system. In the future, WeRide plans to deploy more AWS solutions to support additional business-related computing tasks. "We pursue efficiency. Choosing AWS enables us not only to quickly complete the system deployment, but also to ensure the stability, reliability, and security of the system," says Huo.

Learn more about AWS for Automotive: aws.amazon.com/automotive

WeRide’s Autonomous Driving on AWS

WeRide's Autonomous Driving Technology on AWS (3:55)

About WeRide

WeRide, founded in 2017, is an intelligent travel company driven by artificial intelligence and centered on driverless technology, aiming to build a Level 4 fully autonomous driving system for the Chinese market and provide convenient and reliable new travel services for the public.

Benefits of AWS

  • Increased deployment efficiency
  • Reduced TCO by one-third and improved O&M efficiency by 50 percent
  • Improvements in overall security and reliability
  • Reduced machine learning model training time from 1-2 weeks to 12 hours

AWS Services Used

Amazon S3

Amazon Simple Storage Service (Amazon S3) is an object storage service that provides industry-leading scalability, data availability, security and performance.


Amazon EC2 P3 Instances

Amazon EC2 P3 Instances can offer high-performance computing in the cloud, support up to eight NVIDIA® V100 Tensor Core GPUs, and provide network throughput of up to 100 Gbps for machine learning and HPC applications.


Amazon RDS

Amazon Relational Database Service (Amazon RDS) enables you to easily set up, operate, and scale relational databases in the cloud.


Amazon EC2 Spot Instances

Amazon EC2 Spot Instances allow you to take advantage of unused EC2 capacity in the AWS Cloud.



Get started

Companies of all sizes in all sectors and industries are using AWS to transform their businesses every day. Contact our experts and begin your journey to the AWS Cloud today.