AWS for Industries
Migrate and Archive data for ADAS workloads on AWS
Background
Autonomous driving and advanced driver assistance systems (ADASs) require the processing of large and complex workloads in near real time. These workloads typically include tasks such as object detection and classification, lane detection, sensor fusion, path planning, and decision-making. The data from various sensors needs to be processed rapidly to enable the vehicle to perceive its environment, understand the context, and make decisions. ADAS workloads are constantly generating massive amounts of data, and the computational demands (use of GPUs and system-on-chip (SoC) solutions) of these workloads have driven the development of specialized hardware.
In this blog post, we discuss how to migrate and archive this massive amount of ADAS data from on-premises environments to Amazon Web Services (AWS). We start by looking at ADAS data lifecycle management and the triggers for a data archiving strategy. We then look at the migration and storage options available on AWS and how to choose the right AWS service for each archiving use case.
Need for ADAS Data archiving
ADAS data archival is crucial for automotive original equipment manufacturers (OEMs) to verify the responsible development and deployment of autonomous and advanced driver assistance technologies.
In the event of an accident or system failure, OEMs can use the archived ADAS data to help investigate the root cause and identify areas for improvement. ADAS data archiving also becomes important during mergers and acquisitions in the automotive industry, when companies must integrate and harmonize data from various ADAS systems so that engineers can improve sensor fusion algorithms and enhance overall system reliability. Cost benefits and capacity management also drive the need for archiving, as organizations seek to reduce storage expenses and optimize the use of high-performance storage. Additionally, preserving historical ADAS data supports ongoing research and development efforts, helping OEMs analyze past driving scenarios and develop advanced autonomous driving capabilities.
Data archiving strategy
We refer to the following architecture, which depicts the strategy for migrating and archiving legacy ADAS data from a customer data center into AWS.
Figure 1: ADAS Data archival on AWS
The data archiving strategy for moving ADAS workloads from on-premises to AWS goes through the following three phases:
- Choosing the appropriate discovery and assessment tool for storage systems
- Choosing the appropriate migration tool for storing the ADAS data
- Choosing the appropriate AWS storage service and tier for archiving the ADAS data
Let’s dive into each of the phases.
a. Discovery and assessment
This phase is optional, but it helps customers who do not know where the bulk of their data resides or what its characteristics are: hot versus cold data, write intensity, data type, performance, and so on. In such a scenario, proper discovery and assessment is crucial to the successful migration and archiving of ADAS data to the AWS Cloud. To help undertake this process, organizations can use a variety of tools and techniques.
AWS DataSync Discovery and AWS Migration Evaluator can help customers gather a comprehensive summary of their current ADAS data storage, including the on-premises network-attached storage (NAS) systems that host ADAS data, and provide suggestions on AWS Storage services. AWS Migration Evaluator also helps customers create a directional business case for AWS Cloud planning and migration.
With discovery and assessment, organizations can make more informed decisions by building a business case and analyzing total cost of ownership, which facilitates a more seamless migration to AWS.
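The hot-versus-cold distinction mentioned above can be illustrated with a small local scan. The following is a minimal sketch, not a substitute for AWS DataSync Discovery or AWS Migration Evaluator: it assumes the ADAS share is mounted on the machine running the script, uses last-modified time as a stand-in for access frequency, and treats the 90-day threshold and the function name `summarize_by_age` as illustrative choices.

```python
import os
import time


def summarize_by_age(root, cold_after_days=90, now=None):
    """Total file counts and sizes under `root`, split into hot and cold.

    A file counts as "cold" when its last-modified time is older than
    `cold_after_days`. The threshold is an assumption for illustration.
    """
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * 86400
    summary = {
        "hot": {"files": 0, "bytes": 0},
        "cold": {"files": 0, "bytes": 0},
    }
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            bucket = "cold" if info.st_mtime < cutoff else "hot"
            summary[bucket]["files"] += 1
            summary[bucket]["bytes"] += info.st_size
    return summary
```

Calling `summarize_by_age("/mnt/adas")` (a hypothetical mount point) returns a dictionary with file counts and byte totals per bucket, which gives a first indication of how much data is a candidate for archiving.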
b. Migration and transfer
To transfer data to AWS, customers have several options available, depending on their specific requirements and the nature of the data. Let’s look at some of the AWS services for offline and online data transfer.
Offline data transfer options
AWS Snowball
ADAS customers can use AWS Snowball to help address some of the unique data migration challenges they face in their industry. The development and testing of ADAS technologies generates massive volumes of data, often reaching petabyte scale, which quickly outpaces the capabilities of traditional internet-based transfer methods. This data is typically collected in remote testing locations or on isolated test tracks where high-speed internet is either unavailable or prohibitively expensive. Given the time-sensitive nature of ADAS development cycles, rapid data transfer is crucial for quick analysis and iterative improvements. Snowball helps address this need by significantly reducing transfer times compared to uploading over limited-bandwidth connections.
Security is another paramount concern, as ADAS data can contain valuable intellectual property. Snowball’s tamper-resistant enclosures and end-to-end encryption help customers protect their information during physical transport. From a financial perspective, Snowball can be more cost-effective than upgrading network infrastructure at multiple test sites, especially when dealing with large datasets.
Its integration with AWS services facilitates a more efficient transfer to cloud-based analytics and machine learning solutions, which are essential for processing ADAS data. The device’s ability to handle both large files (such as video) and numerous small files (such as sensor data) makes it well suited to the diverse data types generated in ADAS testing. AWS Snowball also standardizes data transfer across test sites, helping customers achieve consistent data management for ADAS development. By removing the logistical challenges of data transfer, it allows companies to focus more on innovation and analysis, accelerating the advancement of autonomous driving technology.
AWS Snowball provides two main choices for data transfers: the Network File System (NFS) adapter for file-based migration and the Amazon S3 adapter for S3-compatible workflows. For data transfer, customers can use the AWS CLI, the AWS Snow Transfer Tool for large-scale migrations, or the high-performance s5cmd tool. These options cater to various use cases, from large-scale file migrations in automotive testing to object storage applications like archiving and content distribution, supporting efficient data transfer from on-premises systems to AWS.
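To make the bandwidth argument concrete, here is a back-of-the-envelope estimate of how long an online transfer would take. This is a sketch under stated assumptions: decimal units (1 TB = 10^12 bytes), a constant link speed, and an assumed 80% sustained utilization. The function name `transfer_days` is illustrative and not part of any AWS tool.

```python
def transfer_days(data_tb, link_gbps, utilization=0.8):
    """Estimate the days needed to move `data_tb` terabytes over a network link.

    `utilization` is the assumed fraction of the nominal link speed that the
    transfer sustains; real-world throughput varies.
    """
    total_bits = data_tb * 1e12 * 8
    effective_bits_per_second = link_gbps * 1e9 * utilization
    return total_bits / effective_bits_per_second / 86400


# 1 PB (1,000 TB) over a 1 Gbps link at 80% utilization: about 116 days.
print(round(transfer_days(1000, 1), 1))
# The same dataset over a 10 Gbps link: about 11.6 days.
print(round(transfer_days(1000, 10), 1))
```

Comparing such an estimate with the end-to-end turnaround of a physical device shipment is one way to decide between online and offline transfer for a given test site.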
Online data transfer options
AWS offers diverse online data transfer options that complement Snowball’s physical transfer capabilities. AWS DataSync simplifies data migration and helps customers quickly, easily, and securely transfer data from on-premises to AWS, while AWS Storage Gateway extends local storage to the cloud. For large-scale migrations, customers benefit from engaging their AWS account teams, whose expertise can help improve transfer speeds and align transfer methods with project requirements. These solutions cater to various data management needs, from cloud transitions to handling growing data volumes in modern industrial processes.
AWS DataSync
AWS DataSync offers ADAS customers a powerful solution for online data migration, enabling a more efficient transfer of large datasets between on-premises storage and AWS, or between different AWS storage services and AWS Regions. It helps customers streamline the process of moving vast amounts of sensor data, training datasets, and simulation results to the cloud, accelerating development cycles and enhancing access to AWS’s advanced analytics and machine learning capabilities. AWS DataSync helps customers preserve data integrity through built-in validation: it automatically computes checksums to help customers verify that data is transferred accurately and completely. This integrity check is vital for ADAS applications, where data accuracy is paramount for customers as they develop and test critical systems.
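The idea behind checksum-based validation can be sketched in a few lines. The example below is not how AWS DataSync is implemented internally. It is a standalone illustration that assumes both the source and the copied data are reachable as local directory trees, and it uses SHA-256 as an arbitrary choice of hash. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large recordings do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root, dest_root):
    """Return relative paths that are missing in dest_root or differ in content."""
    source_root, dest_root = Path(source_root), Path(dest_root)
    mismatches = []
    for src in sorted(p for p in source_root.rglob("*") if p.is_file()):
        rel = src.relative_to(source_root)
        dst = dest_root / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            mismatches.append(rel.as_posix())
    return mismatches
```

An empty list from `find_mismatches` indicates that every source file has a byte-identical counterpart at the destination. Any returned path points to a file that should be transferred again.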
c. Data archiving
AWS provides comprehensive data archiving solutions to meet the diverse needs of ADAS data storage and retrieval. One such option is the set of Amazon S3 Glacier storage classes, which are secure, durable, low-cost storage classes for long-term data archiving, with retrieval options ranging from milliseconds to hours. There are three distinct storage classes:
- Amazon S3 Glacier Instant Retrieval is an archive storage class that provides up to 68% lower storage cost than S3 Standard-Infrequent Access for long-lived data that requires retrieval in milliseconds. Many ADAS-specific use cases, such as incident investigation and analysis, are time-sensitive but require infrequent access to data. This storage class is well suited for use cases that require occasional access, such as once per quarter.
- Amazon S3 Glacier Flexible Retrieval delivers low-cost storage for archive data that is accessed 1–2 times per year and is retrieved asynchronously. The retrieval time depends on the type of retrieval you choose and ranges from minutes to hours. For ADAS use cases that require less frequent access, such as cross-generational vehicle or software version comparison studies, this storage class offers a flexible and cost-effective solution.
- Amazon S3 Glacier Deep Archive is the lowest-cost storage class and supports long-term retention and digital preservation for data that may be accessed once or twice a year, such as legal or regulatory data. With retrieval within 12 hours for standard retrievals or within 48 hours for bulk retrievals, this storage class is often the most cost-effective option for long-term, deep archival storage of ADAS data.
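The choice among the three storage classes can be expressed as a simple rule of thumb based on how long a team can wait for the data. The sketch below is an illustration only. The function name and the 12-hour boundary are assumptions derived from the retrieval characteristics described above, and a real decision should also weigh storage price, retrieval fees, and minimum storage duration. The returned strings are the storage class identifiers used by the Amazon S3 API.

```python
def suggest_glacier_class(max_wait_hours):
    """Suggest an S3 Glacier storage class from the acceptable retrieval wait.

    max_wait_hours == 0 means the data must be readable immediately.
    """
    if max_wait_hours < 0:
        raise ValueError("max_wait_hours must be non-negative")
    if max_wait_hours == 0:
        # Millisecond access, for example for incident investigation.
        return "GLACIER_IR"
    if max_wait_hours < 12:
        # Asynchronous retrieval in minutes to hours.
        return "GLACIER"
    # Retrieval within 12 hours (standard) or 48 hours (bulk) is acceptable.
    return "DEEP_ARCHIVE"
```

For example, `suggest_glacier_class(0)` returns `"GLACIER_IR"`, while a regulatory archive that can tolerate a two-day wait, `suggest_glacier_class(48)`, maps to `"DEEP_ARCHIVE"`.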
We recommend storing ADAS data first on Amazon S3 and uniquely tagging each ADAS project-related S3 bucket and its objects. Customers can then use lifecycle policies to move the data to the Amazon S3 Glacier storage classes. Later, customers can use the same object tags to identify the data to restore and place it in one of the other Amazon S3 storage classes.
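A tag-based lifecycle policy of this kind could look like the following sketch. It builds the configuration as a plain dictionary in the shape that the boto3 S3 client's `put_bucket_lifecycle_configuration` call accepts. The tag key and value, the rule ID format, and the 30/120/365-day thresholds are illustrative assumptions, not recommendations from this post. Applying the configuration requires the third-party `boto3` package and AWS credentials, so that step is kept in a separate function.

```python
def build_lifecycle_configuration(tag_key, tag_value):
    """Build a lifecycle configuration that archives objects carrying a given tag.

    The day thresholds are hypothetical; tune them to each project's access
    pattern and to the minimum storage durations of each storage class.
    """
    return {
        "Rules": [
            {
                "ID": f"archive-{tag_key}-{tag_value}",
                "Status": "Enabled",
                # Only objects tagged tag_key=tag_value are affected.
                "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
                "Transitions": [
                    {"Days": 30, "StorageClass": "GLACIER_IR"},
                    {"Days": 120, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket, configuration):
    """Attach the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )
```

For a hypothetical project tag, `build_lifecycle_configuration("adas-project", "highway-pilot")` yields one rule that steps tagged objects through S3 Glacier Instant Retrieval, Flexible Retrieval, and Deep Archive as they age. Note that applying a lifecycle configuration replaces any existing configuration on the bucket, so existing rules need to be included in the same call.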
Conclusion
The key steps in this data archiving journey include discovery and assessment of the existing ADAS data landscape, selecting the right data migration tools and techniques, and using the scalable and cost-effective storage solutions offered by AWS. By using AWS Snowball for offline data transfers, AWS DataSync for more seamless online migrations, and the Amazon S3 Glacier storage classes for more flexible and durable archival storage, organizations can build a more future-proof data management infrastructure to support their ADAS development and deployment efforts. As next steps, we recommend reading the AWS Snowball documentation for guidance and details on the AWS Snowball hardware devices. We also recommend running through the AWS Snowball workshop and the AWS DataSync workshop for step-by-step guidance on data migration.