Modernizing NASCAR’s multi-PB media archive at speed with AWS Storage
The National Association for Stock Car Auto Racing (NASCAR) is the sanctioning body for the No. 1 form of motor sports in the United States, and owns 15 of the nation’s major motorsports entertainment facilities. About 15 years ago, NASCAR began collecting all of the video, audio, and image assets containing NASCAR content from the last 70+ years and consolidating them in a centralized location. This collection of media content grew over time, becoming one of the largest racing media archives in the world.
At NASCAR, I am responsible for the media and event technology implementations. My primary focus is on media system engineering and event information technology operations for NASCAR. I am accountable for our AWS solution architectures and their integrations with AWS services, along with our internally developed media-based workflows. The team of media technologists that I lead works closely with AWS solutions architects, engineering, and solution managers to build out new and exciting capabilities relevant to NASCAR’s strategic technology goals.
As NASCAR’s media library continued to grow, we were faced with potential resource challenges and shortcomings derived from our legacy data management infrastructure – from costs to staffing. With this in mind, we needed to focus on cost optimization and getting the most out of our data in terms of insights and innovation. In this blog, I discuss our reasoning and approach for migrating a large video archive to AWS, and then automating the workflows associated with archive and restore.
NASCAR media archive overview and challenges
Our technologies and procedures for archiving our videos, images, and audio content at NASCAR have evolved over time. For the past 10 years, the NASCAR media library had been housed in an LTO tape library that held over 8,600 LTO 6 tapes and a few thousand LTO 4 tapes. Additionally, we kept an equal number of LTO tapes at an offsite location for disaster recovery purposes.
The NASCAR library has grown so much that we now add roughly 1.5 PB – 2 PB of data per year. This accelerated growth rate made it increasingly challenging to keep up with data ingestion and the necessary LTO tape migration processes. It became a race to see which scenario would happen first: would we run out of tape slots in the LTO library before the next generation of LTO tapes was released and the migration process was completed? Would we need to hire additional staff simply to load and unload LTO tapes from the library?
Given these possible outcomes, we reflected on our strategy and decided we had to take a pit stop to evolve and modernize our data archiving process. It was at that moment that we made the call to adopt Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier) and Amazon S3 Glacier Deep Archive to maximize the value of our digital assets and gain new insights from our data. In this blog, I discuss our 18-month journey as we migrated our 15-PB media archive from offsite tape vaults to AWS. My goal for this blog is that you come away with some helpful takeaways as you consider migrating your own archives to AWS.
NASCAR archive content composition
NASCAR storage workflows are divided into two distinct areas – archive and backup.
The archive aspect of the NASCAR library focuses on our high-resolution Apple ProRes media files, also known as mezzanine files, which are our highest-quality assets and, for the most part, cannot be recreated. In our previous workflow, we stored the mezzanine files in an LTO library that we managed with archiving software. The archiving software created two copies of each file on LTO: one copy of the asset remained in the LTO library, and the other was exported to an offsite storage location. When a user needed to access a mezzanine file, they submitted a request through our media asset management system, which initiated and automated the restore request to the archiving software, which in turn restored the file from LTO to the local SAN.
The backup aspect of the NASCAR library focuses on the lower-resolution proxy files. These files are used for search, offline editing, and asset logging, and must remain online 24×7 for instant access by users anywhere in the world. Our proxy files are generated from the mezzanine files and, in theory, could be recreated. However, in a true disaster recovery situation, the time to regenerate nearly 500,000 hours of files by restoring the mezzanine files and transcoding them into proxies would be impractical and inefficient. Given the value of the proxies to the overall workflow within NASCAR, we opted to also migrate from our spinning-disk disaster recovery solution to Amazon S3 Glacier Deep Archive. We chose S3 Glacier Deep Archive so that we could take advantage of the significant cost savings provided by this very low-cost storage class, in large part because the files are only used for disaster recovery.
The proxy and mezzanine workflows are independent of each other, but both have been enhanced by migrating to an AWS Storage and serverless architecture in the cloud. For the proxy backup workflow, we exclusively use Amazon S3 Glacier Flexible Retrieval, which allows us to durably and efficiently store our proxy backups that are rarely – if ever – accessed. For the mezzanine archive workflow, we’ve implemented S3 Standard, S3 Glacier Instant Retrieval, and S3 Glacier Deep Archive. The file objects are written to S3 in accordance with the NASCAR library S3 Lifecycle strategy: once the files age out of S3 Standard, we lifecycle them down to S3 Glacier Instant Retrieval and then to S3 Glacier Deep Archive, where they remain in perpetuity.
Figure 1: Media archive architecture overview
NASCAR’s Amazon S3 archive lifecycle strategy
Our archive and restore workflows are centered around the annual NASCAR race season, which consists of 38 race weekends. We’ve found that we can use lifecycle policies spanning multiple Amazon S3 storage classes that reflect our seasonal and weekly race schedule. During a race week, the file objects are archived to S3 Standard; then the timer begins and the lifecycle policies take effect.
Initially, we used the S3 Standard, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive storage classes to maximize resources and save on costs based upon our file utilization patterns from the previous race season. However, with the release of S3 Glacier Instant Retrieval, we’ve since migrated away from S3 Glacier Flexible Retrieval for our mezzanines. Typically, restore requests during a race week are for files that were archived the previous time a race was held at that racetrack, and most NASCAR racetracks host only one race in a calendar year. Because of this annual racing pattern, we now keep data in S3 Glacier Instant Retrieval for 2 years in order to keep restore requests between 3-5 minutes. With our current workflow at NASCAR, we enable the following age-based lifecycle strategy for each of the buckets:
- Mezzanines – Lifecycle down to S3 Glacier Instant Retrieval after 1 day, remain there for 730 days (2 years), and then down to S3 Glacier Deep Archive in perpetuity.
- Sources – Lifecycle down to S3 Glacier Deep Archive after 1 day in perpetuity.
- Projects – Lifecycle down to S3 Glacier Deep Archive after 1 day in perpetuity.
- Proxy DR – Lifecycle down to S3 Glacier Deep Archive after 1 day in perpetuity.
We established the lifecycle policies listed previously for newly uploaded content in our S3 buckets. We also created a series of prefix-scoped lifecycle policies that matched our file naming classifications. In doing so, migrated assets that were recorded more than 2 years ago are placed directly into S3 Glacier Deep Archive, which offers us very low-cost, long-term storage.
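As an illustration, a lifecycle configuration along these lines could be applied with boto3. This is a minimal sketch for the mezzanine bucket only; the bucket name, prefix filter, and rule ID are placeholders rather than our production values.

import boto3

s3 = boto3.client("s3")

# Example: mezzanine bucket - transition to S3 Glacier Instant Retrieval after 1 day,
# then to S3 Glacier Deep Archive after 730 days (2 years).
s3.put_bucket_lifecycle_configuration(
    Bucket="example-mezzanine-bucket",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "mezzanine-archive-tiering",  # placeholder rule ID
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to all objects in the bucket
                "Transitions": [
                    {"Days": 1, "StorageClass": "GLACIER_IR"},
                    {"Days": 730, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)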
Figure 2: NASCAR library lifecycle configuration
NASCAR’s video archive migration process
The process of migrating 15 PB of data from a legacy LTO library to the cloud can be complex. It consists of restoring all of the files from LTO, uploading them to Amazon S3 storage for archival, and then inventorying the migration status in an Amazon DynamoDB table.
As a proof of concept, we first developed a semi-automated workflow comprised of a few Python scripts. The first script exported the LTO tape list information, which contains all of the files that are on each LTO tape. The next script allowed us to restore files from an LTO tape to disk, where they were verified. Another script then handled uploading the assets to Amazon S3. Additionally, we set up an Amazon DynamoDB table with the file inventory schema that we needed. Finally, we set up an AWS Lambda function that is triggered whenever a new object is written to the Amazon S3 buckets; the Lambda function’s sole purpose is to update the Amazon DynamoDB table with the file object information.
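For reference, the upload portion of such a workflow can be sketched with boto3’s managed transfer API, which handles multipart uploads for large mezzanine files automatically. The bucket, prefix, and local path below are placeholders, and the tuning values are illustrative rather than the exact settings we used.

import os
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Multipart settings suited to multi-GB ProRes files (illustrative values).
transfer_config = TransferConfig(
    multipart_threshold=256 * 1024 * 1024,  # switch to multipart above 256 MB
    multipart_chunksize=256 * 1024 * 1024,  # 256 MB parts
    max_concurrency=10,                     # upload parts in parallel
)

def upload_restored_file(local_path, bucket, key_prefix):
    """Upload a file restored from LTO to the archive bucket."""
    key = f"{key_prefix}/{os.path.basename(local_path)}"
    s3.upload_file(local_path, bucket, key, Config=transfer_config)
    return key

# Example usage for a single restored mezzanine file (placeholder path, bucket, and prefix).
upload_restored_file(
    "/restore/staging/AWESOME_VIDEO.mov",
    "example-mezzanine-bucket",
    "mezzanines/2021",
)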
In order to expedite the process of fully automating the workflow, we used CloudFirst’s Rapid Migrate automation workflow. By using Rapid Migrate, we’re able to fully automate the LTO tape file inventory retrieval, LTO file list restores, and S3 upload functions of the previously mentioned workflow. Doing so sped up the migration process, enabling us to fully migrate our entire archive in just over a year.
Figure 3: Road to the automated archive architecture
NASCAR’s S3 archive process overview
The next step in our Amazon S3 storage migration was to develop our own version of a media asset archive pipeline. The pipeline is a needed component that automates the archival of the video, audio, and image assets connected to our Avid Media Central Cloud UX system. Creating the new archive architecture in AWS grants us full control over the archival process and removes our reliance on a third-party archive management system to manage the assets.
Figure 4: Upload archive architecture
The workflow is initialized when a new media asset is ingested into our Avid Media Central Cloud UX environment. After the video files are transcoded to our house standard, the hi-res essence of the source media is uploaded to Amazon S3 by using the process identified in Figure 4. During the ingest process, Avid sends a POST request to the archive API, which requires a message body with the following parameters.
Message body:
{
    "Filename": "AWESOME_VIDEO.mov",
    "Bucket": "nascar-test-bucket-name",
    "Path": "/test/path/to/file"
}
The API then sends the message to the Amazon Simple Queue Service (SQS) `Upload_Queue`.
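One way such an API could forward the request is with a small Lambda function behind Amazon API Gateway; this is a hedged sketch under that assumption, as the forwarding could equally be implemented as a direct API Gateway-to-SQS integration. The queue URL is a placeholder.

import json
import boto3

sqs = boto3.client("sqs")
UPLOAD_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/Upload_Queue"  # placeholder

def lambda_handler(event, context):
    """Forward the archive request body from API Gateway to the Upload_Queue."""
    body = json.loads(event["body"])  # {"Filename": ..., "Bucket": ..., "Path": ...}
    sqs.send_message(QueueUrl=UPLOAD_QUEUE_URL, MessageBody=json.dumps(body))
    return {
        "statusCode": 202,
        "body": json.dumps({"status": "queued", "Filename": body["Filename"]}),
    }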
Amazon SQS messages are consumed by a Python 3 script running on an on-premises Linux server, and each file is uploaded to the S3 bucket defined in the SQS message’s JSON “Bucket” element.
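A minimal sketch of that consumer loop might look like the following; the queue URL and local media root are placeholder assumptions, and error handling and logging are omitted.

import json
import os
import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/Upload_Queue"  # placeholder queue URL
MEDIA_ROOT = "/stornext/MEDIA"  # placeholder path to the on-premises SAN mount

while True:
    # Long-poll the Upload_Queue for new archive requests from the API.
    response = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20)
    for message in response.get("Messages", []):
        body = json.loads(message["Body"])
        relative_path = body["Path"].lstrip("/")
        local_path = os.path.join(MEDIA_ROOT, relative_path, body["Filename"])

        # Upload the file to the bucket named in the message, preserving its path as the key.
        s3.upload_file(local_path, body["Bucket"], f"{relative_path}/{body['Filename']}")

        # Remove the message only after a successful upload.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])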
When the file is created in the S3 bucket, it triggers the upload processor AWS Lambda function, which updates the Amazon DynamoDB table that serves as our file transaction log.
Lambda function (a minimal sketch of what this upload processor could look like; the table name, key schema, and status value are illustrative assumptions rather than our exact implementation):
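import urllib.parse
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("media_file_transaction_log")  # placeholder table name

def lambda_handler(event, context):
    """Triggered by S3 ObjectCreated events; records each upload in the transaction log."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        size = record["s3"]["object"].get("size", 0)

        # Write (or overwrite) the transaction-log entry for this object.
        table.put_item(
            Item={
                "Filename": key.split("/")[-1],  # assumed partition key
                "Bucket": bucket,
                "Key": key,
                "SizeBytes": size,
                "Status": "Upload Completed",    # illustrative status value
            }
        )
    return {"statusCode": 200}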
NASCAR restore process overview
Once we had all our data in S3, we developed a workflow that restores and transfers the media assets from AWS to our SAN volumes located at the NASCAR Productions facility in Charlotte, NC.
Figure 5: The NASCAR automated restore architecture
The restore workflow is initialized by a request to the restore API, which requires a message body with the following parameters.
Message body:
{
    "Filename": "AWESOME_VIDEO.mov",
    "Priority": "low",
    "Path": "/stornext/TEST/AWS/Restore/TEST",
    "Username": "speedy",
    "workflowId": "27451458"
}
The API sends the message to the AWS Step Functions state machine `Restore_DynamoPull_To_SQS`. The state machine then checks the defined DynamoDB table to see if the asset exists there. If it does, the message is sent to the SQS `High_Priority` or `Normal_Priority` queue based on the priority set in the API body.
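In rough terms, that lookup-and-route step behaves like the following Lambda task; the table name, key attribute, and queue URLs are placeholder assumptions, and the real logic runs inside the `Restore_DynamoPull_To_SQS` state machine.

import json
import boto3

dynamodb = boto3.resource("dynamodb")
sqs = boto3.client("sqs")

table = dynamodb.Table("media_file_transaction_log")  # placeholder table name
QUEUE_URLS = {
    "high": "https://sqs.us-east-1.amazonaws.com/123456789012/High_Priority",   # placeholder
    "low": "https://sqs.us-east-1.amazonaws.com/123456789012/Normal_Priority",  # placeholder
}

def lambda_handler(event, context):
    """Check DynamoDB for the requested asset and route it to a priority queue."""
    item = table.get_item(Key={"Filename": event["Filename"]}).get("Item")
    if item is None:
        return {"status": "NOT_FOUND", "Filename": event["Filename"]}

    # Route to the high- or normal-priority queue based on the API request body.
    queue_url = QUEUE_URLS["high"] if event.get("Priority") == "high" else QUEUE_URLS["low"]
    sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(event))
    return {"status": "QUEUED", "Filename": event["Filename"]}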
SQS messages are consumed by the restore processor Lambda function, which checks whether the file object currently resides in the S3 Standard or S3 Glacier Instant Retrieval storage class. If the file is in S3 Standard storage, the Copy Job Queue is updated with the file information and the asset’s DynamoDB status is set to `Standard Storage – Move to Copy SQS`. However, if the file object isn’t in S3 Standard or S3 Glacier Instant Retrieval, a restore request is submitted to restore the file from S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, and the asset’s DynamoDB status is updated by the Lambda to `In Progress – Priority SQS`. Once the file object is restored from S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive to S3 Standard, the “Update Copy SQS” Lambda function is triggered, which adds a message with the file information to the SQS Copy Job Queue.
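The storage class check and restore request can be sketched roughly as follows; the tier selection and the handling of the copy queue and DynamoDB updates are simplified assumptions for illustration.

import boto3

s3 = boto3.client("s3")

def request_restore_if_needed(bucket, key, priority="low"):
    """Return True if the object can be copied immediately, otherwise start a restore."""
    head = s3.head_object(Bucket=bucket, Key=key)
    storage_class = head.get("StorageClass", "STANDARD")  # STANDARD is omitted from the response

    if storage_class in ("STANDARD", "GLACIER_IR"):
        # Object is directly retrievable - hand it straight to the Copy Job Queue.
        return True

    # Object is in S3 Glacier Flexible Retrieval or Deep Archive - submit a restore request.
    # Expedited retrievals apply only to Flexible Retrieval (GLACIER), not Deep Archive.
    tier = "Expedited" if priority == "high" and storage_class == "GLACIER" else "Standard"
    s3.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest={"Days": 3, "GlacierJobParameters": {"Tier": tier}},
    )
    return False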
Finally, the Python 3 script running on premises polls the Copy Job Queue every 10 seconds to determine if any files are ready for download. When the script consumes a message from the queue, it downloads the file to our on-premises Quantum StorNext SAN volume over dual 10-Gb/s AWS Direct Connect network paths. Once the file is saved to the path defined in the request, the script updates the DynamoDB status to “Restore Completed” and notifies Avid that the restore process has completed. The end user then receives an email notifying them that their file is ready to use in their Adobe Premiere project.
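A simplified sketch of that polling-and-download loop follows; it assumes the copy message carries the bucket, key, destination path, and filename, the queue URL and table name are placeholders, and the Avid and email notifications are omitted.

import json
import os
import time
import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("media_file_transaction_log")  # placeholder table name

COPY_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/Copy_Job_Queue"  # placeholder

while True:
    response = sqs.receive_message(QueueUrl=COPY_QUEUE_URL, MaxNumberOfMessages=1)
    for message in response.get("Messages", []):
        job = json.loads(message["Body"])
        destination = os.path.join(job["Path"], job["Filename"])  # e.g. a StorNext SAN path

        # Download the restored object to the on-premises SAN over Direct Connect.
        s3.download_file(job["Bucket"], job["Key"], destination)

        # Mark the asset as restored in the transaction log.
        table.update_item(
            Key={"Filename": job["Filename"]},
            UpdateExpression="SET #s = :s",
            ExpressionAttributeNames={"#s": "Status"},
            ExpressionAttributeValues={":s": "Restore Completed"},
        )
        sqs.delete_message(QueueUrl=COPY_QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])

    time.sleep(10)  # poll every 10 seconds, as in the production workflow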
Conclusion
Modernizing the NASCAR technology stack with AWS Storage is a monumental step in our overall cloud adoption journey at NASCAR. Migrating all of our video, audio, and image libraries to the industry-leading scalability, data availability, security, and performance of Amazon S3 allows us to spend more time developing additional workflows that help our business. As we maintain the NASCAR historical media archive, Amazon S3 Glacier Flexible Retrieval, Amazon S3 Glacier Deep Archive, and now Amazon S3 Glacier Instant Retrieval have let us put the endless lifecycle management of LTO tape formats in the past. Amazon S3 storage also provides us great cost savings, as its many storage classes let us optimize cost and performance for our video, audio, and image-based workflows, all while keeping the data protected with 11 9s of durability across all storage classes, a significant improvement over our legacy LTO storage library.
The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.