AWS HPC Blog
Introducing the Spack Rolling Binary Cache hosted on AWS
Today we’re excited to announce the availability of a new public Spack Binary Cache, hosted on AWS. Spack users now have access to a public build cache hosted on Amazon Simple Storage Service (Amazon S3).
Using this Binary Cache can make install times for common Spack packages up to 20x faster. The work is the result of a collaboration between AWS, E4S, Kitware, and Lawrence Livermore National Laboratory (LLNL), and the cache is operated by the Spack open-source project team, which has done some amazing work supporting the HPC community.
Background
We often comment that HPC software is almost defined by its complexity. A large portion of the time, we’re talking about the truly vast dependency trees like the one in Figure 1.
Tracking the build dependencies of an application, installing, and maintaining them is a non-trivial task. Compiling them is also a complex procedure, full of nuanced application requirements, compiler flags, and optimizations. These build configurations make a significant performance difference when you’re running on specific CPU and GPU architectures. Before Spack, this whole process used to take days or weeks, and ate into a researcher’s time – getting in the way of the next discovery.
To AWS, this job of building software repeatedly (and reproducibly) looks like undifferentiated heavy lifting, and the Spack community thought the same. Spack is an open-source community project whose mission is to simplify the process of building these complicated stacks.
The Spack Binary Cache
Today at the International Supercomputing Conference (ISC’22) in Hamburg, the Spack team released version 0.18 of the Spack package manager, containing the Spack Rolling Binary Cache. This release adds a special new capability that significantly improves the installation times of common packages.
The Binary Cache stores pre-built versions of common libraries and applications, dramatically reducing the installation time for most packages, by up to 20x. It is accessible to all Spack users, whether on premises or in the cloud, and contains builds for multiple compilers and architectures. Currently it holds around 700 distinct packages, built for two different operating systems and three architectures, for a total of just over 5,100 package builds.
Spack simplifies building HPC codes by providing build recipes, dependency tracking, and provenance information. It makes building, and subsequently managing, HPC software stacks much simpler. However, building complex software stacks is still a time-consuming exercise. The problem is sometimes exacerbated in dynamic, cutting-edge environments like the cloud, because developers want to do complete rebuilds for new software releases, or even new compiler versions, often as soon as they become available.
The Binary Cache lets you install a package based upon an existing installation, rather than having to recompile it from source. As Spack stores all the provenance information already, you can be sure you’re getting a build with the correct compiler, optimization flags and every dependency. This order-of-magnitude speedup for installing common packages enables builders to get running with their codes faster than ever.
By hosting this Binary Cache in Amazon S3, we ensure a scalable storage platform that allows the cache to grow with new releases and build permutations. To maximize the availability of this valuable data, we are serving the Binary Cache through Amazon CloudFront. This enables regional caching of the data, resulting in higher bandwidth and lower latency access. This content delivery method is ideal for serving Spack's global user base.
Automation to the rescue
To kick off the Binary Cache, the Spack team has populated it with over 700 common packages. Each package has been built with multiple compilers and for three different architectures: Intel, AMD, and Arm64.
To do this, two build farms, one hosted by AWS and another by the University of Oregon, compile these packages and upload them to the Binary Cache.
Importantly, the packages in the Binary Cache are not static. The build process runs as a Continuous Integration (CI) pipeline, triggered by updates in Spack itself. This means that with each new Spack release, the latest versions of packages become available in the Binary Cache. The project goes one step further and tracks the Spack develop branch, so the latest and greatest package versions are compiled by the build farms and added to the Binary Cache automatically. Best of all, this benefit requires no extra work: add the new Binary Cache mirror and Spack will automatically check it for the packages you want to install.
Running a build farm on AWS is a perfect fit: the elastic nature of Amazon EC2 adapts to the spiky nature of the workload (a large build demand for each new release, and potentially quiet periods between release cycles). Amazon EC2 offers over 400 instance types across different architectures, microarchitectures, and sizes, with a choice of operating systems. Being able to closely replicate end-users' environments makes EC2 a perfect environment for CI/CD, and a fantastic home for a Spack build farm.
AWS ParallelCluster loves Spack
Fast software installs have a particular benefit in ephemeral environments like the cloud. With AWS ParallelCluster, you can create task-appropriate clusters on demand. However, while the cluster itself is ephemeral, your software stack probably isn't: having invested so much time configuring and compiling packages, you're unlikely to want to start from scratch with every new cluster. Here Spack is part of an ideal solution, rebuilding a predictable software stack every time. That means you can have a post-install script in ParallelCluster's configuration install your favorite packages within a few minutes of the head node being created, as sketched in the example below.
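As an illustration (not an official ParallelCluster recipe), a post-install script along these lines could be attached to the head node through the cluster configuration's custom actions. The install path under /shared and the package list are hypothetical choices for this sketch:

#!/bin/bash
# Hypothetical ParallelCluster post-install script, run after the head node is configured.
# Assumes Spack has been cloned to /shared/spack on a shared filesystem.
set -e
. /shared/spack/share/spack/setup-env.sh

# Point Spack at the public Binary Cache and trust its signing keys.
spack mirror add binary_mirror https://binaries.spack.io/develop
spack buildcache keys --install --trust

# Install the packages your jobs need (example list).
spack install wrf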
Installing WRF with the Spack Binary Cache
To demonstrate the power of using an S3-backed Binary Cache, we will install WRF, an industry-standard numerical weather prediction code. As shown in Figure 1, WRF has a complex dependency tree (61 packages), and it also takes a long time to compile. For this example, we compare the full-stack WRF install times when using the new Spack Binary Cache on three different architectures. This includes all 61 dependency packages (including the MPI install), going from nothing to a working install of WRF.
First let’s add the Spack Binary Cache as a mirror and trust the keys:
$ spack mirror add binary_mirror https://binaries.spack.io/develop
$ spack buildcache keys --install --trust
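Optionally, you can confirm the mirror is configured and browse what the cache offers for a given spec before installing (the exact listing will vary, since the cache is continuously updated):
$ spack mirror list
$ spack buildcache list wrf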
Now, installing WRF is as simple as the following command:
$ spack install wrf
Since we configured the Binary Cache mirror, Spack automatically checks it when resolving the package install, so we don't need to specify any extra options. By default, the Spack v0.18 release reuses existing packages where possible, allowing us some flexibility when matching exact versions.
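If you prefer a fresh concretization that ignores already-installed packages and re-resolves against the latest package definitions, recent Spack versions accept a --fresh flag on install (check spack install --help for your version):
$ spack install --fresh wrf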
From the timing results in Figure 2 we can see how easy it has been to build a full software stack to support WRF across the three different architectures: the Intel-based C5n, AMD-based Hpc6a, and Arm-based Graviton2 C6g instances. We see that, due to its size, WRF itself represents a significant portion of the install time, but across all three platforms we were able to build a full working stack in around 20 minutes. This represents a significant usability improvement for researchers.
If you really still want to build from source, you can disable the use of the cache by adding the --no-cache parameter to the install command.
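For example, to force WRF to compile from source instead of using the cached binaries:
$ spack install --no-cache wrf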
Conclusion
At AWS we are always looking for ways to remove undifferentiated heavy lifting, which makes supporting the Spack project, and the HPC community a natural fit. We’re glad to have been able to contribute much of the infrastructure for the Spack Binary Cache project, and some expertise along the way.
With the Binary Cache available through Amazon CloudFront, all Spack users (in the cloud or not) will benefit from faster software installation times, letting them focus on their research. If there's anything the world has learned from the experience of the last two years, it's that researchers' time is incredibly valuable, so we need to make them the most productive people on Earth.
You can read more about the Binary Cache in the Spack v0.18 release announcement. We also published a workshop on Using Spack on AWS ParallelCluster so you can try it out yourself.