AWS HPC Blog
Accelerating file reads with a storage caching server
HPC customers using AWS benefit from the reliability, scalability, and features of a variety of storage services like Amazon Simple Storage Service (Amazon S3), Amazon FSx, and Amazon Elastic File System (Amazon EFS).
HPC workloads that need very high throughput storage can usually benefit from additional caching. While solutions like Amazon File Cache and Amazon FSx for NetApp ONTAP’s FlexCache are available, there are many cases where a simple low-cost solution using Amazon Elastic Compute Cloud (Amazon EC2) instances is the right fit.
In this post we’ll show how you can create an Amazon EC2-based cache that provides 25 GBytes/s of read throughput for under $4/hour, and which can be scaled out to much higher aggregate throughput by adding more caching instances.
To make good use of this, a workload needs some easily-described characteristics. It should have a working data set small enough to fit into the RAM of a single EC2 instance, it should be primarily read-only (read/write caching might be possible, but we’re not covering it here), and the data needs to live on a filesystem with Linux support. In this example we use NFS.
How it works
This solution relies on the Linux OS file cache, which uses spare system RAM to cache file access. When you mount a filesystem on an Amazon EC2 instance, any file accessed through it is cached automatically. No additional software is needed.
The steps to create a cache are:
- Launch an Amazon EC2 instance
- Mount the source NFS filesystem on the instance
- Re-export the filesystem
- Point HPC instances to mount the caching instance export as read-only
Writes can continue to be made directly to the source filesystem. When a client accesses a file, the cache instance fetches it from the source filesystem and stores a copy in RAM. Subsequent reads from clients are then served from that cached copy instead of from the source server. A minimal command sketch of the setup follows below.
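This sketch assumes Amazon Linux 2023, a source NFS server at the hypothetical hostname source-nfs.internal exporting /export, and a caching instance reachable as cache.internal; adjust the names, paths, and export options for your environment.

```bash
# --- On the caching instance ---
sudo dnf install -y nfs-utils

# Mount the source NFS filesystem read-only
sudo mkdir -p /mnt/source
sudo mount -t nfs -o ro source-nfs.internal:/export /mnt/source

# Re-export it to the HPC clients (an explicit fsid is required when re-exporting NFS)
echo "/mnt/source *(ro,no_subtree_check,fsid=1)" | sudo tee -a /etc/exports
sudo systemctl enable --now nfs-server
sudo exportfs -ra

# --- On each HPC client ---
sudo mkdir -p /mnt/data
sudo mount -t nfs -o ro cache.internal:/mnt/source /mnt/data
```

Re-exporting an NFS mount like this relies on NFS re-export support in recent Linux kernels, so test it with your specific operating system and NFS versions.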
Test environment
To test performance we created:
- a caching server using an AWS Graviton3-based c7gn.16xlarge instance (chosen because of its 200 Gb/s networking).
- a small, deliberately slow NFS server on a t2.micro instance.
- multiple test clients
The specifications for the instances we used are listed in Table 1.
Results
Single client instance results
To test the transfer time for a single file, we created four test clients using c7gn.2xlarge instances.
We copied a 1 GB test file to the four client instances, first directly from the source NFS server, and then via the cache instance, and we captured the time taken to copy the file in each scenario.
The first copy via the cache instance had throughput comparable to copying directly from the NFS server.
Once the file was cached, subsequent reads came from system RAM and completed around 30x faster.
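On a client, a timing run for this comparison can look like the sketch below. The mount points /mnt/source (a direct mount of the NFS server) and /mnt/cache (a mount of the caching instance’s export) and the test file name are hypothetical.

```bash
# Drop the client's own page cache so each copy really goes over the network
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
time cp /mnt/source/test-1G.dat /tmp/   # baseline: directly from the source NFS server

sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
time cp /mnt/cache/test-1G.dat /tmp/    # first read via the cache instance (cold cache)

sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
time cp /mnt/cache/test-1G.dat /tmp/    # repeat: now served from the cache instance's RAM
```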
Maximum throughput results
Our next test was to find the maximum throughput of the cache instance. The c7gn.16xlarge has 200 Gb/s of network throughput, so this test aimed to find out how close we could get to that maximum.
- The caching instance’s NFS server runs multiple parallel processes to serve clients. We increased the number of threads from the default of 8 to 32 in the NFS server config file (nfs.conf), allowing us to do more work in parallel.
- We created a fleet of 50 Spot Instances. A script on each instance repeatedly copied a 4 GB file via the cache instance to local /tmp, which is backed by system RAM, avoiding any bottleneck from writing to local disk. We cleared the local file cache between copies, so the data was always transferred from the caching instance (the per-client script is sketched after this list).
- We monitored the cache instance’s total network throughput at 1-minute intervals using Amazon CloudWatch.
Figure 3 shows that throughput reached 1,486,397,311,959 bytes per minute, which is around 24.77 GBytes/s – effectively the maximum network speed of the instance.
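The per-client load script can be as simple as the following sketch (the mount point and file name are hypothetical); each iteration drops the client’s local page cache so every read crosses the network to the caching instance.

```bash
#!/bin/bash
# Repeatedly pull a 4 GB test file from the caching instance's export into /tmp
while true; do
  sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
  cp /mnt/cache/test-4G.dat /tmp/
  rm -f /tmp/test-4G.dat
done
```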
Deployment and further use cases
The configuration does not require additional software installation and is simple enough to set up using a small script at launch on a general-purpose operating system like Amazon Linux 2023.
We can scale throughput by running multiple caching instances in parallel with network load balancing – for example, via the Amazon Route 53 DNS service or Elastic Load Balancing. Some possible scaling strategies are covered in detail in this post.
When instances in multiple Availability Zones access files stored in a single AZ, cross-AZ data transfer costs can be reduced by running a caching instance in each Availability Zone.
You can use this method as a translation/caching layer between different file systems, too. For example, it can act as a cached NFS front end for Amazon S3, using Mountpoint for Amazon S3.
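A sketch of that idea, assuming Mountpoint for Amazon S3 is installed on the caching instance and using a placeholder bucket name; note that re-exporting a FUSE mount over NFS needs an explicit fsid in /etc/exports and may not suit every workload, so test it against your access patterns.

```bash
# Mount the S3 bucket read-only and allow other local users (such as the NFS
# server processes) to read it
sudo mkdir -p /mnt/s3
sudo mount-s3 --read-only --allow-other amzn-s3-demo-bucket /mnt/s3

# Re-export the mount to NFS clients
echo "/mnt/s3 *(ro,no_subtree_check,fsid=2)" | sudo tee -a /etc/exports
sudo exportfs -ra
```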
Considerations
Cache size
The amount of data that can be cached is limited by the system RAM of the Amazon EC2 caching instance. The solution works best with data sets that fit fully into instance RAM, although partial caching is possible. New instance types are released regularly, and you can find one with the right memory and network throughput from the EC2 console under Instance Types.
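If you prefer to search from the command line, the AWS CLI can filter instance types by their attributes; this sketch lists instance types that report 200 Gigabit network performance along with their memory, and the filter value is just an example.

```bash
# List instance types with 200 Gigabit network performance and show their memory (MiB)
aws ec2 describe-instance-types \
  --filters "Name=network-info.network-performance,Values=200 Gigabit" \
  --query 'InstanceTypes[].[InstanceType, MemoryInfo.SizeInMiB]' \
  --output table
```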
Resilience
A single-server solution has limited fault tolerance, so the risk and impact of instance failure should be considered. A single instance can recover automatically from hardware failure; for greater resilience, other options include using an Auto Scaling group.
First read penalty
The first read of a file will not come from the cache, but it’s possible to warm the cache by cat’ing files to /dev/null on the caching instance. This can be done as part of a script at launch. Linux will cache partial files, so for maximum efficiency the cache can be pre-warmed before all the source files have finished writing; a final cat will then pick up the recently written data.
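A pre-warming pass on the caching instance can be a single command; this sketch assumes the source filesystem is mounted at /mnt/source and uses a few parallel readers to speed things up.

```bash
# Read every file once so Linux pulls it into the page cache
find /mnt/source -type f -print0 | xargs -0 -n 16 -P 8 cat > /dev/null
```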
NFS tuning
Default NFS server parameters may limit throughput on larger instances. During testing, we increased the number of NFS processes serving data by editing the threads parameter in the [nfsd] section of /etc/nfs.conf, then restarting the NFS server.
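For reference, the relevant excerpt of /etc/nfs.conf on our caching instance looked like this; 32 threads is the value we tested with rather than a universal recommendation.

```
[nfsd]
# number of nfsd kernel threads serving client requests (the default is 8)
threads=32
```

After editing the file, restart the NFS server (for example with sudo systemctl restart nfs-server) so the new thread count takes effect.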
Conclusion
This post shows how you can create a simple, low-cost caching solution that can be provisioned in minutes and that provides very high throughput – 25 GBytes/s in our example. The technique works for any Linux-mountable filesystem, not just NFS.