AWS Storage Blog

How WarpStream enables cost-effective low-latency streaming with Amazon S3 Express One Zone

WarpStream, an AWS Partner, is a drop-in replacement for Apache Kafka. WarpStream’s cloud-native architecture makes it as easy to deploy and manage as a stateless web server like NGINX. WarpStream clusters can scale up to handle multiple GiB-per-second workloads as quickly as compute resources are assigned and then scale back down to zero after the job is completed.

WarpStream is built directly on top of object storage, such as Amazon S3, and has no local disks: all data is written to object storage. This means you can deploy WarpStream across multiple AWS Availability Zones (AZs) and maintain a very cost-effective streaming data pipeline. In addition, this stateless architecture enables truly elastic scaling, so users can scale their clusters out and in with no partition rebalances or leader elections.

This stateless architecture reduces both the cost and the complexity of running streaming data pipelines at scale, but there is a tradeoff: latency. WarpStream has higher latency than Kafka and similar systems because it writes to and reads from object storage directly, with no intermediary disks. In our benchmarks, we achieve ~400-600 ms p99 write latency and under 1.5 s p99 end-to-end latency using the Amazon S3 Standard storage class. These latencies account for the full WarpStream write and read path, including commits to WarpStream’s metadata store and acknowledgements back to the clients. For most applications, this latency profile is effectively real time. However, some workloads require lower latency than S3 Standard delivers.

To reduce latency, WarpStream supports writing to S3 Express One Zone instead of S3 Standard. S3 Express One Zone is S3’s high-performance, single-Availability Zone storage class, purpose-built to deliver consistent single-digit millisecond data access for your most frequently accessed data and latency-sensitive applications. With clusters backed by S3 Express One Zone, WarpStream’s customers can achieve 4x lower end-to-end latency (from producer to consumer), while also leveraging a stateless architecture backed by object storage.
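
For reference, creating and writing to an S3 Express One Zone directory bucket uses the same S3 APIs as any other bucket; the main differences are the bucket name format and the single-AZ configuration. The sketch below uses boto3 (a recent version that supports directory buckets); the bucket name, Region, and Availability Zone ID are placeholders.

```python
import boto3

# Placeholder values: pick the AZ ID that matches where your compute runs.
AZ_ID = "usw2-az1"
BUCKET = f"latency-demo--{AZ_ID}--x-s3"  # directory bucket names must end in --<az-id>--x-s3

s3 = boto3.client("s3", region_name="us-west-2")

# Create a single-AZ directory bucket for S3 Express One Zone.
s3.create_bucket(
    Bucket=BUCKET,
    CreateBucketConfiguration={
        "Location": {"Type": "AvailabilityZone", "Name": AZ_ID},
        "Bucket": {"Type": "Directory", "DataRedundancy": "SingleAvailabilityZone"},
    },
)

# Reads and writes use the familiar PutObject/GetObject calls.
s3.put_object(Bucket=BUCKET, Key="segments/0001.log", Body=b"hello, S3 Express One Zone")
```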

The case for lower latency

Apache Kafka is a general-purpose streaming technology, and that means that it serves a wide variety of use cases.

Many organizations use Kafka to simply move data from one place to another and take advantage of Kafka’s scalability and distributed persistence to buffer data before writing to downstream systems. For example, you might have an application that generates a large volume of logs, and you want to stream those logs into a monitoring platform, a data warehouse, and a Security Information and Event Management (SIEM) system. In order to maintain a consistent source of truth for the data, and improve resiliency and availability by decoupling your application from these downstream systems, you can write the data to a Kafka topic first, and then subscribe downstream applications (such as a SIEM) to this topic.
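
As a rough sketch of that pattern, the producer below writes each log record once to a single Kafka topic; the SIEM, data warehouse, and monitoring pipelines then read it independently as separate consumer groups. It uses the confluent-kafka Python client, and the topic name and bootstrap address are placeholders. Because WarpStream speaks the Kafka protocol, a WarpStream Agent endpoint would work the same way.

```python
import json
import time

from confluent_kafka import Producer

# Placeholder bootstrap address; a WarpStream Agent endpoint also works here,
# since WarpStream is a drop-in replacement for Apache Kafka.
producer = Producer({"bootstrap.servers": "localhost:9092"})

def emit_log(record: dict) -> None:
    # Write once to a single topic (name is hypothetical). The SIEM, warehouse,
    # and monitoring consumers each read it with their own consumer group, so the
    # application stays decoupled from every downstream system.
    producer.produce(
        "application-logs",
        key=record["service"],
        value=json.dumps(record).encode("utf-8"),
    )

emit_log({"service": "checkout", "level": "INFO", "msg": "order placed", "ts": time.time()})
producer.flush()  # block until the cluster acknowledges the outstanding writes
```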

This pipeline is real-time, fault-tolerant, and high volume, but it is not particularly latency sensitive. The ability to scale to serve a high volume of requests at a reasonable cost is a much more important design consideration than minimizing end-to-end latency.

However, there are many use cases where latency does matter. As a general-purpose protocol, Kafka’s abstractions enable event-driven architectures wherein messages written to a Kafka topic are used to trigger user-facing application behaviors. In these cases, Kafka is often in the critical path of a user-facing request.

Take, for example, a platform that provides frontend developers with the ability to build and deploy websites. When a change is published to a given website, the platform needs to communicate this change downstream so that it reaches end users in near-real time, giving developers working with the platform a tight feedback loop to validate their changes.

When a new version of a website is published, the platform writes a message to Kafka to record that event. The event can then be consumed by the Content Delivery Networks (CDNs) responsible for serving the site to end users, which use it to evict the old version of the site from their caches and serve the new one.
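
On the consuming side, a cache-eviction worker might look roughly like the sketch below: it reads publish events from a topic and purges the affected site from the CDN cache. The topic name, event shape, and purge_cache helper are hypothetical; the point is that the purge sits directly on the consume path, so every bit of end-to-end latency is visible to the developer waiting on a deploy.

```python
import json

from confluent_kafka import Consumer

def purge_cache(site_id: str, version: str) -> None:
    # Hypothetical helper: evict the old version of the site from the CDN cache.
    print(f"purging cached assets for {site_id}; now serving version {version}")

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder
    "group.id": "cdn-cache-eviction",
    "auto.offset.reset": "latest",
})
consumer.subscribe(["site-publish-events"])  # hypothetical topic name

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    purge_cache(event["site_id"], event["version"])
```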

Because this Kafka topic backs a feature in the critical path of the platform that developers interact with directly, reducing latency is an important consideration. However, unlike the logs use case, CDN cache eviction is not a high-volume workload: website publishes are real-world events that do not occur very frequently. In our experience, this observation generalizes: most use cases are either high volume or latency sensitive, but rarely both.

Using WarpStream backed with S3 Express One Zone, developers leveraging the Kafka protocol can choose to prioritize lower latency over low costs at high throughput, while still maintaining a stateless architecture for their streaming platform.

Maintaining high availability

One big difference between S3 Express One Zone and S3 Standard is that S3 Express One Zone stores data in a single Availability Zone rather than redundantly across multiple AZs. However, WarpStream users expect their clusters to survive the loss of an entire AZ. To account for this, we modified WarpStream to support writing data to multiple S3 directory buckets, each in a different Availability Zone, and only acknowledging writes once a quorum is achieved.
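
WarpStream’s own implementation isn’t shown here, but the quorum-write technique is straightforward to sketch. The example below, a simplified illustration using boto3 and a thread pool with placeholder bucket names, uploads the same object to three directory buckets in three different Availability Zones and acknowledges the write as soon as a majority (two of three) succeed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

import boto3

s3 = boto3.client("s3")

# Placeholder directory buckets, one per Availability Zone.
AZ_BUCKETS = [
    "warp-data--use1-az4--x-s3",
    "warp-data--use1-az5--x-s3",
    "warp-data--use1-az6--x-s3",
]
QUORUM = 2  # a majority of three copies

# Shared pool so that returning early does not block on executor shutdown.
pool = ThreadPoolExecutor(max_workers=8)

def quorum_put(key: str, body: bytes) -> bool:
    """Write one object to every AZ bucket; acknowledge once a quorum succeeds."""
    futures = [
        pool.submit(s3.put_object, Bucket=bucket, Key=key, Body=body)
        for bucket in AZ_BUCKETS
    ]
    successes = 0
    for future in as_completed(futures):
        if future.exception() is None:
            successes += 1
            if successes >= QUORUM:
                # The remaining upload keeps running in the background; the data
                # is already durable in two AZs, so the producer can be acked now.
                return True
    return False  # fewer than two copies landed, so the write is not acknowledged
```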

This ensures that a WarpStream cluster can always tolerate the unlikely loss of an entire Availability Zone with no loss of availability or durability, even when using S3 Express One Zone.

Of course, it’s not a completely free lunch. Because we write three copies to S3 Express One Zone instead of just one copy to S3 Standard, there are some cost implications for customers. Specifically, S3 Express One Zone’s pricing model differs from S3 Standard’s in three ways.

First, API request costs are up to 50% lower than with S3 Standard. This is great, but since we write each object to three S3 directory buckets to achieve a quorum, request costs end up being roughly a wash.

Second, unlike S3 Standard, users are billed per GiB for data transferred with each API request (with the first 512 KiB free per API call). To keep latency reasonable, WarpStream writes files to object storage every 250 ms by default, with each file containing records from multiple topic partitions and multiple topics. While writing larger objects is generally a cost-effective way to use Amazon S3 Standard, this behavior makes high-throughput workloads that leverage S3 Express One Zone more expensive than lower-throughput ones. Higher write throughput results in larger objects being written to S3, and if write throughput exceeds 2 MiB/s with the default 250 ms flush interval, each object will exceed the 512 KiB per request that S3 Express One Zone includes at no additional charge.
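
The relationship between write throughput, the flush interval, and the 512 KiB allotment is easy to work out. The snippet below reproduces that arithmetic for a few write rates, assuming a single Agent writing one object per flush (a simplification for illustration).

```python
FLUSH_INTERVAL_S = 0.25          # WarpStream's default 250 ms flush interval
FREE_BYTES_PER_PUT = 512 * 1024  # first 512 KiB of each request carries no per-GiB charge

for throughput_mib_s in (1, 2, 8, 32):
    object_bytes = throughput_mib_s * 1024 * 1024 * FLUSH_INTERVAL_S
    billable = max(0, object_bytes - FREE_BYTES_PER_PUT)
    print(f"{throughput_mib_s:>3} MiB/s -> {object_bytes / 1024:7.0f} KiB per object, "
          f"{billable / 1024:7.0f} KiB billed per PUT")

# At 2 MiB/s each object is exactly 512 KiB; beyond that, every flush starts
# incurring per-GiB upload charges on the portion above the free allotment.
```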

Finally, S3 Express One Zone storage is 8x more expensive per GB than S3 Standard. In practice, this has no material impact for WarpStream: S3 Express One Zone is only used to land newly written data, and the WarpStream Agents subsequently compact that data out of S3 Express One Zone and into the S3 Standard storage class. This creates a form of tiered storage within S3 itself and ensures that WarpStream users get both the low latency of S3 Express One Zone for writes and the low storage costs of S3 Standard for long-term retention. The best of both worlds!
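
The mechanics of that tiering can be pictured with a deliberately simplified sketch: read recently landed objects out of the S3 Express One Zone directory bucket, merge them into one larger object in an S3 Standard bucket, and delete the originals. WarpStream’s actual compaction is more involved, so treat the bucket names, keys, and merge step below as illustrative only.

```python
import boto3

s3 = boto3.client("s3")

EXPRESS_BUCKET = "warp-data--use1-az4--x-s3"  # placeholder low-latency landing bucket
STANDARD_BUCKET = "warp-data-standard"        # placeholder S3 Standard bucket for long-term storage

def compact(keys: list[str], merged_key: str) -> None:
    # Read the small, recently written objects from S3 Express One Zone...
    parts = [
        s3.get_object(Bucket=EXPRESS_BUCKET, Key=key)["Body"].read()
        for key in keys
    ]
    # ...merge them into one larger object stored in S3 Standard...
    s3.put_object(Bucket=STANDARD_BUCKET, Key=merged_key, Body=b"".join(parts))
    # ...then delete the originals so the 8x-per-GB storage price is only paid briefly.
    for key in keys:
        s3.delete_object(Bucket=EXPRESS_BUCKET, Key=key)
```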

Flexibility with a single architecture

The best part about using WarpStream backed by S3 Express One Zone to reduce latency is that we didn’t have to modify WarpStream’s architecture at all. S3 Express One Zone provides our customers with a big knob they can turn to trade off between cost and latency on a per-workload basis, with the same architecture and the same code. This means WarpStream can be used for a wider variety of use cases, and our users benefit too, since they don’t have to maintain separate streaming systems for latency-sensitive and cost-sensitive workloads.

For example, many organizations use Kafka as one component in a complex chain of systems that ingest data. Take our CDN example from before. While the cache eviction use case is clearly latency sensitive, the same organization likely has another cluster ingesting application logs and telemetry data. That workload is not latency sensitive; if logs take an extra half-second to be written to a topic and consumed into a downstream system, there is no business impact. In this case, it makes a lot more sense to use S3 Standard. With WarpStream, you can choose between lower-latency clusters that cost more to run and higher-latency clusters that cost less, simply by choosing which Amazon S3 storage class backs your cluster.

Conclusion

WarpStream’s zero-disk architecture enables easy scaling and hands-off operations. Extending this design to use Amazon S3 Express One Zone lets WarpStream’s customers reduce the latency profile of their workloads, with the primary tradeoff being increased cost. Users can choose between clusters backed by S3 Express One Zone for lower latency at higher cost, and clusters backed by S3 Standard for higher latency at lower cost, depending on the use case. In both cases, WarpStream’s stateless design enables effortless scaling and minimal operational burden. With clusters backed by S3 Express One Zone, WarpStream’s customers can achieve 4x lower end-to-end latency (from producer to consumer) and run latency-sensitive streaming workloads on a platform built from the ground up for the cloud.

Contact us or visit our AWS Marketplace page to learn more about how you can leverage WarpStream’s cloud-native design to keep costs in check and simplify your real-time data operations. Or, if you’re ready to get started, sign up now and try it on your own for free. All new signups receive $400 in free credits, with no expiration and no credit card required to get started.

Richie Artoul

Richie Artoul is the co-founder and CEO of WarpStream, a drop-in replacement for Apache Kafka built entirely on object storage. Previously, he led the team that built Husky, Datadog's columnar event storage system.

Caleb Grillo

Caleb Grillo is a Product Manager at WarpStream. He has 7 years of experience working with streaming data systems and previously helped build a large-scale, fully managed service on top of Kafka.