AWS Compute Blog
Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox
Jakub Wojciak, Senior Software Development Engineer
Throughout our careers, we developers keep adding new tools to our development toolboxes. These range from the programming languages we learn, use, and become experts in, to architectural components such as HTTP servers, load balancers, and databases (both relational and NoSQL).
I’d like to kick off a series of posts to introduce you to the architectural components of messaging solutions. Expand your toolbox with this indispensable tool for building modern, scalable services and applications. In the coming months, I will update this post with links that dive deeper into each topic and illustrate messaging use cases using Amazon Simple Queue Service (SQS) and Amazon Simple Notification Service (SNS).
What is messaging?
Messaging involves passing messages around, but it’s different from email or text messages, because it is intended for communication between software components, not between people. Enterprise messaging happens at a higher level than that of UDP packets or direct TCP connections (although it does frequently use these protocols).
A message typically contains the payload — whatever information your application sends: XML, JSON, binary data, and so on. You can also add optional attributes and metadata to a message.
A SQL or NoSQL database often has a server that stores data. Similarly, a messaging server or service provides a place where your messages are stored temporarily and from which they are transmitted.
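To make the idea of a payload plus optional attributes concrete, here is a minimal sketch in Python. The `Message` class and the attribute names (`content-type`, `source`) are hypothetical and chosen only for illustration; they are not the message format of any particular messaging service.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Message:
    """Hypothetical message shape: a payload plus optional metadata."""

    body: str  # the payload: JSON, XML, encoded binary data, and so on
    attributes: Dict[str, str] = field(default_factory=dict)  # optional metadata


# A JSON payload with two illustrative attributes.
message = Message(
    body=json.dumps({"customer_id": 42, "city": "Seattle"}),
    attributes={"content-type": "application/json", "source": "website"},
)

# The receiver decodes the payload; attributes can be read without parsing it.
decoded = json.loads(message.body)
```

Keeping metadata outside the payload lets a receiver inspect it without having to understand the payload format.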
The queue and the topic
For a database service, the main resource is a table. In a messaging service, the two main resources are the queue and the topic.
A queue is like a buffer. You can put messages into a queue, and you can retrieve messages from a queue. The software that puts messages into a queue is called a message producer and the software that retrieves messages is called a message consumer.
A topic is like a broadcasting station. You can publish messages to a topic, and anyone interested in these messages can subscribe to the topic. Then, the interested parties are notified about the published messages. The software that publishes messages to a topic is called a topic publisher and the software that subscribes to a topic is called a topic subscriber.
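The difference between the two resources can be sketched with a toy, in-memory model. This is only an illustration under simplifying assumptions (one process, no persistence, no network); the class names `InMemoryQueue` and `InMemoryTopic` are made up for this sketch and are not part of Amazon SQS or Amazon SNS.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class InMemoryQueue:
    """Toy queue: each message is handed to exactly one consumer."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def send(self, body: str) -> None:
        # Producer side: append the message to the buffer.
        self._messages.append(body)

    def receive(self) -> Optional[str]:
        # Consumer side: take the oldest message, or None if the queue is empty.
        return self._messages.popleft() if self._messages else None


class InMemoryTopic:
    """Toy topic: every subscriber gets a copy of each published message."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, body: str) -> None:
        for callback in self._subscribers:
            callback(body)


queue = InMemoryQueue()
queue.send("update-address:42")
first = queue.receive()   # one consumer gets the message
second = queue.receive()  # nothing is left for anyone else

topic = InMemoryTopic()
website_inbox: List[str] = []
purchasing_inbox: List[str] = []
topic.subscribe(website_inbox.append)
topic.subscribe(purchasing_inbox.append)
topic.publish("stock-low:sku-123")  # both subscribers receive a copy
```

The key contrast is in the delivery: a queued message is consumed once, while a published message fans out to every subscriber.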
When should you use messaging?
There are some common use cases that might instantly make you think “I should use messaging for that!” Here are some of these use cases (to be discussed in greater detail in future posts).
- Service-to-service communication
You have two services or systems that need to communicate with each other. Let’s say a website (the frontend) has to update a customer’s delivery address in a customer relationship management (CRM) system (the backend). One option is to set up a load balancer in front of the backend CRM service and call its API actions directly from the frontend website. Alternatively, you can set up a queue, have the frontend website code send messages to the queue, and have the backend CRM service consume them.
- Asynchronous work item backlogs
You have a service that has to track a backlog of actions to be executed. Let’s say a hotel booking system needs to cancel a booking and this process takes a long time (from a few seconds to a minute). You can execute the cancellation synchronously, but then you risk annoying the customer who has to wait for the webpage to load. You can also track all pending cancellations in your database and keep polling and executing cancellations. Alternatively, you can put a message into a queue and have the same hotel booking system consume messages from that queue and perform asynchronous cancellations.
- State change notifications
You have a service that manages some resource and other services that receive updates about changes to those resources. Let’s say an inventory tracking system tracks products stocked in a warehouse. Whenever the stock is sold out, the website must stop offering that product. Whenever the stock is close to being depleted, the purchasing system must place an order for more items. Those systems can keep querying the inventory system to learn about these changes (or even directly examine the database—yuck!). Alternatively, the inventory system can publish notifications about stock changes to a topic and any interested program can subscribe to learn about those changes.
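As an illustration of the asynchronous work item backlog, the following sketch uses Python's standard `queue` and `threading` modules as a stand-in for a real message queue. The function names and booking IDs are hypothetical, and the slow cancellation is reduced to appending to a list. In a real system the producer and consumer would typically be separate processes talking to a messaging service.

```python
import queue
import threading
from typing import List

cancellation_queue: queue.Queue = queue.Queue()
cancelled: List[str] = []


def request_cancellation(booking_id: str) -> str:
    # Frontend path: enqueue the work item and return immediately.
    cancellation_queue.put(booking_id)
    return f"Cancellation of {booking_id} accepted"


def cancellation_worker() -> None:
    # Backend path: consume work items until the None sentinel arrives.
    while True:
        booking_id = cancellation_queue.get()
        try:
            if booking_id is None:
                return
            cancelled.append(booking_id)  # stand-in for the slow cancellation
        finally:
            cancellation_queue.task_done()


worker = threading.Thread(target=cancellation_worker, daemon=True)
worker.start()

reply = request_cancellation("booking-1001")  # returns without waiting
request_cancellation("booking-1002")

cancellation_queue.put(None)  # tell the worker to stop
cancellation_queue.join()     # wait until every item has been processed
worker.join()
```

The customer-facing call only enqueues the request, so the webpage can respond right away while the cancellation itself happens in the background.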
When should you not use messaging?
According to the law of the instrument, “If all you have is a hammer, everything looks like a nail.” In other words, it’s important to know when a particular technology won’t fit well with your use case. For example, you can store large binary files in a relational database, but you probably shouldn’t.
Messaging has its own set of commonly encountered anti-patterns (also to be discussed in greater detail in future posts).
- Message selection
It’s tempting to want to receive messages selectively from a queue: only those that match a particular set of attributes, or even an ad hoc logical query. For example, a service requests a message with a particular attribute because it contains a response to another message that the service sent out. This can lead to a scenario where there are messages in the queue that no one is polling for and that are never consumed. (Note: This problem doesn’t exist for message routing or filtering, which are evaluated when messages are sent to a destination queue or topic.)
- Very large messages or files
Most messaging protocols and implementations work best with reasonably sized messages (in the tens or hundreds of KBs). As message sizes grow, it’s best to use a dedicated file (or blob) storage system, such as Amazon S3, and pass a reference to an object in that store in the message itself. A dedicated file (or blob) store typically has much better support for transferring data in chunks, with the ability to retry or resume an upload or download from a particular fragment.
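Passing a reference instead of the data itself can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `blob_store` dictionary stands in for a blob store such as Amazon S3, the `MAX_INLINE_BYTES` threshold is an arbitrary value picked for the example rather than a service limit, and the envelope fields (`kind`, `key`, `size`) are hypothetical. Inline payloads are assumed to be UTF-8 text.

```python
import json
import uuid
from typing import Dict

MAX_INLINE_BYTES = 256 * 1024  # assumed threshold for this sketch only
blob_store: Dict[str, bytes] = {}  # stand-in for a blob store such as Amazon S3


def build_message(payload: bytes) -> str:
    """Return a JSON message that carries the payload inline or by reference."""
    if len(payload) <= MAX_INLINE_BYTES:
        # Small payload (assumed to be UTF-8 text): send it in the message.
        return json.dumps({"kind": "inline", "body": payload.decode("utf-8")})
    # Large payload: store it separately and send only a reference.
    key = f"payloads/{uuid.uuid4()}"
    blob_store[key] = payload
    return json.dumps({"kind": "reference", "key": key, "size": len(payload)})


def read_payload(message: str) -> bytes:
    """Resolve a message built by build_message back into the original bytes."""
    envelope = json.loads(message)
    if envelope["kind"] == "inline":
        return envelope["body"].encode("utf-8")
    return blob_store[envelope["key"]]


small = build_message(b"hello")
large = build_message(b"x" * (MAX_INLINE_BYTES + 1))
```

With this approach the message stays small regardless of the payload size, and the consumer fetches the large object only when it actually needs it.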
Key features of messaging systems
Messaging servers and services offer much more than just produce/consume or publish/subscribe functionality. Thus, although it might seem easy to create your own message passing implementation on top of your own data store, consider all the extra features that a full-fledged messaging service provides. Here are a few of these features (by no means all of them):
- Push or pull delivery
Most messaging services provide both options for consuming messages. Pull means that your consumer repeatedly asks the messaging service whether it has any new messages. Push means that the messaging service notifies you when a message is available. The notification about the new message might be a special packet sent over the messaging protocol. It might also be an HTTP call that the messaging service makes to your API endpoint. You can also use long polling, in which a pull request stays open until a message arrives or a timeout expires, so it combines aspects of push and pull.
- Dead-letter queues
What can your application do if a queue contains a message that you can’t process? Most messaging services allow you to configure a dead-letter queue for messages that you fail to process a certain number of times. This makes it easy to set them aside for further inspection without blocking the queue processing or spending CPU cycles on a message that can never be consumed successfully.
- Delay queues and scheduled messages
What if you want to postpone the processing of a particular message until a specific time? Many messaging services support setting a specific delivery time for a message. If you need to have a common delay for all messages, you can set up a delay queue.
- Ordering, priorities, duplicates
Messaging services provide you with a variety of options that affect the delivery of messages:
- A choice between ordered delivery with limited maximum throughput or unordered delivery with virtually unlimited throughput
- Message priorities, where a higher priority message can skip over other messages in the queue
- Transactionality or best-effort acknowledgments of messages
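The dead-letter queue behavior described above can be sketched with a small in-memory simulation. It assumes a hypothetical `MAX_RECEIVES` threshold of 3 and tracks receive counts by message body, which only works here because the example bodies are unique. A real messaging service tracks this for you and moves the message according to the policy you configure.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

MAX_RECEIVES = 3  # hypothetical threshold chosen for this sketch


def drain(
    messages: List[str],
    handler: Callable[[str], None],
    max_receives: int = MAX_RECEIVES,
) -> List[str]:
    """Process every message; return those moved to the dead-letter queue."""
    main_queue: Deque[str] = deque(messages)
    dead_letters: List[str] = []
    # Keyed by message body for simplicity; assumes bodies are unique.
    receive_counts: Dict[str, int] = {}

    while main_queue:
        message = main_queue.popleft()
        receive_counts[message] = receive_counts.get(message, 0) + 1
        try:
            handler(message)
        except Exception:
            if receive_counts[message] >= max_receives:
                dead_letters.append(message)  # set aside for inspection
            else:
                main_queue.append(message)  # make it available for a retry
    return dead_letters


attempts: List[str] = []


def handle(message: str) -> None:
    # Hypothetical handler: records each attempt and rejects one bad message.
    attempts.append(message)
    if message == "malformed":
        raise ValueError("cannot parse message")


dead = drain(["ok-1", "malformed", "ok-2"], handle)
```

The failing message is retried a bounded number of times and then set aside, so the healthy messages behind it are still processed.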
When designing your system with messaging in mind, ask yourself the following questions:
- Do you need to process messages exactly in the order in which they were sent?
- Could your application parallelize the workload and process messages out of order?
- Do you want your application to consume certain messages at a higher priority than other messages?
- What happens if your application fails to process a message midway? Can you handle processing the same message again?
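The last question, about processing the same message again, is often answered by making the consumer idempotent. The sketch below is one possible approach under stated assumptions: every message carries a unique ID, and the set of processed IDs lives in memory (a real consumer would need durable storage for it). The `IdempotentConsumer` class, the message IDs, and the stock figures are all hypothetical.

```python
from typing import Dict, Set


class IdempotentConsumer:
    """Applies each stock adjustment at most once, keyed by message ID."""

    def __init__(self, stock: Dict[str, int]) -> None:
        self.stock = dict(stock)
        self._processed_ids: Set[str] = set()

    def handle(self, message_id: str, sku: str, delta: int) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self._processed_ids:
            return False  # duplicate delivery: skip it
        self.stock[sku] = self.stock.get(sku, 0) + delta
        self._processed_ids.add(message_id)
        return True


consumer = IdempotentConsumer({"sku-123": 10})
results = [
    consumer.handle("m-1", "sku-123", -2),
    consumer.handle("m-1", "sku-123", -2),  # the same message delivered again
    consumer.handle("m-2", "sku-123", -1),
]
```

Because the duplicate is detected by its ID, redelivering a message after a failure does not change the stock count a second time.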
How can you get started?
Configuring and running your own messaging server takes extra effort before you can start using messaging. Instead, you can start to use message queues and topics today, using Amazon SQS and Amazon SNS. For more information, visit the following resources, and get started creating message queues and topics with just a few API actions: