Understanding message brokers. Learning the mechanics of messaging with ActiveMQ and Kafka. Chapter 1

Hi all!

I've started translating a small book:
"Understanding Message Brokers",
author: Jakub Korab, publisher: O'Reilly Media, Inc., date of publication: June 2017, ISBN: 9781492049296.

From the introduction to the book:
"... This book will teach you how to think about brokered messaging systems by comparing and contrasting two popular brokering technologies: Apache ActiveMQ and Apache Kafka. It will outline use cases and development incentives that have led their developers to take vastly different approaches to the same area of brokered messaging between systems. We'll take a look at these technologies from the ground up and highlight the impact of different design choices along the way. You will gain a deep understanding of both products, an understanding of how they should and should not be used, and an understanding of what to look out for when considering other messaging technologies in the future. ..."

Parts translated so far:
Chapter 1. Introduction
Chapter 3. Kafka

I will post completed chapters as they are translated.

CHAPTER 1

Introduction

Messaging between systems is one of the least understood areas of IT. As a developer or architect, you may be very familiar with various frameworks and databases. However, it is likely that you have only a passing familiarity with how broker-based messaging technologies work. If that's how you feel, don't worry, you're in good company.

People usually have very limited contact with messaging infrastructure. Often they connect to a system that was set up a long time ago, or download a distribution from the Internet, install it in production, and start writing code against it. Once the infrastructure is running in production, the results can be mixed: messages are lost on failure, sending doesn't work the way you expected, or brokers hang your producers or fail to deliver messages to your consumers.

Sound familiar?

A common scenario is that your messaging code works fine, for the time being. Until it stops working. This period lulls you into a false sense of security, which leads to even more code being built on false assumptions about the fundamental behavior of the technology. When things start to go wrong, you are faced with an uncomfortable truth: that you didn't really understand the underlying behavior of the product, or the trade-offs its authors chose, such as performance vs. reliability, or transactionality vs. horizontal scalability.

Without a deep understanding of how brokers work, people make seemingly reasonable claims about their messaging systems, such as:

  • The system will never lose messages
  • Messages will be processed in order
  • Adding consumers will make the system faster
  • Messages will only be delivered once

Unfortunately, some of these statements are based on assumptions that only apply under certain circumstances, while others are simply not true.

This book will teach you how to reason about brokered messaging systems by comparing and contrasting two popular broker technologies: Apache ActiveMQ and Apache Kafka. It will outline use cases and development incentives that have led their developers to take vastly different approaches to the same area of brokered messaging between systems. We'll take a look at these technologies from the ground up and highlight the impact of different design choices along the way. You will gain a deep understanding of both products, an understanding of how they should and should not be used, and an understanding of what to look out for when considering other messaging technologies in the future.

Before we start, let's go over the basics.

What Is a Messaging System, and Why Do We Need One?

In order for two applications to communicate with each other, they must first define an interface. Defining this interface involves choosing a transport or protocol, such as HTTP, MQTT, or SMTP, and agreeing on the format of the messages that the systems will exchange. This can be a strict process, such as defining an XML schema for the payload of an expense claim message, or it can be much less formal, such as an agreement between two developers that some part of an HTTP request will contain a client identifier.

As long as the format of messages and the order in which they are sent are consistent between systems, they will be able to communicate with each other without worrying about the implementation of the other system. The internals of these systems, such as the programming language or framework used, may change over time. As long as the contract itself is maintained, the interaction can continue unchanged on the other side. The two systems are effectively decoupled (separated) by this interface.

Messaging systems typically place an intermediary between the two interacting systems to further decouple (separate) the sender from the recipient or recipients. In this case, the messaging system allows the sender to send a message without knowing where the recipient is located, whether it is active, or how many instances of it are running.

Let's look at a couple of analogies for the kinds of problems a messaging system solves and introduce some basic terms.

Point-to-Point

Alexandra goes to the post office to send a package to Adam. She goes to the window and hands the package to the employee. The employee takes the package and gives Alexandra a receipt. Adam does not need to be at home when the package is sent. Alexandra is confident that the package will be delivered to Adam at some point in the future, and can go about her business. Later, at some point, Adam receives the package.

This is an example of the point-to-point messaging model. The post office here acts as a package distribution mechanism, ensuring that each package is delivered once. Using the post office separates the act of sending the package from the act of delivering it.

In classic messaging systems, the point-to-point model is implemented through queues. A queue acts as a FIFO (first in, first out) buffer that one or more consumers can subscribe to. Each message is delivered to only one of the subscribed consumers. Queues usually try to distribute messages fairly among their consumers.
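
The queue behavior described above can be sketched in a few lines of Python. This is only a toy in-memory model, not the API of ActiveMQ, Kafka, or any real broker: the class name `PointToPointQueue` and its methods are hypothetical, and "fair" distribution is assumed here to mean simple round-robin.

```python
from collections import deque
from itertools import cycle


class PointToPointQueue:
    """Toy in-memory queue (hypothetical): FIFO buffer, one consumer per message."""

    def __init__(self):
        self._buffer = deque()   # messages waiting for delivery, oldest first
        self._consumers = []     # subscribed consumer callbacks

    def subscribe(self, consumer):
        self._consumers.append(consumer)

    def send(self, message):
        self._buffer.append(message)

    def dispatch(self):
        """Hand out buffered messages in FIFO order, rotating across consumers."""
        if not self._consumers:
            return  # nobody is subscribed, so messages stay in the buffer
        rotation = cycle(self._consumers)
        while self._buffer:
            # Each message is passed to exactly one consumer.
            next(rotation)(self._buffer.popleft())


received = {"A": [], "B": []}
queue = PointToPointQueue()
queue.subscribe(received["A"].append)
queue.subscribe(received["B"].append)
for n in range(4):
    queue.send(f"order-{n}")
queue.dispatch()
print(received)  # {'A': ['order-0', 'order-2'], 'B': ['order-1', 'order-3']}
```

Every message ends up with exactly one of the two consumers, and the messages leave the buffer in the order they were sent.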

The term "durable" is applied to queues. Durability is a quality of service that guarantees that the messaging system will retain messages in the absence of active subscribers until a consumer subscribes to the queue and takes delivery of them.

Durability is often confused with persistence and, although the two terms are often used interchangeably, they serve different functions. Persistence determines whether the messaging system writes a message to some form of storage between receiving it and dispatching it to a consumer. Messages sent to a queue may or may not be persistent.

Point-to-point messaging is used when a use case calls for a message to be acted upon once only. Examples include depositing funds into an account or fulfilling a shipping order. We will discuss later why a messaging system by itself cannot provide once-only delivery, and why queues can at best provide an at-least-once delivery guarantee.

Publish-Subscribe

Gabriella dials in to a conference call. While she is connected, she hears everything the speaker says, along with the rest of the call participants. When she disconnects, she misses what is said. When she reconnects, she continues to hear what is being said.

This is an example of the publish-subscribe messaging model. The conference call acts as a broadcast mechanism. The person who is talking does not care how many people are currently on the call - the system ensures that anyone who is currently connected will hear what is being said.

In classic messaging systems, the publish-subscribe messaging model is implemented through topics. A topic provides the same kind of broadcast as the conference call. When a message is sent to a topic, it is distributed to all subscribed consumers.

Topics are usually nondurable. Much like a listener who cannot hear what is said on a conference call after dropping off, topic subscribers miss any messages that are sent while they are offline. For this reason, we can say that topics provide an at-most-once delivery guarantee for each consumer.
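
As a rough illustration of this broadcast behavior, here is a minimal in-memory sketch. The `Topic` class and its method names are hypothetical and do not correspond to any real broker API; the sketch assumes that nothing is stored for subscribers that are offline.

```python
class Topic:
    """Toy nondurable topic (hypothetical): broadcast to connected subscribers only."""

    def __init__(self):
        self._subscribers = {}  # subscriber name -> callback

    def subscribe(self, name, callback):
        self._subscribers[name] = callback

    def unsubscribe(self, name):
        self._subscribers.pop(name, None)

    def publish(self, message):
        # Every currently connected subscriber gets a copy; nothing is stored.
        for callback in list(self._subscribers.values()):
            callback(message)


heard = {"gabriella": [], "other": []}
topic = Topic()
topic.subscribe("gabriella", heard["gabriella"].append)
topic.subscribe("other", heard["other"].append)

topic.publish("temp=20.1")
topic.unsubscribe("gabriella")   # Gabriella drops off the call
topic.publish("temp=20.3")       # she never receives this one
topic.subscribe("gabriella", heard["gabriella"].append)
topic.publish("temp=20.2")

print(heard["gabriella"])  # ['temp=20.1', 'temp=20.2']
print(heard["other"])      # ['temp=20.1', 'temp=20.3', 'temp=20.2']
```

The subscriber that stayed connected received all three readings, while the one that went offline received each message at most once and lost the one sent in between.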

Publish-Subscribe messaging is typically used when the messages are informational in nature and the loss of a single message is not particularly significant. For example, a topic can transmit temperature readings from a group of sensors once per second. A system that is interested in the current temperature and that subscribes to a topic will not worry if it misses a message - another will arrive soon.

Hybrid Models

A store's website puts order messages onto a message "queue". The main consumer of these messages is the order fulfillment system. In addition, the auditing system must have copies of these order messages for later tracking. Neither system can afford to miss messages, even if the systems themselves are unavailable for some time. The website should not need to be aware of the other systems.

Use cases often require a mix of the publish-subscribe and point-to-point messaging models, such as when multiple systems need a copy of a message and both durability and persistence are required to prevent message loss.

In these cases, a destination (the general term for queues and topics) is required that distributes messages much like a topic, so that each message is sent to every distinct system interested in those messages, but in which each system can also define multiple consumers that share the incoming messages, which is more like a queue. The read semantics in this case are once for each interested party. These hybrid destinations often require durability, so that if a consumer disconnects, messages sent in the meantime are received when the consumer reconnects.
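
The hybrid behavior can be sketched as follows, using the store example above. This is a toy model under stated assumptions, not how ActiveMQ or Kafka implement it: `HybridDestination`, the notion of a named "group" per interested system, and the round-robin rotation inside a group are all hypothetical choices made for illustration.

```python
from collections import deque


class HybridDestination:
    """Toy destination (hypothetical): topic-like across groups, queue-like within one."""

    def __init__(self):
        self._groups = {}  # group name -> {"backlog": deque, "consumers": list}

    def register_group(self, group):
        # The backlog outlives consumer connections, which models durability.
        self._groups.setdefault(group, {"backlog": deque(), "consumers": []})

    def connect(self, group, consumer):
        self._groups[group]["consumers"].append(consumer)
        self._drain(group)  # deliver anything buffered while the group was offline

    def disconnect(self, group, consumer):
        self._groups[group]["consumers"].remove(consumer)

    def publish(self, message):
        for group in self._groups:  # one copy per interested system
            self._groups[group]["backlog"].append(message)
            self._drain(group)

    def _drain(self, group):
        state = self._groups[group]
        while state["backlog"] and state["consumers"]:
            consumer = state["consumers"].pop(0)  # rotate consumers within the group
            state["consumers"].append(consumer)
            consumer(state["backlog"].popleft())


fulfillment_1, fulfillment_2, audit = [], [], []
f1, f2, a = fulfillment_1.append, fulfillment_2.append, audit.append

orders = HybridDestination()
orders.register_group("fulfillment")
orders.register_group("audit")
orders.connect("fulfillment", f1)
orders.connect("fulfillment", f2)
orders.connect("audit", a)

orders.publish("order-1")
orders.disconnect("audit", a)  # the auditing system goes offline
orders.publish("order-2")      # kept in the audit backlog
orders.connect("audit", a)     # backlog is delivered on reconnect
orders.publish("order-3")

print(fulfillment_1, fulfillment_2)  # ['order-1', 'order-3'] ['order-2']
print(audit)                         # ['order-1', 'order-2', 'order-3']
```

Each order is read once per interested system: the two fulfillment consumers split the orders between them, while the auditing system still sees every order despite having been offline.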

Hybrid models are not new and can be applied to most messaging systems, including both ActiveMQ (via virtual or composite destinations that combine topics and queues) and Kafka (implicitly, as a fundamental property of its destination design).

Now that we have some basic terminology and an understanding of what a messaging system could be useful for, let's get into the details.

Translation by: tele.gg/middle_java

Next translated part: Chapter 3. Kafka

To be continued ...

Source: habr.com
