Service Discovery in distributed systems on the example of Consul. Alexander Sigachev

Below is a transcript of Alexander Sigachev's talk "Service Discovery in distributed systems using Consul as an example".

Hello everyone! I am Alexander Sigachev, and I work at Inventos. Today I will introduce you to the concept of Service Discovery, using Consul as an example.

What problems does Service Discovery solve? Service Discovery exists so that a new application can be connected to an existing environment at minimal cost. With Service Discovery, we can decouple a Docker container or a virtual service as much as possible from the environment in which it runs.

What does it look like? In the classic web example, there is a frontend that receives the user's request and routes it to a backend. In this example, the load balancer distributes requests across two backends.

Here we start a third instance of the application. When the application starts, it registers with Service Discovery. Service Discovery notifies the load balancer, which updates its configuration automatically, and the new backend starts receiving traffic. In the same way, backends can be added to or removed from rotation.
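The flow just described (register, notify, reconfigure) can be sketched as a toy in-memory model. This is not Consul's API: the classes ToyRegistry and ToyLoadBalancer, the service name "backend", and the addresses are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Backend = Tuple[str, int]  # (host, port)


class ToyRegistry:
    """In-memory stand-in for a Service Discovery server (not Consul itself)."""

    def __init__(self) -> None:
        self._services: Dict[str, List[Backend]] = defaultdict(list)
        self._watchers: Dict[str, List[Callable[[List[Backend]], None]]] = defaultdict(list)

    def watch(self, service: str, callback: Callable[[List[Backend]], None]) -> None:
        """Subscribe to changes in the backend list of one service."""
        self._watchers[service].append(callback)

    def register(self, service: str, backend: Backend) -> None:
        """Called by an application instance when it starts."""
        if backend not in self._services[service]:
            self._services[service].append(backend)
            self._notify(service)

    def deregister(self, service: str, backend: Backend) -> None:
        """Called when an instance stops or is taken out of rotation."""
        if backend in self._services[service]:
            self._services[service].remove(backend)
            self._notify(service)

    def _notify(self, service: str) -> None:
        # Hand every watcher its own copy of the current backend list.
        for callback in self._watchers[service]:
            callback(list(self._services[service]))


class ToyLoadBalancer:
    """Keeps its backend list in sync with the registry."""

    def __init__(self) -> None:
        self.backends: List[Backend] = []

    def on_change(self, backends: List[Backend]) -> None:
        self.backends = backends


registry = ToyRegistry()
balancer = ToyLoadBalancer()
registry.watch("backend", balancer.on_change)

registry.register("backend", ("10.0.0.1", 8080))
registry.register("backend", ("10.0.0.2", 8080))
registry.register("backend", ("10.0.0.3", 8080))    # the third instance starts
registry.deregister("backend", ("10.0.0.1", 8080))  # an instance is removed
print(balancer.backends)
```

The load balancer never asks who is alive; it only reacts to notifications, which is what lets backends come and go without manual configuration changes.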

What else is convenient to do with Service Discovery? Service Discovery can store nginx configs, certificates, and a list of active backend servers.

Service Discovery also allows you to detect failures. What schemes are possible for detecting them?

  • The application we have developed notifies Service Discovery itself that it is still operational.
  • Service Discovery, for its part, polls the application for availability.
  • Or a third-party script or application checks our application for availability and notifies Service Discovery either that everything is fine and the instance can serve traffic or, conversely, that something is wrong and this instance must be excluded from balancing.

Which scheme to use depends on the software we work with. For example, if we have just started developing a new project, we can easily build in the scheme where our application notifies Service Discovery, or set things up so that Service Discovery performs the checks.

If we inherited the application or it was developed by someone else, the third option fits: we write a handler, and the application is brought into our workflow automatically.
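The difference between these schemes can be sketched roughly as follows. This is a minimal model, not Consul code: TtlCheck, poll, the instance names, and the 15-second TTL are assumptions made for illustration, and timestamps are passed in explicitly so the example is deterministic.

```python
from typing import Callable, Dict


class TtlCheck:
    """Scheme 1: the application reports in; silence longer than the TTL means failure."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, instance: str, now: float) -> None:
        self._last_seen[instance] = now

    def is_healthy(self, instance: str, now: float) -> bool:
        last = self._last_seen.get(instance)
        return last is not None and now - last <= self.ttl_seconds


def poll(probes: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Schemes 2 and 3: someone else runs a probe against each instance.

    A probe that raises is treated the same as a probe that returns False.
    """
    results = {}
    for instance, probe in probes.items():
        try:
            results[instance] = bool(probe())
        except Exception:
            results[instance] = False
    return results


def broken_probe() -> bool:
    raise ConnectionError("connection refused")


ttl = TtlCheck(ttl_seconds=15)
ttl.heartbeat("app-1", now=0.0)
ttl.heartbeat("app-2", now=0.0)
ttl.heartbeat("app-1", now=10.0)          # app-2 has gone silent
print(ttl.is_healthy("app-1", now=20.0))  # last heartbeat was 10 s ago
print(ttl.is_healthy("app-2", now=20.0))  # 20 s of silence exceeds the TTL

print(poll({"app-1": lambda: True, "app-2": broken_probe}))
```

In the first scheme the application must be modified to send heartbeats, which is why it suits new projects; in the other two the probe lives outside the application, which is why they suit inherited code.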

Here is one example: reloading nginx as the load balancer. It uses consul-template, an optional utility that accompanies Consul. We describe a rule and say that we use a template (the Go template engine). When an event occurs, that is, when a notification arrives that something has changed, the configuration is regenerated and the reload command is sent to nginx. This is the simplest example: nginx is reconfigured on an event and reloaded.
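The regenerate-then-reload step can be illustrated with a small Python stand-in. consul-template itself uses Go templates; the functions render_upstream and regenerate, the upstream name, and the addresses below are hypothetical and only show the idea of rendering a config from the current backend list and reloading only when the result changes.

```python
from typing import List, Tuple


def render_upstream(name: str, backends: List[Tuple[str, int]]) -> str:
    """Render an nginx upstream block from the current list of backends."""
    lines = [f"upstream {name} {{"]
    for host, port in backends:
        lines.append(f"    server {host}:{port};")
    lines.append("}")
    return "\n".join(lines) + "\n"


def regenerate(current_config: str, name: str,
               backends: List[Tuple[str, int]]) -> Tuple[str, bool]:
    """Return the new config text and whether a reload is needed.

    A reload is only needed when the rendered text actually changed.
    """
    new_config = render_upstream(name, backends)
    return new_config, new_config != current_config


config, reload_needed = regenerate("", "app", [("10.0.0.1", 8080), ("10.0.0.2", 8080)])
print(config, end="")
print(reload_needed)

# The same backend list again: nothing changed, so no reload.
config, reload_needed = regenerate(config, "app", [("10.0.0.1", 8080), ("10.0.0.2", 8080)])
print(reload_needed)
```

In a real setup the rendered text would be written to the nginx configuration file and followed by a reload command such as `nginx -s reload`.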

What is Consul?

  • First of all, it is Service Discovery.
  • It has an availability check mechanism: Health Checking.
  • It also has a KV Store.
  • And it supports working across multiple data centers (Multi Datacenter).

What can all this be used for? In the KV Store we can keep configs, for example. With Health Checking we can check a local service and report its state. Multi Datacenter is used to build a map of services. For example, in Amazon, which has several zones, traffic can be routed optimally so that there are no unnecessary requests between data centers; such requests are billed separately from local traffic and have higher latency.

Let's look a little at the terms that are used in Consul.

  • Consul is a service written in Go. One of the advantages of a Go program is that it is a single binary file: you download it, run it from anywhere, and it has no dependencies.
  • Using command-line flags, we can start this service either in client mode or in server mode.
  • The "datacenter" attribute specifies which data center this server belongs to.
  • Consensus is based on the Raft protocol. If you are interested, you can read more about it on the Consul website. The protocol makes it possible to elect a leader and to determine which data is considered valid and available.
  • Gossip is a protocol for communication between nodes. The system is decentralized: within one data center, every node communicates with its neighbors, and information about the current state is passed from one to another. You could say it is gossip between neighbors.
  • LAN Gossip - local data exchange between neighbors within the same data center.
  • WAN Gossip - used when we need to synchronize information between two data centers. The information travels between nodes that are marked as servers.
  • RPC - allows requests to be made to the server through the client.

A few words about RPC. Let's say Consul is running as a client on a virtual machine or a physical server. We address it locally, and the local client then requests information from the server and synchronizes. Depending on the settings, the information can be served from the local cache or synchronized with the leader server.

Both schemes have pluses and minuses. Working with the local cache is fast. Working with data stored on the server takes longer, but we get more up-to-date information.
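The trade-off between the two read paths can be shown with a toy model. ToyLeader and ToyAgent are hypothetical classes, not part of Consul; they only mimic "answer from the local cache" versus "ask the leader".

```python
class ToyLeader:
    """Holds the authoritative data."""

    def __init__(self) -> None:
        self.data = {}


class ToyAgent:
    """Local agent that can answer from its cache or forward to the leader."""

    def __init__(self, leader: ToyLeader) -> None:
        self._leader = leader
        self._cache = {}

    def read(self, key: str, from_cache: bool):
        if from_cache and key in self._cache:
            return self._cache[key]         # fast, possibly outdated
        value = self._leader.data.get(key)  # extra round trip, up to date
        self._cache[key] = value
        return value


leader = ToyLeader()
agent = ToyAgent(leader)

leader.data["backend"] = ["10.0.0.1"]
first = agent.read("backend", from_cache=True)   # cache is empty, so the leader is asked
leader.data["backend"] = ["10.0.0.1", "10.0.0.2"]
stale = agent.read("backend", from_cache=True)   # still the old list
fresh = agent.read("backend", from_cache=False)  # the updated list
print(first, stale, fresh)
```

Which path is acceptable depends on how harmful a slightly outdated answer is for the caller.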

Shown graphically, here is a diagram from the Consul website. We see three servers running; one of them, marked with an asterisk, is the leader. In this example there are three clients, which communicate with each other locally over UDP/TCP. Information between data centers is transferred between the servers.

What API does Consul provide? Consul has two kinds of API for retrieving information.

The first is the DNS API. By default, Consul's DNS interface listens on port 8600. We can configure request forwarding and provide access through local resolving, through the local DNS. We can query by domain name and get the IP address in response.

The second is the HTTP API. We can request information about a specific service locally on port 8500 and get a JSON response with the server's IP address, host name, and registered port. Additional information can be passed via a token.
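As a rough illustration of both interfaces, the sketch below builds the DNS name for a service and parses a catalog response. The service name "web", the tag, the data center name, and the sample JSON are made up for the example; the name layout ([tag.]service.service[.datacenter].consul), the /v1/catalog/service/ path, and the response fields follow Consul's documented formats. Nothing here contacts a real agent.

```python
import json
from typing import List, Optional, Tuple


def dns_name(service: str, tag: Optional[str] = None,
             datacenter: Optional[str] = None) -> str:
    """Build the name to look up through Consul's DNS interface (port 8600)."""
    parts = [tag, service, "service", datacenter, "consul"]
    return ".".join(p for p in parts if p)


def catalog_url(service: str, agent: str = "http://127.0.0.1:8500") -> str:
    """URL of the HTTP catalog endpoint for one service on the local agent."""
    return f"{agent}/v1/catalog/service/{service}"


def parse_catalog(body: str) -> List[Tuple[str, int]]:
    """Extract (address, port) pairs from a catalog response.

    An empty ServiceAddress means the service uses the node's address.
    """
    endpoints = []
    for entry in json.loads(body):
        address = entry.get("ServiceAddress") or entry["Address"]
        endpoints.append((address, entry["ServicePort"]))
    return endpoints


# Hypothetical response body, trimmed to the fields used above.
SAMPLE = """[
  {"Node": "node-1", "Address": "10.0.0.1", "ServiceName": "web",
   "ServiceAddress": "", "ServicePort": 8080},
  {"Node": "node-2", "Address": "10.0.0.2", "ServiceName": "web",
   "ServiceAddress": "10.0.1.2", "ServicePort": 8081}
]"""

print(dns_name("web"))
print(dns_name("web", tag="v1", datacenter="dc1"))
print(catalog_url("web"))
print(parse_catalog(SAMPLE))
```

With a running agent, the same DNS name could be resolved with a tool such as dig pointed at port 8600, and the URL fetched with any HTTP client.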

What do you need to run Consul?

The first option is developer mode, which is enabled with a flag. The agent starts as a server and performs all functions by itself on one machine. This is convenient and fast, and almost no additional configuration is needed for a first start.

The second mode is running in production, and here the launch gets a bit trickier. If no Consul cluster exists yet, we must start the first machine in bootstrap mode; this machine takes over the duties of the leader. We then bring up a second server instance, telling it where the leader is, and then a third. Once three machines are up, we restart the first machine, the one running in bootstrap mode, in normal mode. The data is synchronized, and the initial cluster is up.

It is recommended to run three to seven instances in server mode, because as the number of servers grows, so does the time needed to synchronize information between them. The number of servers should be odd to provide a quorum.
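The arithmetic behind the odd-number recommendation is short enough to write out. A quorum is a strict majority of the servers, and the cluster keeps working as long as a quorum is alive; the function names below are only illustrative.

```python
def quorum(servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """How many servers can fail while a quorum still remains."""
    return servers - quorum(servers)


for n in range(1, 8):
    print(f"servers={n} quorum={quorum(n)} tolerated failures={failures_tolerated(n)}")
```

Going from three servers to four raises the quorum from two to three but still tolerates only one failure, so the extra even-numbered server adds synchronization cost without adding fault tolerance.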

How are Health Checks configured? In the Consul configuration directory, we write a check rule in JSON. The first option in this example checks the availability of the google.com domain, and we say that this check must be run at an interval of 30 seconds. This way we verify that our node has access to the external network.

The second option is to check the service itself. We use ordinary curl to query localhost on the specified port at an interval of 10 seconds.

The results of these checks are aggregated and passed to Service Discovery. Depending on availability, nodes are either excluded from or included in the list of available, correctly working machines.
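The two checks might look roughly like the definitions generated below. The slide's exact files are not reproduced in the transcript, so the ids, names, file layout, and port 5000 are assumptions; the "args" and "interval" fields follow the check definition format of recent Consul versions (older versions used a "script" string instead), and script checks have to be enabled in the agent configuration.

```python
import json

# Check 1: can this node reach the external network? Runs every 30 seconds.
external_check = {
    "check": {
        "id": "external-network",  # hypothetical id
        "name": "Ping google.com",
        "args": ["ping", "-c1", "google.com"],
        "interval": "30s",
    }
}

# Check 2: does the local service answer? Runs every 10 seconds.
# Port 5000 is a placeholder for "the specified port".
local_check = {
    "check": {
        "id": "local-service",  # hypothetical id
        "name": "Local service responds",
        "args": ["curl", "--fail", "--silent", "http://localhost:5000/"],
        "interval": "10s",
    }
}

for definition in (external_check, local_check):
    # Each document would be saved as its own .json file in the config directory.
    print(json.dumps(definition, indent=2))
```

For script checks, Consul interprets exit code 0 as passing, 1 as a warning, and any other code as a failure, which is why curl is run with --fail here.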

Consul also provides a web UI, which is enabled with a separate flag and is then available on the machine. It lets you view information and also make some changes.

In this example, the Tools tab is open. It shows three running services, one of which is Consul itself, along with the number of checks performed. There are also three data centers in which the machines are located.

This is an example of the "Nodes" tab. We see that node names are compound and include the data center. It also shows which services are running, and here we see that no tags are set. Tags can carry additional information that the developer can use to specify extra parameters.

You can also send information to Consul about the status of the disks, about the average load.

Questions

Question: We have a Docker container. How do we use it with Consul?

Answer: There are several approaches for Docker containers. One of the most common is to use a third-party Docker container that is responsible for registration. At startup, the Docker socket is passed into it, and all container registration and deregistration events are recorded in Consul.

Question: So does Consul itself run the Docker container?

Answer: No. We run the Docker container ourselves, and when configuring it we specify which socket to listen on. It is about the same as working with a certificate, when we pass in information about where things are located.

Question: So inside the Docker container that we are trying to connect to Service Discovery there must be some kind of logic that can give data to Consul?

Answer: Not exactly. When the container starts, we pass variables through the environment, say, the service name and the service port. The registration container picks up this information and enters it into Consul.

Question: I have another UI question. We have deployed the UI, for example, on a production server. What about security? Where is the data stored? Is there any way to accumulate data?

Answer: The UI only shows data from the database and from Service Discovery. We set the passwords in the settings ourselves.

Question: Can this be published on the Internet?

Answer: Consul starts on localhost by default. To publish it on the Internet, you would need to put some kind of proxy in front of it. We are responsible for the security rules ourselves.

Question: Does it give historical data out of the box? It is interesting to see the statistics on Health Checks. You can also diagnose problems if the server crashes frequently.

Answer: I'm not sure that a detailed history of the checks is available.

Question: It is not so much the current state that matters as the dynamics.

Answer: For analysis, yes.

Question: Is it better not to use Consul as Service Discovery for Docker?

Answer: I would not recommend using it. The purpose of the talk is to introduce the concept. Historically, Consul has come a long way, in my opinion, to reach version 1. Now there are more complete solutions, for example Kubernetes, which has all of this under the hood. Within Kubernetes, the Service Discovery role is taken by etcd. But I'm not as familiar with it as I am with Consul, so I decided to present Service Discovery using Consul as an example.

Question: Does the scheme with a leader server slow down the start of the application as a whole? And how does Consul choose a new leader if the current one is down?

Answer: There is a whole protocol described for this. If you are interested, you can read about it.

Question: Does Consul act as a full-fledged DNS server, with all requests going through it?

Answer: It does not act as a full-fledged server; it takes over a certain zone, which usually ends with service.consul. From there, resolution proceeds as usual. These are not the domain names we use in production but internal infrastructure names, which are usually hidden behind a caching server if we work via DNS.

Question: So if we want to access the database, we will in any case query Consul first to find this database, right?

Answer: Yes. If we work via DNS, it works the same as without Consul, when we use DNS names. Modern applications usually do not resolve the domain name on every request: once the connection is established, everything works and we hardly use the name again for a while. If the connection breaks, then yes, we ask again where our database is and connect to it.

P.S. Regarding health checks: Consul, like Kubernetes, determines the health of a service from the HTTP status code of the response.

200 OK for healthy
503 Service Unavailable for unhealthy
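A minimal sketch of both sides of this convention follows. The function names are hypothetical. The mapping in check_state follows the Consul documentation for HTTP checks, which treats any 2xx code as passing, 429 as a warning, and everything else as a failure; Kubernetes probes use a somewhat different range.

```python
def health_status(is_healthy: bool) -> int:
    """Status code a service's health endpoint would return."""
    return 200 if is_healthy else 503


def check_state(status_code: int) -> str:
    """How a Consul HTTP check interprets the response code."""
    if 200 <= status_code <= 299:
        return "passing"
    if status_code == 429:
        return "warning"
    return "critical"


print(check_state(health_status(True)))
print(check_state(health_status(False)))
```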

Sources:
https://www.consul.io/docs/agent/checks.html
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
https://thoslin.github.io/microservice-health-check-in-kubernetes/

Source: habr.com
