DEVOXX UK Conference. Choose a framework: Docker Swarm, Kubernetes or Mesos. Part 3

Docker Swarm, Kubernetes and Mesos are the most popular container orchestration frameworks. In his talk, Arun Gupta compares the following aspects of how Docker, Swarm, and Kubernetes work:

  • Local development.
  • Deployment features.
  • Multi-container applications.
  • Service discovery.
  • Service scaling.
  • Run-once jobs.
  • Integration with Maven.
  • Rolling update.
  • Create a Couchbase database cluster.

As a result, you will gain a solid understanding of what each orchestration tool has to offer and learn how to use these platforms effectively.

Arun Gupta is the Chief Technology Officer for open-source products at Amazon Web Services and has been building the Sun, Oracle, Red Hat, and Couchbase developer communities for over 10 years. He has extensive experience leading cross-functional teams that develop and implement strategies for marketing campaigns and programs. He has led Sun engineering teams, is a founding member of the Java EE team, and is the founder of the US Devoxx4Kids division. Arun Gupta is the author of over 2,000 posts on IT blogs and has spoken in over 40 countries.

DEVOXX UK Conference. Choose a framework: Docker Swarm, Kubernetes or Mesos. Part 1
DEVOXX UK Conference. Choose a framework: Docker Swarm, Kubernetes or Mesos. Part 2

Line 55 contains COUCHBASE_URI, which points to the database service; that service is also created in the Kubernetes configuration file. Line 2 shows kind: Service, the service I am creating, and its name, couchbase-service, appears on line 4. A few ports are listed below that.

The key lines are 6 and 7. In the service I say, "Hey, these are the labels I'm looking for!" These labels are nothing more than name-value pairs, and line 7 points to my couchbase-rs-pod application. Below them are the ports through which the pods carrying these labels are reached.

On line 19 I create a resource of kind ReplicaSet, line 31 contains the name of the image, and lines 24-27 hold the metadata associated with my pod. This metadata is exactly what the service is looking for and what the connection is established with. At the end of the file, lines 55-56 tie back to line 4, saying "use this service!".

So I start my service together with a replica set, and since each pod in the replica set carries the appropriate label, it is included in the service. From a developer's point of view, you simply access the service, which then uses the replica set you need.
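The label matching described above can be sketched in a few lines of Python. This is only an illustration, not the configuration file from the slide: the names couchbase-service, couchbase-rs-pod, and COUCHBASE_URI come from the talk, while the label key "app" and port 8091 are assumptions.

```python
# Sketch of how a Service finds the pods of a ReplicaSet by label.
# Assumptions: the label key "app" and port 8091 are illustrative only.

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "couchbase-service"},
    "spec": {
        # "These are the labels I'm looking for."
        "selector": {"app": "couchbase-rs-pod"},
        "ports": [{"name": "admin", "port": 8091}],
    },
}

# Labels carried by every pod created from the ReplicaSet's pod template.
pod_labels = {"app": "couchbase-rs-pod"}


def selects(selector, labels):
    """Return True if every name-value pair in the selector is on the pod."""
    return all(labels.get(key) == value for key, value in selector.items())


# The application pod only needs the service name, not any pod address.
wildfly_env = {"COUCHBASE_URI": service["metadata"]["name"]}

print(selects(service["spec"]["selector"], pod_labels))
print(wildfly_env)
```

The application never refers to an individual pod; it only knows the service name, and the service resolves to whichever pods currently carry the matching labels.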

I ended up with a WildFly pod that communicates with the database backend via the Couchbase service. I can also run a frontend made of several WildFly pods, which likewise communicates with the Couchbase backend through the Couchbase service.

Later we will look at how a service located outside the cluster, through its IP address, communicates with elements that are located inside the cluster and have an internal IP address.

So stateless containers are great, but how good is the idea of using stateful containers? Let's look at the settings for stateful, or persistent, containers. In Docker, there are 4 different approaches to where the data store lives that you should pay attention to. The first is Implicit Per-Container, which means that stateful containers such as Couchbase, MySQL, or MyDB all start with the default sandbox. That is, everything stored in the database is stored in the container itself. If the container goes away, the data goes with it.

The second is Explicit Per-Container, where you create a specific volume with the docker volume create command and store data in it. The third approach, Per-Host, relies on storage mapping: everything stored in the container is also written to the host. If the container fails, the data remains on the host. The last is Multi-Host, which spans several hosts and is advisable for production use. Suppose your application containers are running on a host, but you want to store your data somewhere on the network; automatic mapping for distributed systems is used for this.
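As a rough sketch, the four approaches map to Docker command lines like the ones built below. The volume name couchbase_data, the host path /srv/couchbase, the data directory /opt/couchbase/var, and the driver name example-driver are all assumptions chosen for illustration; a real multi-host setup would use the name of whichever volume plugin is installed.

```python
import shlex


def storage_commands(image="couchbase", data_dir="/opt/couchbase/var"):
    """Build example docker command lines for the four storage approaches.

    All names and paths are hypothetical placeholders.
    """
    return {
        # Data lives with the container and is lost when it is removed.
        "implicit-per-container": [
            ["docker", "run", "-d", image],
        ],
        # A named volume created up front and mounted into the container.
        "explicit-per-container": [
            ["docker", "volume", "create", "couchbase_data"],
            ["docker", "run", "-d", "-v", f"couchbase_data:{data_dir}", image],
        ],
        # A host directory mapped into the container.
        "per-host": [
            ["docker", "run", "-d", "-v", f"/srv/couchbase:{data_dir}", image],
        ],
        # A volume served by a plugin that talks to shared storage.
        "multi-host": [
            ["docker", "run", "-d", "--volume-driver", "example-driver",
             "-v", f"shared_data:{data_dir}", image],
        ],
    }


for approach, commands in storage_commands().items():
    print(approach)
    for argv in commands:
        print("  " + shlex.join(argv))
```

The commands are only printed here, not executed, so the sketch can be read and adapted without a running Docker host.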

Each of these methods uses a specific storage location. Implicit and Explicit Per-Container store data on the host at /var/lib/docker/volumes. With the Per-Host method, a directory on the host is mounted inside the container. For Multi-Host, solutions such as Ceph, GlusterFS, NFS, etc. can be used.

When a persistent container fails, the storage directory becomes inaccessible in the first two cases, while in the last two access is preserved. However, in the first case you can still reach the storage through the Docker host running in a virtual machine. In the second case the data is not lost either, because you created the storage explicitly.

When the host fails, the storage directory is unavailable in the first three cases; in the last case, communication with the storage is not interrupted. Finally, sharing the storage is ruled out completely in the first case and possible in the rest. In the second case, you can share storage depending on whether your database supports distributed storage. With Per-Host, data can be shared only on the given host, and with Multi-Host, sharing is provided by the cluster extension.

This should be taken into account when creating stateful containers. Another useful Docker tool is the Volume plugin, which works on the "batteries included but replaceable" principle. When you start a Docker container, it says, "Hey, by running a container with a database, you can store your data in this container!" This is the default behavior, but you can change it. The plugin allows you to use a network drive or something similar instead of storing the database inside the container. It includes a default host-based storage driver and allows containers to integrate with external storage systems such as Amazon EBS, Azure Storage, and GCE Persistent Disks.

The next slide shows the architecture of the Docker Volume plugin.

Blue indicates the Docker client and the Docker host it talks to; the host has a Local storage engine that provides storage for your containers. Green indicates the Plugin Client and Plugin Daemon, which are also connected to the host. They make it possible to store data in network storage on the Storage Backend of the type you need.

The Docker Volume plugin can be used with Portworx storage. The PX-Dev module is actually a container that you run that connects to a Docker host and makes it easy to store data on Amazon EBS.

The Portworx client allows you to monitor the status of various storage containers that are connected to your host. If you visit my blog, you can read how to get the most out of Portworx with Docker.

The concept of storage in Kubernetes is similar to Docker's and is represented by directories that are available to your container in a pod. They are independent of the lifetime of any container. The most common storage types available are hostPath, nfs, awsElasticBlockStore, and gcePersistentDisk. Let's take a look at how these storage types work in Kubernetes. Connecting them usually takes 3 steps.

The first is that someone on the infrastructure side, usually an administrator, provides you with persistent storage. There is a corresponding PersistentVolume configuration file for this. Next, the application developer writes a configuration file called a PersistentVolumeClaim (PVC), a request for storage, which says: "I have 50 GB of distributed storage provisioned, but so that other people can also use its capacity, I state in this PVC that I need only 10 GB for now". Finally, the third step is that your claim is mounted as a volume, and the application, whether it is a pod, a replica set, or something similar, starts using it. It is important to remember that this process consists of these 3 steps and allows for scaling.
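The three steps can be sketched as three small manifests, written here as Python dictionaries. This is an illustration under assumptions: the resource names, the access mode, and the EBS volume ID placeholder are hypothetical, and the can_bind function is a simplified check of size and access mode, not the matching logic Kubernetes itself runs.

```python
# Step 1: the administrator provisions storage (50 GB in the talk's example).
persistent_volume = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "couchbase-pv"},  # hypothetical name
    "spec": {
        "capacity": {"storage": "50Gi"},
        "accessModes": ["ReadWriteOnce"],
        # Placeholder volume ID; a real one comes from your AWS account.
        "awsElasticBlockStore": {"volumeID": "vol-PLACEHOLDER", "fsType": "ext4"},
    },
}

# Step 2: the developer claims only what the application needs (10 GB).
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "couchbase-pvc"},  # hypothetical name
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "10Gi"}},
    },
}

# Step 3: the pod mounts the claim as a volume.
pod_volumes = [
    {"name": "couchbase-data",
     "persistentVolumeClaim": {"claimName": claim["metadata"]["name"]}},
]

UNITS = {"Gi": 2**30, "Mi": 2**20}


def to_bytes(quantity):
    """Convert a quantity such as '10Gi' to bytes (only Gi and Mi handled)."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)


def can_bind(pv, pvc):
    """Simplified check: the request fits and the access modes are offered."""
    capacity = to_bytes(pv["spec"]["capacity"]["storage"])
    request = to_bytes(pvc["spec"]["resources"]["requests"]["storage"])
    modes_ok = set(pvc["spec"]["accessModes"]) <= set(pv["spec"]["accessModes"])
    return request <= capacity and modes_ok


print(can_bind(persistent_volume, claim))
```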

The next slide shows the architecture of a persistent Kubernetes container on AWS.

Inside the brown rectangle that represents the Kubernetes cluster, there is one master node and two worker nodes, shown in yellow. One of the worker nodes contains an orange pod, storage, a replication controller, and a green Docker Couchbase container. Inside the cluster, above the nodes, a purple rectangle indicates an externally accessible Service. This architecture is recommended for storing data on the device itself. If necessary, I can store my data in EBS outside the cluster, as shown on the next slide. This is a typical model for scaling. When applying it, however, you need to take the financial aspect into account: storing data somewhere on the network can be more expensive than on the host. When choosing containerization solutions, this is one of the weighty arguments.

Just like with Docker, you can use persistent Kubernetes containers with Portworx.

This is what current Kubernetes 1.6 terminology calls a "StatefulSet": a way of working with stateful applications that handles Pod stop and Graceful Shutdown events. In our case, such applications are databases. On my blog, you can read how to create a StatefulSet in Kubernetes using Portworx.

Let's talk about the development aspect. As I said, Docker has 2 versions, CE and EE. The first is the stable Community Edition, which is updated every 3 months, unlike the monthly updated version of EE. You can download Docker for Mac, Linux, or Windows. Once installed, Docker updates itself automatically, and it is very easy to get started with.

For Kubernetes, I prefer Minikube: it is a good way to get started with the platform by creating a cluster on a single node. For clusters of several nodes, the choice is wider: kops, kube-aws (CoreOS + AWS), and kube-up (outdated). If you're looking to run Kubernetes on AWS, I recommend joining the AWS SIG, which meets online every Friday and posts a variety of interesting content on working with Kubernetes on AWS.

Let's take a look at how Rolling Update works on these platforms. If there is a cluster of several nodes, then it uses a specific version of the image, for example, WildFly:1. A rolling update means that the image version is replaced with a new one sequentially on each node, one by one.

For this, the docker service update command is used with the service name. In it I specify the new image version, WildFly:2, and the option --update-parallelism 2. The number 2 means that the system updates 2 application instances at a time. The option --update-delay 10s then adds a 10-second pause, after which the next 2 instances are updated on 2 more nodes, and so on. This simple rolling update mechanism is provided to you as part of Docker.
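A small Python sketch can show both the command line and the batching it produces. The service name webapp and the image reference wildfly:2 are placeholders standing in for real names; the batching function only models the order of updates and does not talk to Docker.

```python
import shlex


def service_update_command(service, image, parallelism=2, delay="10s"):
    """Build the docker service update command line for a rolling update."""
    return [
        "docker", "service", "update",
        "--image", image,
        "--update-parallelism", str(parallelism),
        "--update-delay", delay,
        service,
    ]


def update_batches(tasks, parallelism):
    """Split tasks into the groups that are updated together."""
    return [tasks[i:i + parallelism] for i in range(0, len(tasks), parallelism)]


# "webapp" and "wildfly:2" are hypothetical placeholders.
print(shlex.join(service_update_command("webapp", "wildfly:2")))

for batch in update_batches(["task-1", "task-2", "task-3", "task-4", "task-5"], 2):
    print("update", batch, "then wait 10s")
```

With five tasks and a parallelism of 2, the update proceeds in three batches, with the configured delay between them.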

In Kubernetes, a rolling update works like this. The replication controller (rc) creates a set of replicas of one version, and each pod in this webapp-rc carries a label that is stored in etcd. When I need a pod, I go through the Application Service, which looks up the specified label in etcd and gives me that pod.

In this case, we have 3 pods in the replication controller running version 1 of the WildFly application. During the update, another replication controller is created in the background with the same name plus a suffix, -xxxxx, where each x is a random digit, and with the same labels. The Application Service now sees three pods with the old version of the application and three pods with the new version in the new replication controller. After that, the old pods are deleted, and the replication controller with the new pods is renamed and put into operation.
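The sequence described above can be modeled in plain Python. This is a toy simulation of the steps as the talk presents them, not a call into Kubernetes; the label key "app" and the image names are assumptions.

```python
import random
import string


def rolling_update(rc_name, replicas, old_image, new_image, labels, rng=None):
    """Simulate the update: temporary controller, overlap, cleanup, rename."""
    rng = rng or random.Random()
    suffix = "".join(rng.choice(string.digits) for _ in range(5))
    temp_name = f"{rc_name}-{suffix}"

    old_pods = [{"rc": rc_name, "image": old_image, "labels": dict(labels)}
                for _ in range(replicas)]
    new_pods = [{"rc": temp_name, "image": new_image, "labels": dict(labels)}
                for _ in range(replicas)]

    # Both generations carry the same labels, so the service sees all of them.
    during = old_pods + new_pods

    # The old pods are deleted and the new controller takes the original name.
    after = [dict(pod, rc=rc_name) for pod in new_pods]
    return temp_name, during, after


temp_name, during, after = rolling_update(
    "webapp-rc", 3, "wildfly:1", "wildfly:2", {"app": "webapp"})
print(temp_name)
print(len(during), "pods behind the service during the update")
print(len(after), "pods after the update")
```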

Let's move on to monitoring. Docker has many built-in monitoring commands. For example, the docker container stats command prints the status of containers to the console every second: CPU, disk, and network usage. The Docker Remote API provides data on how the client communicates with the server. It uses simple commands but is based on the Docker REST API. In this case, the words REST, Flash, Remote mean the same thing. When you communicate with the host, it's a REST API. The Docker Remote API allows you to get more information about running containers. My blog has the details of using this monitoring with Windows Server.

Monitoring system events with docker system events on a multi-host cluster lets you obtain data about a host crash or a container crash on a specific host, service scaling, and the like. Starting with Docker 1.20, it includes Prometheus support, which embeds endpoints into existing applications. This allows you to receive metrics over HTTP and display them in dashboards.
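Metrics exposed this way arrive as plain text, one sample per line, so they are easy to read programmatically. The sketch below parses a small hand-written sample in the Prometheus text format; the metric names are made up for illustration, and the parser ignores optional timestamps and other details of the full format.

```python
SAMPLE = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{code="200"} 1027
http_requests_total{code="500"} 3
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 12
"""


def parse_metrics(text):
    """Map each sample name (including its labels) to a float value.

    Simplification: assumes no trailing timestamp on sample lines.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


for name, value in parse_metrics(SAMPLE).items():
    print(name, value)
```

In practice the text would be fetched over HTTP from the metrics endpoint and handed to a dashboard rather than printed.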

Another monitoring feature is cAdvisor (short for container Advisor). It analyzes and provides resource usage and performance data from running containers, providing Prometheus metrics out of the box. The peculiarity of this tool is that it only provides data for the last 60 seconds. Therefore, you need to provide the ability to collect this data and place it in a database in order to be able to monitor a long-term process. It can also be used to graphically display metrics on a dashboard using Grafana or Kibana. My blog has a detailed description of how to use cAdvisor to monitor containers using the Kibana dashboard.
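The need to copy short-lived data into a long-term store can be illustrated with a toy model. The RecentStats class below imitates a source that keeps only the last 60 seconds of samples, and archive copies anything new into a list standing in for a database. None of this is cAdvisor's own API; the class and function names are hypothetical.

```python
from collections import deque


class RecentStats:
    """Keeps only samples from the last `window_seconds` seconds."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.samples = deque()

    def add(self, timestamp, cpu_percent):
        self.samples.append((timestamp, cpu_percent))
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= timestamp - self.window:
            self.samples.popleft()


def archive(recent, store):
    """Append samples newer than the last archived one to the store."""
    last = store[-1][0] if store else float("-inf")
    for sample in recent.samples:
        if sample[0] > last:
            store.append(sample)


recent = RecentStats()
store = []  # stands in for a long-term database
for second in range(120):
    recent.add(second, cpu_percent=10.0)
    if second % 60 == 59:  # archive once per window, before data expires
        archive(recent, store)

print(len(recent.samples), "samples still held by the source")
print(len(store), "samples in long-term storage")
```

Archiving at least once per window is what keeps the long-term record complete; a longer interval would leave gaps.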

The next slide shows what the result of the Prometheus endpoint looks like and the metrics available for display.

At the bottom left you see the metrics of HTTP requests, responses, etc., on the right - their graphical display.

Kubernetes also contains built-in monitoring tools. This slide shows a typical cluster containing one master and three worker nodes.

Each of the worker nodes contains an automatically launched cAdvisor. In addition, there is Heapster, a performance monitoring and metrics system that is compatible with Kubernetes version 1.0.6 and higher. Heapster collects not only performance data for workloads, pods, and containers, but also events and other signals generated by the entire cluster. To collect data, it communicates with the Kubelet on each node, automatically stores the information in the InfluxDB database, and displays it as metrics in the Grafana dashboard. However, please note that if you are using Minikube, this feature is not available by default, so you will have to use addons for monitoring. So it all depends on where you run your containers: some monitoring tools are available by default, and others need to be installed as separate add-ons.

The next slide shows the Grafana dashboards that display the operational status of my containers. There is a lot of interesting data here. Of course, there are many commercial monitoring tools for Docker and Kubernetes, such as SysDig, DataDog, and NewRelic. Some of them have a 30-day free trial period, so you can try them and find the one that suits you best. Personally, I prefer SysDig and NewRelic, which integrate well with Kubernetes. There are also tools that integrate equally well with both the Docker and Kubernetes platforms.

Source: habr.com