3-way merge in werf: deploy to Kubernetes with Helm “on steroids”

Something that we (and not only we) have been waiting for a long time has happened: werf, our Open Source utility for building applications and delivering them to Kubernetes, now supports applying changes using 3-way merge patches! In addition to this, it is now possible to adopt existing K8s resources into Helm releases without recreating those resources.


In short: we set WERF_THREE_WAY_MERGE_MODE=enabled and get deployments that behave like kubectl apply, stay compatible with existing Helm 2 installations, and even go a little further.

But let's start with the theory: what are 3-way merge patches anyway, how did people arrive at the idea of generating them, and why do they matter in CI/CD processes with Kubernetes-based infrastructure? After that, let's look at how 3-way merge works in werf, which modes are enabled by default, and how to manage them.

What is a 3-way merge patch?

So, let's start with the task of rolling out the resources described in the YAML manifests to Kubernetes.

To work with Kubernetes resources, the API offers the following basic operations: create, patch, replace, and delete. These are the building blocks from which a convenient continuous rollout of resources to the cluster is supposed to be constructed. How?

kubectl imperative commands

The first approach to managing objects in Kubernetes is to use kubectl imperative commands to create, modify, and delete those objects. Simply put:

  • with the kubectl run command you can create a Deployment or Job:
    kubectl run --generator=deployment/apps.v1 DEPLOYMENT_NAME --image=IMAGE
  • with the kubectl scale command you can change the number of replicas:
    kubectl scale --replicas=3 deployment/mysql
  • etc.

This approach may seem convenient at first glance. However, there are problems:

  1. It is hard to automate.
  2. How to reflect the configuration in Git? How to review the changes happening in the cluster?
  3. How to ensure the configuration is reproducible when re-run?
  4. ...

It is clear that this approach does not fit well with storing infrastructure as code (IaC) alongside the application code — or with GitOps, a more modern option gaining popularity in the Kubernetes ecosystem. Therefore, these kubectl commands did not receive further development.

create, get, replace, and delete operations

Initial creation is simple: send the manifest to the create operation of the kube API and the resource is created. The YAML representation of the manifest can be stored in Git and the resource created with the command kubectl create -f manifest.yaml.

Deletion is also simple: we pass the same manifest.yaml from Git to the command kubectl delete -f manifest.yaml.

The replace operation allows you to completely replace the resource configuration with a new one, without recreating the resource. This means that before changing a resource, it is logical to fetch the current version with the get operation, change it, and update it with replace. kube-apiserver has optimistic locking built in: if the object has changed since the get, the replace operation will fail.

To store the configuration in Git and update using replace, you need to perform a get, merge the config from Git with what you received, and execute replace. Out of the box, kubectl only offers the command kubectl replace -f manifest.yaml, where manifest.yaml is an already fully prepared (in our case, merged) manifest that needs to be applied. It turns out the user has to implement the manifest merge themselves, and that is far from trivial...

It is also worth noting that although manifest.yaml is stored in Git, we cannot know in advance whether the object should be created or updated — that decision has to be made by the user's tooling as well.

To sum up: can we build a continuous rollout using only create, replace, and delete, while keeping the infrastructure configuration in Git alongside the code and having convenient CI/CD?

In principle, we can... To do so, we would need to implement a manifest merge operation and some glue logic that:

  • checks for the presence of an object in the cluster,
  • performs initial resource creation,
  • updates or deletes it.

When updating, keep in mind that the resource may have changed since the last get, so the optimistic-locking case has to be handled automatically, with repeated attempts to update.
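A rough shell sketch of such glue logic might look like this (merge_manifests here is a hypothetical helper — the merge itself is exactly the non-trivial part, and the manifest path is illustrative):

    # Create the object if it is absent; otherwise do get -> merge -> replace,
    # retrying when kube-apiserver rejects the replace due to optimistic locking.
    if ! kubectl get -f manifest.yaml > /dev/null 2>&1; then
      kubectl create -f manifest.yaml
    else
      until kubectl get -f manifest.yaml -o yaml > current.yaml \
            && merge_manifests current.yaml manifest.yaml > merged.yaml \
            && kubectl replace -f merged.yaml; do
        echo "replace rejected (the resource changed since get), retrying..."
      done
    fi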

However, why reinvent the wheel when kube-apiserver offers another way to update resources: patch, which removes some of the described problems from the user?

Patch

So we got to the patches.

Patches are the primary way to apply changes to existing objects in Kubernetes. The patch operation works like this:

  • the user sends kube-apiserver a patch in JSON form and specifies the object,
  • and the apiserver takes care of the current state of the object, bringing it to the required form.

Optimistic locking is not required in this case. This operation is more declarative than replace, although it may seem the other way around at first.

Thus:

  • through the operation create we create an object according to the manifest from Git,
  • through delete - delete if the object is no longer required,
  • through patch - we change the object, bringing it to the form described in Git.

However, to do this, you need to generate the correct patch!
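For example, a strategic merge patch sent via kubectl might look like this (the Deployment and container names are purely illustrative):

    # Ask kube-apiserver to bring the image field to the desired value;
    # the server itself merges this fragment into the current state of the object.
    kubectl patch deployment myapp --type strategic \
      -p '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","image":"ubuntu:19.04"}]}}}}'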

How patches work in Helm 2: 2-way-merge

When installing a release for the first time, Helm performs an operation create for chart resources.

When updating a Helm release, for each resource Helm:

  • computes a patch between the resource version from the previous chart and the version from the current chart,
  • applies this patch.

We will call such a patch a 2-way-merge patch, because 2 manifests are involved in its creation:

  • the resource manifest from the previous release,
  • the resource manifest from the current chart.

For deletion, the delete operation in kube-apiserver is called for resources that were declared in the previous release but are not declared in the current one.

The 2-way-merge patch approach has a problem: it leads to desynchronization between the real state of the resource in the cluster and the manifest in Git.

Illustration of the problem with an example

  • In Git, the chart stores a manifest in which the image field of a Deployment has the value ubuntu:18.04.
  • A user changed the value of this field to ubuntu:19.04 via kubectl edit.
  • On the next deploy, Helm does not generate a patch, because the image field is the same in the previous release version and in the current chart.
  • After the re-deploy, the image remains ubuntu:19.04, although the chart says ubuntu:18.04.

We got desynchronization and lost declarativeness.

What is a synchronized resource?

Generally speaking, it is impossible to achieve a complete match between the resource manifest in a running cluster and the manifest from Git: the real manifest may contain service annotations/labels, additional containers, and other data that is added to and removed from the resource dynamically by various controllers. We cannot and do not want to keep this data in Git. However, we do want the fields that we explicitly specified in Git to take on the corresponding values on rollout.

This gives us a general synchronized resource rule: on rollout, you may change or remove in a resource only those fields that are explicitly specified in the manifest from Git (or were specified in the previous version and have now been removed).

3-way merge patch

The main idea of the 3-way merge patch: we generate a patch between the last applied version of the manifest from Git and the target version of the manifest from Git, taking into account the current version of the manifest from the running cluster. The resulting patch must comply with the synchronized resource rule:

  • new fields added to the target version are added with a patch;
  • fields that previously existed in the last applied version and do not exist in the target version are reset using the patch;
  • fields in the current version of the object that differ from the target version of the manifest are updated with the patch.

This is exactly the principle by which kubectl apply generates its patches:

  • the last applied version of the manifest is stored in the annotation of the object itself,
  • the target one is taken from the specified YAML file,
  • the current one is taken from the running cluster.
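The first of these three sources is easy to see for yourself: kubectl apply keeps it in the kubectl.kubernetes.io/last-applied-configuration annotation, and it can be printed like this (the resource name is illustrative):

    # Show the last applied manifest that kubectl apply stored on the object
    kubectl apply view-last-applied deployment/myapp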

Now that we have dealt with the theory, it's time to tell what we did in werf.

Applying changes in werf

Previously, werf, like Helm 2, used 2-way-merge patches.

repair patch

To switch to the new type of patches — 3-way merge — the first step we took was to introduce so-called repair patches.

When deploying, a standard 2-way-merge patch is used, but werf additionally generates a patch that would synchronize the real state of the resource with what is written in Git (such a patch is created using the same synchronized resource rule described above).

If things are out of sync, at the end of the deployment the user receives a WARNING with the corresponding message and a patch that must be applied to bring the resource to the synchronized state. This patch is also written to the special annotation werf.io/repair-patch. It is assumed that the user will apply this patch themselves: werf will not apply it on its own.
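For example, the repair patch can be read straight from the resource (a sketch; the resource name is illustrative):

    # Print the repair patch that werf has written to the resource annotation
    kubectl get deployment myapp \
      -o jsonpath='{.metadata.annotations.werf\.io/repair-patch}'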

The generation of repair patches is a temporary measure that allows you to actually test the creation of patches according to the 3-way-merge principle, but these patches are not automatically applied. At the moment, this mode of operation is enabled by default.

3-way-merge patch for new releases only

Starting December 1, 2019, alpha and beta versions of werf use full 3-way-merge patches by default to apply changes, but only for new Helm releases rolled out via werf. Existing releases continue to use the 2-way-merge + repair-patch approach.

This mode of operation can already be enabled explicitly by setting WERF_THREE_WAY_MERGE_MODE=onlyNewReleases.

Note: the feature took shape over several releases: it became ready in the alpha channel starting with v1.0.5-alpha.19, and in the beta channel with v1.0.4-beta.20.

3-way-merge patch for all releases

Starting December 15, 2019, beta and alpha versions of werf start using full 3-way merge patches by default to apply changes for all releases.

This mode of operation can already be enabled explicitly by setting WERF_THREE_WAY_MERGE_MODE=enabled.
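For example (a minimal sketch; the deploy flags depend on your project and are given here only for illustration):

    # Enable full 3-way-merge patches for this deployment run
    export WERF_THREE_WAY_MERGE_MODE=enabled
    werf deploy --stages-storage :local --env production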

What about autoscaling resources?

There are 2 types of autoscaling in Kubernetes: HPA (horizontal) and VPA (vertical).

The horizontal one automatically adjusts the number of replicas, the vertical one the amount of resources. Both the number of replicas and the resource requirements are specified in the resource manifest (spec.replicas, spec.containers[].resources.limits.cpu, spec.containers[].resources.limits.memory, and others).

Problem: if a user configures a resource in the chart so that it contains specific values for resources or replicas, and autoscalers are enabled for this resource, then on each deploy werf will reset these values to what is written in the chart manifest.

There are two solutions to the problem. First, it is best to avoid explicitly specifying autoscaled values in the chart manifest. If this option is not suitable for some reason (for example, because it is convenient to set initial resource limits and the number of replicas in the chart), then werf offers the following annotations:

  • werf.io/set-replicas-only-on-creation=true
  • werf.io/set-resources-only-on-creation=true

With such an annotation, werf will not reset the corresponding values on each deploy, but will only set them when the resource is initially created.
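A minimal sketch of what this looks like in a chart template (the names, image, and initial values are illustrative):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
      annotations:
        "werf.io/set-replicas-only-on-creation": "true"
    spec:
      replicas: 2            # used only at initial creation; afterwards the HPA is in charge
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
          - name: myapp
            image: ubuntu:18.04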

For details, see the project documentation for HPA and VPA.

Disable 3-way-merge patch

The user can still disable the new patches in werf with the environment variable WERF_THREE_WAY_MERGE_MODE=disabled. However, starting March 1, 2020, this option will no longer have any effect, and only 3-way-merge patches will be used.

Adoption of resources in werf

Mastering the 3-way-merge way of applying changes allowed us to immediately implement a feature such as adopting resources that already exist in the cluster into a Helm release.

Helm 2 has a problem: you cannot add a resource that already exists in the cluster to the chart manifests without recreating this resource from scratch (see #6031, #3275). We taught werf to adopt existing resources into a release. To do this, you need to set an annotation on the current version of the resource in the running cluster (for example, using kubectl edit):

"werf.io/allow-adoption-by-release": RELEASE_NAME

Now the resource needs to be described in the chart, and the next time werf deploys the release with the corresponding name, the existing resource will be adopted into this release and remain under its control. Moreover, in the process of adopting the resource, werf will bring its current state in the running cluster to the state described in the chart, using the same 3-way-merge patches and the synchronized resource rule.

Note: setting WERF_THREE_WAY_MERGE_MODE does not affect the adoption of resources - in the case of adoption, a 3-way-merge patch is always used.

Details can be found in the documentation.

Conclusions and future plans

I hope that after this article it has become clearer what 3-way-merge patches are and why they were invented. From the practical standpoint of werf's development, their implementation was another step towards improving Helm-like deployments. You can now forget about the configuration synchronization problems that often arose when using Helm 2. In addition, a useful new feature was added: adopting already existing Kubernetes resources into a Helm release.

There are still some problems and difficulties in Helm-like deployments, such as the use of Go templates, and we will continue to solve them.

Information about resource update methods and adoption can also be found at this documentation page.

Helm 3

The new major version of Helm — v3 — released just the other day deserves special mention: it also uses 3-way-merge patches and gets rid of Tiller. The new version of Helm requires migrating existing installations to convert them to the new release storage format.

Werf, for its part, has already gotten rid of Tiller, switched to 3-way merge, and added even more, all while remaining compatible with existing Helm 2 installations (no migration scripts needed). So until werf switches to Helm 3, werf users do not lose the main advantages of Helm 3 over Helm 2 (werf has them too).

However, switching werf to the Helm 3 codebase is inevitable and will happen in the near future. Presumably this will be werf 1.1 or werf 1.2 (at the moment, the main version of werf is 1.0; for more information about werf's versioning scheme, see here). During this time, Helm 3 will have had time to stabilize.


Source: habr.com
