GitOps: Comparing Pull and Push Methods

Translator's note: In the Kubernetes community, a trend called GitOps is clearly gaining popularity, as we saw for ourselves at KubeCon Europe 2019. The term was coined relatively recently by the head of Weaveworks, Alexis Richardson, and means using tools familiar to developers (primarily Git, hence the name) to solve operational problems. In particular, it refers to operating Kubernetes by storing its configuration in Git and automatically rolling out changes to the cluster. In this article, Matthias Jg describes two approaches to this rollout.

Last year (formally, this happened in August 2017; translator's note), a new approach to deploying applications in Kubernetes emerged. It's called GitOps, and it rests on the basic idea that deployments are versioned in the secure environment of a Git repository.

The main advantages of this approach are the following:

  1. Deployment versioning and change history. The state of the entire cluster is stored in a Git repository, and deployments are updated only by commits. In addition, all changes can be tracked using the commit history.
  2. Rollbacks using familiar Git commands. A plain git reset lets you undo changes to deployments; past states are always available.
  3. Ready-made access control. A Git system usually holds a lot of sensitive data, so most companies pay special attention to protecting it. This protection therefore also covers operations on deployments.
  4. Policies for deployments. Most Git systems natively support per-branch policies: for example, only pull requests can update master, and changes must be reviewed and approved by another team member. As with access control, the same policies apply to deployment updates.
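
The first two points can be illustrated with a minimal Python sketch. It models a repository as a list of commits, each holding the desired cluster state. The `Repo` class and its methods are hypothetical and only mimic the effect of `git commit` and `git reset --hard`; no real Git library is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy model: every commit stores the full desired cluster state."""
    commits: list = field(default_factory=list)

    def commit(self, state: dict) -> int:
        # Store a copy so that later mutations cannot rewrite history.
        self.commits.append(dict(state))
        return len(self.commits) - 1  # the index serves as a commit id

    def head(self) -> dict:
        return dict(self.commits[-1])

    def reset(self, commit_id: int) -> None:
        # Mimics `git reset --hard <commit>`: drop every later commit.
        del self.commits[commit_id + 1:]


repo = Repo()
v1 = repo.commit({"web": "shop:1.0"})
repo.commit({"web": "shop:1.1"})   # a release that turns out to be bad
repo.reset(v1)                     # roll back to the known-good state
print(repo.head())
```

On a shared branch, teams often prefer a reverting commit over a reset so that the history itself is preserved; the sketch follows the reset wording used in the list above.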

As you can see, the GitOps method has many advantages. Two approaches have gained particular popularity over the past year. One is based on push, the other is based on pull. Before looking at them, let's first see what typical Kubernetes deployments look like.

Deployment methods

Over the past few years, various methods and tools for deployment have become established in Kubernetes:

  1. Based on native Kubernetes templates/Kustomize. This is the easiest way to deploy applications to Kubernetes. The developer creates basic YAML files and applies them. Kustomize was developed to avoid constantly rewriting the same templates (it turns Kubernetes templates into modules). Translator's note: Kustomize was integrated into kubectl in the Kubernetes 1.14 release.
  2. Helm charts. Helm charts allow you to create sets of templates, init containers, sidecars, etc. that are used to deploy applications with more customization options than the template-based approach. This method is based on templated YAML files. Helm fills them in with various parameters and then sends them to Tiller, a cluster component that deploys them in the cluster and allows updates and rollbacks to be performed. The important thing is that, in essence, Helm simply inserts the desired values into templates and then applies them in the same way as in the traditional approach. (Read more about how it all works and how you can use it in our article on Helm; translator's note.) There is a wide variety of ready-made Helm charts covering a broad range of tasks.
  3. Alternative Tools. There are many alternative tools. What they all have in common is that they turn some template files into Kubernetes-readable YAML files and then apply them.
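
What these tools share, turning a template plus values into plain Kubernetes YAML, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the template, the `render` helper, and the image name are hypothetical, and real Helm charts use Go templates rather than the `$`-placeholders of Python's `string.Template`.

```python
from string import Template

# Hypothetical template; real Helm charts use Go templates, not $-placeholders.
DEPLOYMENT_TEMPLATE = Template("""\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $name
spec:
  replicas: $replicas
  selector:
    matchLabels:
      app: $name
  template:
    metadata:
      labels:
        app: $name
    spec:
      containers:
        - name: $name
          image: $image
""")


def render(values: dict) -> str:
    """Fill the template; raises KeyError if a value is missing."""
    return DEPLOYMENT_TEMPLATE.substitute(values)


manifest = render({
    "name": "shop",
    "replicas": 2,
    "image": "registry.example.com/shop:1.0",
})
print(manifest)
```

The rendered text is what ultimately gets applied to the cluster, regardless of which templating tool produced it.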

In our work, we constantly use Helm charts for important tools (because so much is already available in them, which makes life much easier) and "pure" Kubernetes YAML files for deploying our own applications.

Pull & Push

In one of my recent blog posts, I introduced Weave Flux, a tool that lets you commit templates to a Git repository and updates the deployment after every commit or container push. In my experience, it is one of the main tools driving the pull approach, so I will refer to it often. If you want to learn more about how to use it, see that post.

NB! All the benefits of using GitOps are retained for both approaches.

Pull-based approach

In the pull approach, all changes are applied from within the cluster. An operator running inside the cluster regularly checks the associated Git and Docker Registry repositories. If anything changes in them, the state of the cluster is updated from the inside. This process is generally considered very secure, since no external client has cluster administrator rights.
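
The core of such an operator is a reconciliation step that compares the state declared in Git with the state running in the cluster. The following is a minimal sketch, not how Weave Flux is implemented: the functions `reconcile` and `apply_actions` are hypothetical, both states are reduced to a mapping from deployment name to image, and deleting resources that are missing from Git is an assumption (real tools may make this optional).

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Return the actions needed to make `actual` match `desired`.

    Both arguments map a deployment name to its container image.
    """
    actions = []
    for name, image in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, image))
        elif actual[name] != image:
            actions.append(("update", name, image))
    # Assumption: anything not declared in Git is removed from the cluster.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name, actual[name]))
    return actions


def apply_actions(actions: list, actual: dict) -> dict:
    """Apply the actions to a copy of the cluster state and return it."""
    state = dict(actual)
    for verb, name, image in actions:
        if verb == "delete":
            state.pop(name)
        else:
            state[name] = image
    return state


# State declared in Git versus state currently running in the cluster.
git_state = {"api": "api:2.0", "web": "web:1.1"}
cluster_state = {"web": "web:1.0", "legacy": "legacy:0.9"}

actions = reconcile(git_state, cluster_state)
cluster_state = apply_actions(actions, cluster_state)
print(actions)
print(cluster_state == git_state)
```

A real operator would run this comparison on a timer and talk to the Kubernetes API instead of a dictionary; once both states match, a further pass produces no actions.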

Pros:

  1. No external client has the right to make changes to the cluster; all updates are rolled out from within.
  2. Some tools also allow you to synchronize Helm chart updates and link them to a cluster.
  3. Docker Registry can be scanned for new versions. If a new image appears, the Git repository and deployment are updated to the new version.
  4. Pull tools can be distributed under different namespaces with different Git repositories and access rights. Thanks to this, a multitenant model can be applied. For example, team A might use namespace A, team B might use namespace B, and an infrastructure team might use the global namespace.
  5. As a rule, the tools are very lightweight.
  6. Combined with tools such as the Bitnami Sealed Secrets operator, secrets can be stored encrypted in a Git repository and decrypted within the cluster.
  7. There is no connection to the CD pipelines because the deployments happen inside the cluster.
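
Point 3 above, scanning the registry and moving the deployment to a newer image, can be sketched as follows. This is an illustration under assumptions: the tag list is passed in directly instead of being fetched from a registry, only plain MAJOR.MINOR.PATCH tags are considered, and `newest_tag` and `bump_image` are hypothetical helpers rather than part of any tool.

```python
import re

SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def newest_tag(tags: list):
    """Return the highest MAJOR.MINOR.PATCH tag, or None if there is none."""
    versions = []
    for tag in tags:
        match = SEMVER.match(tag)
        if match:  # skip tags such as "latest" or "dev-1234"
            versions.append((tuple(int(p) for p in match.groups()), tag))
    return max(versions)[1] if versions else None


def bump_image(manifest: dict, tags: list) -> dict:
    """Return a copy of the manifest pointing at the newest registry tag."""
    repository, _, current = manifest["image"].rpartition(":")
    latest = newest_tag(tags)
    if latest is None or latest == current:
        return dict(manifest)
    return {**manifest, "image": f"{repository}:{latest}"}


manifest = {"name": "web", "image": "registry.example.com/web:1.9.0"}
tags = ["1.9.0", "latest", "1.10.0", "1.2.3"]
print(bump_image(manifest, tags)["image"])
```

Comparing versions as integer tuples rather than strings matters here: as text, "1.9.0" would sort after "1.10.0".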

Cons:

  1. Managing secrets for Helm chart deployments is harder than managing regular ones, since they first have to be generated in the form of, say, sealed secrets, then decrypted by an internal operator, and only then become available to the pull tool. After that, you can run the Helm release with the values from the already deployed secrets. The easiest way is to create a single secret with all the Helm values used for the deployment, encrypt it, and commit it to Git.
  2. With the pull approach, you are tied to the pull tools. This limits your ability to customize the deployment process in the cluster. For example, working with Kustomize is complicated by the fact that it must be run before the final templates arrive in Git. I'm not saying you can't use standalone tools, but they are harder to integrate into your deployment process.

Push-based approach

In the push approach, an external system (usually a CD pipeline) triggers deployments to the cluster after a commit to the Git repository or after a preceding CI pipeline succeeds. In this approach, the external system has access to the cluster.
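
A deployment step in such a pipeline might look roughly like the sketch below. The function names, the `deploy` directory, and the kubeconfig path are hypothetical; the only real interface used is `kubectl apply -f` together with the global `--kubeconfig` flag. The command is only built and printed, not executed.

```python
def should_deploy(ci_passed: bool, changed_files: list, manifest_dir: str) -> bool:
    """Deploy only after a green CI run whose commit touched the manifests."""
    prefix = manifest_dir.rstrip("/") + "/"
    return ci_passed and any(path.startswith(prefix) for path in changed_files)


def deploy_command(manifest_dir: str, kubeconfig: str) -> list:
    """Build the kubectl call.

    The kubeconfig holds the cluster credentials, which the CD system
    is assumed to inject into the job as a pipeline secret.
    """
    return ["kubectl", "--kubeconfig", kubeconfig, "apply", "-f", manifest_dir]


changed = ["deploy/web.yaml", "README.md"]
if should_deploy(ci_passed=True, changed_files=changed, manifest_dir="deploy"):
    command = deploy_command("deploy", "/secrets/kubeconfig")
    # A real pipeline would hand this list to subprocess.run(command, check=True).
    print(" ".join(command))
```

The sketch also shows where the credentials live: outside the cluster, inside the build system.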

Pros:

  1. Security is determined by the Git repository and build pipeline.
  2. Deploying Helm charts is easier, and Helm plugins are supported.
  3. Secrets are easier to manage because they can be used in pipelines and also stored encrypted in Git (depending on the user's preferences).
  4. No lock-in to a specific tool, since any type of deployment tool can be used.
  5. Container version updates can be triggered by the build pipeline.

Cons:

  1. The data for accessing the cluster is inside the build system.
  2. Updating container deployments is still easier with a pull process.
  3. Heavy dependence on the CD system: the pipelines may originally have been written for GitLab Runners, and if the team later decides to move to Azure DevOps or Jenkins, a lot of build pipelines will have to be migrated.

Results: Push or Pull?

As usual, each approach has its pros and cons. Some tasks are easier with one and harder with the other. At first, I deployed manually, but after I came across a few articles about Weave Flux, I decided to implement GitOps processes for all projects. For basic templates, this was easy, but then I started to run into difficulties with Helm charts. At the time, Weave Flux offered only a rudimentary version of the Helm Chart Operator, and even now some tasks are harder because secrets have to be created and applied manually. One could argue that the pull approach is much more secure, since the cluster credentials are not accessible outside the cluster, and that this gain in security is worth the extra effort.

After some thought, I came to the unexpected conclusion that this is not the case. The components that require maximum protection include secret stores, CI/CD systems, and Git repositories. The information inside them is highly sensitive and needs maximum protection. In addition, if someone breaks into your Git repository and can push code there, they can deploy whatever they want (regardless of the chosen approach, pull or push) and infiltrate the cluster's systems. Thus, the most important components to protect are the Git repository and the CI/CD systems, not the cluster credentials. If you have well-configured policies and security measures for these systems, and cluster credentials are only injected into the pipelines as secrets, the extra security of the pull approach may not be as valuable as originally thought.

So, if the pull approach is more labor-intensive and does not provide a security benefit, wouldn't it make more sense to just use the push approach? Then again, someone may argue that the push approach ties you too closely to the CD system, and that it may be better to avoid this to make a future migration easier.

In my opinion (as always), you should use whatever suits the particular case best, or combine the two. Personally, I use both approaches: Weave Flux for pull-based deployments, which mostly cover our own services, and the push approach with Helm and plugins, which makes it easy to apply Helm charts to a cluster and to create secrets without problems. I think there will never be a single solution that fits all cases, because there are always many nuances and they depend on the specific application. At the same time, I highly recommend GitOps: it greatly simplifies life and improves security.

I hope my experience on this topic will help you decide which method is more suitable for your type of deployment, and I will be glad to hear your opinion.

P.S. from the translator

Among the cons of the pull model, the author notes that it is hard to put rendered manifests into Git, but does not mention that the CD pipeline in the pull model lives separately from the rollout and essentially turns into a Continuous Apply kind of pipeline. As a result, even more effort is needed to collect the status of all deployments and to provide access to logs and status, preferably tied back to the CD system.

In this sense, the push model allows you to give at least some rollout guarantees, because the pipeline lifetime can be made equal to the rollout lifetime.
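
Making the pipeline lifetime equal to the rollout lifetime means that the deployment job blocks until the rollout has either finished or timed out. A minimal sketch, assuming a hypothetical `get_ready` callback that reports the number of ready replicas (in practice this role is played by a tool that watches the rollout):

```python
import time


def wait_for_rollout(get_ready, desired: int, timeout: float,
                     interval: float = 0.0) -> bool:
    """Poll until `desired` replicas are ready; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if get_ready() >= desired:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated readings: the rollout becomes ready on the third poll.
readings = iter([0, 1, 3])
ok = wait_for_rollout(lambda: next(readings), desired=3, timeout=5.0)
print("rollout succeeded" if ok else "rollout timed out")
```

The pipeline job would exit with a failure status when the function returns False, which ties the job result to the outcome of the rollout.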

We tried both models and came to the same conclusions as the author of the article:

  1. The pull model suits us for updating system components on a large number of clusters (see the article about addon-operator).
  2. The push model based on GitLab CI is well suited for rolling out applications with Helm charts. The rollout of deployments within pipelines is tracked with the werf tool. Incidentally, it was in the context of this project of ours that we kept hearing "GitOps" when discussing the pressing problems of DevOps engineers at our booth at KubeCon Europe'19.

Source: habr.com