Securing Helm

The essence of this story about the most popular package manager for Kubernetes could be depicted with emoji:

  • a box is Helm (the closest thing available in the latest Emoji release);
  • a lock is security;
  • a man is the solution to the problem.


In reality, everything is a little more complicated, and the story is full of technical details about how to make Helm secure.

  • Briefly, what Helm is, in case you did not know or forgot: what problems it solves and where it sits in the ecosystem.
  • The Helm architecture. No conversation about security, or about making a tool or solution more secure, is complete without understanding the architecture of the component.
  • The Helm components.
  • The most burning question, the future: the new version, Helm 3.

Everything in this article applies to Helm 2. This version is currently in production, it is most likely the one you are using right now, and it is the one that carries security risks.


About the speaker: Alexander Khayorov (allexx) has been developing software for 10 years, helps improve the content of Moscow Python Conf++, and joined the Helm Summit committee. He now works at Chainstack as a development lead: a hybrid of a development manager and a person responsible for delivering final releases. In other words, he sits right where the action happens, from the creation of a product to its operation.

Chainstack is a small, actively growing startup whose mission is to let customers forget about the infrastructure and complexities of operating decentralized applications; the development team is located in Singapore. Don't ask Chainstack to sell or buy cryptocurrency, but do offer to talk about enterprise blockchain frameworks, and they will gladly answer you.

Helm

Helm is a package (chart) manager for Kubernetes: the clearest and most versatile way to bring applications into a Kubernetes cluster.


This is, of course, a more structured and industrial approach than writing your own YAML manifests and small utilities.

Helm is the best that is now available and popular.

Why Helm? Primarily because it is backed by CNCF. The Cloud Native Computing Foundation is a large organization and is the parent home for Kubernetes, etcd, Fluentd, and others.

Another important fact is that Helm is a very popular project. When I first thought of talking about how to make Helm secure, in January 2019, the project had 11 thousand stars on GitHub. By May there were 12 thousand.

Many people are interested in Helm, so even if you don't use it yet, knowing about its security will be useful to you. Safety is important.

Helm's core team is backed by Microsoft Azure, so it's a fairly stable project, unlike many others. The release of Helm 3 Alpha 2 in mid-July shows that quite a lot of people are working on the project, and they have the desire and energy to develop and improve Helm.


Helm addresses several root application management problems in Kubernetes.

  • Application packaging. Even a "Hello, World"-level application on WordPress already consists of several services, and you want to package them together.
  • Managing the complexity that comes with managing these applications.
  • A life cycle that does not end after the application is installed or deployed. The application continues to live, it needs to be updated, and Helm helps with this and tries to bring the right measures and policies to it.

Packaging is arranged in a familiar way: there is metadata, fully in line with how a conventional package manager works on Linux, Windows, or macOS. That is, repositories, dependencies on other packages, meta-information for applications, settings, configuration features, indexing information, and so on. Helm lets you obtain and use all of this for applications.

Complexity management. If you have many applications of the same type, then parameterization is needed. Templates follow from this, but in order not to come up with your own way of creating templates, you can use what Helm offers out of the box.

Application lifecycle management is, in my opinion, the most interesting and unresolved issue. It is why I came to Helm in the first place: we needed to keep track of the application life cycle, and we wanted to move our CI/CD and application cycles into this paradigm.

Helm allows you to:

  • manage deployments, introduces the concept of configuration and revision;
  • successfully carry out rollback;
  • use hooks for different events;
  • add additional application checks and respond to their results.

Moreover, Helm comes with "batteries": a huge number of goodies that can be added in the form of plugins, simplifying your life. Plugins can be written independently; they are quite isolated and do not require a coherent architecture. If you want to implement something, I recommend doing it as a plugin first, and then possibly getting it included upstream.

Helm is based on three main concepts:

  • Chart: the description and the set of parameterizable manifests of your application.
  • Config: the values to be applied (text, numerical values, etc.).
  • Release: the combination of the two components above. Releases can be versioned, which is how the life cycle is organized: a small revision number at install time that grows with every upgrade, downgrade, or rollback.
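To make the Chart and Config concepts concrete, here is a minimal sketch of the two key files of a chart. The names and values are invented for illustration and are not from the talk:

```yaml
# Chart.yaml -- the chart's metadata, just like any package manager's manifest
apiVersion: v1
name: hello-world          # example chart name
version: 0.1.0             # chart version, used for release revisions
description: A minimal example chart
---
# values.yaml -- the Config: default values substituted into the templates
replicaCount: 2
image:
  repository: nginx        # example image
  tag: "1.17"
```

A `helm install` of this chart combines these two pieces into a Release, which is what gets versioned and rolled back.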

Helm architecture

The diagram conceptually depicts the high-level architecture of Helm.


Let me remind you that Helm is all about Kubernetes, so we cannot do without a Kubernetes cluster (the rectangle). The kube-apiserver component sits on the master. Besides Helm, we have Kubeconfig. Helm brings one small binary, if you can call it that: the Helm CLI utility, which is installed on a computer, laptop, mainframe, anything.

But this is not enough. Helm has a server component called Tiller. It represents Helm inside the cluster and is an application within the Kubernetes cluster like any other.

The next Chart Repo component is a repository with charts. There is an official repository, and there may be a private repository of a company or project.

Interaction

Let's look at how the architecture components interact when we want to install an application using Helm.

  • We say helm install; the utility accesses the repository (Chart Repo) and fetches the Helm chart.

  • The Helm utility (Helm CLI) interacts with Kubeconfig to figure out which cluster to refer to. 
  • Having received this information, the utility addresses Tiller, which is located in our cluster, already as an application. 
  • Tiller calls Kube-apiserver to perform actions in Kubernetes, create some objects (services, pods, replicas, secrets, etc.).
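The flow above, expressed as the actual commands (the release name and namespace are examples; the commands need a reachable cluster and Tiller, so this is a sketch, not something to run blindly):

```shell
# Add the official chart repository and refresh the local index
helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm repo update

# The CLI reads Kubeconfig, talks to Tiller, and Tiller asks
# kube-apiserver to create the services, pods, secrets, etc.
helm install stable/wordpress --name my-blog --namespace web

# Inspect the resulting release and its revision history
helm ls
helm history my-blog
```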

Next, we will complicate the diagram to see the attack vectors the Helm architecture as a whole can be exposed to. And then we'll try to protect it.

Attack vector

The first potential weakness is a privileged API user. In the scheme, this is a hacker who has gained admin access to the Helm CLI.

An unprivileged API user can also be a danger if it is somewhere nearby. Such a user will have a different context; for example, it may be pinned to a single cluster namespace in the Kubeconfig settings.

The most interesting attack vector may be a process that is inside the cluster somewhere near Tiller and can access it. This can be a web server or a microservice that sees the network environment of the cluster.

An exotic, but increasingly popular, attack variant is related to the Chart Repo. A chart created by an unscrupulous author may contain an unsafe resource that you deploy on faith. Or an attacker may substitute the chart you download from the official repository and, for example, create a resource in the form of policies and escalate access for himself.


Let's try to fight off attacks from all these four sides and figure out where there are problems in the Helm architecture, and where, perhaps, they are not.

Let's enlarge the scheme, add more elements, but keep all the basic components.


Helm CLI communicates with Chart Repo, interacts with Kubeconfig, work is transferred to the cluster to the Tiller component.

Tiller is represented by two objects:

  • tiller-deploy svc, which exposes the service;
  • tiller-deploy pod (shown in the diagram as a single replica), which runs the entire workload and accesses the cluster.

For interaction, different protocols and schemes are used. From a security point of view, we are most interested in:

  • The mechanism by which the Helm CLI accesses the Chart Repo: what the protocol is, whether there is authentication, and what can be done about it.
  • The protocol by which the Helm CLI, using Kubeconfig, communicates with Tiller. Tiller is a gRPC server installed inside the cluster.
  • Tiller itself, which is reachable by microservices in the cluster and which interacts with Kube-apiserver.


Let's discuss all these areas in order.

RBAC

It's useless to talk about any security for Helm or any other service within the cluster unless RBAC is enabled.

This may not seem like the freshest recommendation, but I'm sure many people still haven't enabled RBAC even in production, because it's a lot of fuss and there is a lot to configure. Nevertheless, I urge you to do so.


https://rbac.dev/ is an advocacy site for RBAC. It collects a huge amount of interesting material that will help you set up RBAC and shows why it is good and how to live with it in production.

I will try to explain how Tiller and RBAC work together. Tiller runs inside the cluster under some service account. Typically, if RBAC is not configured, this will be the superuser: in the basic configuration, Tiller is the admin. That is why it is often said that Tiller is an SSH tunnel into your cluster. This is in fact the case, so you can use a separate, specialized service account instead of the Default Service Account in the diagram above.

When you initialize Helm (the first time you install it into a cluster), you can set the service account with --service-account. This lets you use a user with the minimum required set of rights. True, you will have to create such a "garland": a Role and a RoleBinding.


Unfortunately, Helm won't do this for you. You or your Kubernetes cluster administrator must prepare the set of Role and RoleBinding objects for the service account in advance to hand over to Helm.

The question arises - what is the difference between Role and ClusterRole? The difference is that ClusterRole works for all namespaces, as opposed to regular Role and RoleBinding, which only work for specialized namespaces. You can configure policies for the entire cluster and all namespaces, as well as personalized for each namespace individually.
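A sketch of that "garland" for a namespace-scoped Tiller. The namespace and object names are examples, and the rules are deliberately broad for brevity; in a real setup you would trim the apiGroups, resources, and verbs to the minimum your charts need:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: tiller-world
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tiller-manager
  namespace: tiller-world
rules:
- apiGroups: ["", "apps", "extensions"]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tiller-binding
  namespace: tiller-world
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tiller-manager
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: tiller-world
```

With these objects in place, Tiller is installed with `helm init --service-account tiller --tiller-namespace tiller-world`, and it can no longer touch anything outside its namespace.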

It is worth mentioning that RBAC solves another big problem. Many complain that Helm, unfortunately, is not multi-tenant. If several teams share a cluster and use Helm, it is impossible in principle to set up policies and differentiate their access within the cluster, because there is a single service account under which Helm runs and from which it creates all resources in the cluster, which is sometimes very inconvenient. This is true: as a binary, as a process, Helm's Tiller has no notion of multitenancy.

However, there is a great way to deal with this: run Tiller multiple times in a cluster. There is no problem with that; Tiller can run in every namespace. Thus, you can use RBAC and Kubeconfig contexts, and restrict access to a dedicated Tiller.

It will look like this.


For example, there are two Kubeconfigs with contexts for different teams (two namespaces): one for the X development team and one for the cluster admins. The admins have their own broad Tiller, located in the kube-system namespace with a correspondingly advanced service account, while the development team has a separate namespace into which they can deploy their services.

This is a working approach; Tiller is not so resource-hungry that it will noticeably affect your budget. It is one of the quick fixes.

Feel free to configure Tiller separately and provide Kubeconfig with context for a team, for a specific developer, or for the environment: Dev, Staging, Production (it is doubtful that everything will be on the same cluster, however, it can be done).
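The per-team setup can be sketched like this. The namespace and account names are examples, and the Role/RoleBinding scoped to the team namespace must already exist:

```shell
# A dedicated namespace and service account for the team
kubectl create namespace team-x
kubectl create serviceaccount tiller --namespace team-x
# (plus a Role and RoleBinding scoped to team-x for that account)

# Install a Tiller instance that lives only in the team's namespace
helm init --service-account tiller --tiller-namespace team-x

# The team then targets its own Tiller explicitly
helm --tiller-namespace team-x install ./mychart --name my-release
```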

Continuing our story, let's switch from RBAC and talk about ConfigMaps.

ConfigMaps

Helm uses ConfigMaps as its data store. When we discussed the architecture, there was no database anywhere to hold information about releases, configurations, rollbacks, and so on. ConfigMaps are used for this.

The main problem with ConfigMaps is well known: they are not safe in principle and cannot store sensitive data, meaning anything that should not leak beyond the service, such as passwords. The most native way for Helm right now is to move from ConfigMaps to secrets.

This is done very simply: override the Tiller setting and specify secrets as the storage. Then for each deployment you will get a secret instead of a ConfigMap.
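The override mentioned above looks like this when Tiller is (re)installed; afterwards release data can be found among secrets rather than ConfigMaps:

```shell
# Deploy Tiller with the secret storage backend instead of ConfigMaps
helm init \
  --override 'spec.template.spec.containers[0].command'='{/tiller,--storage=secret}'

# Each release revision now lands in a secret labeled by Tiller
kubectl get secrets --namespace kube-system -l OWNER=TILLER
```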


You might argue that secrets themselves are a strange concept and not very secure. However, it should be understood that the Kubernetes developers themselves are working on this. Since version 1.10, i.e. for quite a while, it has been possible, at least in the public clouds, to connect proper storage for secrets. The team is now working on distributing access to secrets to individual pods and other entities even more precisely.

It is better to switch Helm's storage to secrets, and then secure the secrets themselves centrally.

Of course, the 1 MB storage limit will remain. Helm here relies on etcd, the distributed store behind ConfigMaps, and the etcd developers decided that this is a suitable chunk size for replication and so on. There is an interesting discussion about this on Reddit; I recommend it as funny weekend reading.

Chart Repos

Charts are the most socially vulnerable spot and can become the source of a "man in the middle" attack, especially if you use a stock solution. First of all, this concerns repositories exposed over HTTP.

You definitely need to expose your Helm Repo over HTTPS: it is the best option and is inexpensive.

Also pay attention to the chart signing mechanism. The technology is embarrassingly simple: it's the same thing you use with GitHub, a normal PGP machine with public and private keys. Set it up, and, having the right keys and signing everything, you can be sure the chart really is yours.
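In terms of commands, signing and verification look roughly like this (the key name and chart name are examples; you need a PGP key pair in your keyring):

```shell
# Sign the chart at packaging time with your PGP key
helm package --sign --key 'Alexander' \
  --keyring ~/.gnupg/secring.gpg ./mychart
# The signature is stored next to the archive as mychart-0.1.0.tgz.prov

# Consumers verify the provenance before (or while) installing
helm verify mychart-0.1.0.tgz
helm install --verify mychart-0.1.0.tgz
```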

Additionally, the Helm client supports TLS (not in the sense of server-side HTTPS, but mutual TLS). You can use server and client keys for communication. To be honest, I don't use this mechanism because of my dislike for mutual certificates. Basically chartmuseum, the main tool for hosting a Helm Repo for Helm 2, also supports basic auth. You can use basic auth if that is more convenient and calmer.
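A basic-auth setup with chartmuseum is a few flags on each side. The credentials, port, and repo URL here are placeholders:

```shell
# Run chartmuseum with basic auth enabled (credentials are examples)
chartmuseum --port=8080 \
  --storage="local" --storage-local-rootdir=./charts \
  --basic-auth-user=helm --basic-auth-pass=changeme

# Clients pass the same credentials when adding the repo
helm repo add myrepo https://charts.example.com \
  --username helm --password changeme
```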

There is also a plugin helm-gcs, which allows you to host Chart Repos on Google Cloud Storage. This is quite convenient, works great and is safe enough, because all the described mechanisms are utilized.


If you enable HTTPS or TLS, use mTLS, enable basic auth to further reduce the risks, you will get a secure communication channel for Helm CLI and Chart Repo.

gRPC API

The next step is a crucial one: securing Tiller, which lives in the cluster and is, on the one hand, a server, and on the other hand a client that accesses other components on someone's behalf.

As I said, Tiller is a service that exposes gRPC, and the Helm client talks to it over gRPC. By default, of course, TLS is disabled. Why this is so is a debatable question; it seems to me it is to simplify the setup at the start.

For production and even for staging, I recommend enabling TLS on gRPC.

In my opinion, unlike mutual TLS for charts, it is appropriate here and is done very simply: generate a PKI infrastructure, create a certificate, start Tiller, passing it the certificate during initialization. After that, you can execute all Helm commands, presenting the generated certificate and private key.
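Assuming you have already generated a CA plus server and client certificate/key pairs (the file names here are placeholders), the setup is:

```shell
# Install Tiller with TLS enabled and client-certificate verification
helm init \
  --tiller-tls --tiller-tls-verify \
  --tiller-tls-cert tiller.cert.pem --tiller-tls-key tiller.key.pem \
  --tls-ca-cert ca.cert.pem

# Every client call must now present its certificate and key
helm ls --tls \
  --tls-cert helm.cert.pem --tls-key helm.key.pem \
  --tls-ca-cert ca.cert.pem
```

With `--tiller-tls-verify`, Tiller rejects any connection that does not present a certificate signed by the CA, which is exactly what shuts out a rogue process inside the cluster.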


Thus, you will protect yourself from all requests to Tiller from outside the cluster.

So, we have secured the connection channel to Tiller, have already discussed RBAC and adjusted the rights of the Kubernetes apiserver, reduced the domain with which it can interact.

Protected Helm

Let's look at the final diagram. It's the same architecture with the same arrows.


All connections can now be safely drawn in green:

  • for the Chart Repo we use TLS or mTLS and basic auth;
  • for Tiller, which is exposed as a gRPC service, we use TLS with certificates;
  • the cluster uses a dedicated service account with a Role and RoleBinding.

We noticeably secured the cluster, but someone smart said:

β€œThere can be only one absolutely safe solution - a switched off computer, which is located in a concrete box and is guarded by soldiers.”

There are different ways to manipulate data and find new attack vectors. However, I am confident that these recommendations will allow you to implement the basic industry standard for security.

bonus

This part is not directly related to security, but it will be useful too. I'll show a few interesting things that few people know about: for example, how to search for charts, official and unofficial.

The repository github.com/helm/charts currently holds about 300 charts in two streams: stable and incubator. Anyone who contributes knows how hard it is to get from incubator into stable, and how easy it is to fly out of stable. However, this is not the best tool for finding charts for Prometheus or whatever else you like, for one simple reason: it is not a convenient portal for searching packages.

But there is the service hub.helm.sh, which makes finding charts much more convenient. Most importantly, it indexes many more external repositories and offers almost 800 charts. Plus, you can plug in your own repository if for some reason you don't want to submit your charts to stable.

Try hub.helm.sh and let's develop it together. The service is under the Helm project, and you can even contribute to its UI if you're a frontend developer and just want to improve the look.

I also want to draw your attention to Open Service Broker API integration. It sounds cumbersome and incomprehensible, but it solves the problems that everyone faces. Let me explain with a simple example.


We have a Kubernetes cluster where we want to run a classic WordPress application. As a rule, a database is needed for full functionality. There are many different solutions; for example, you can run your own stateful service. This is not very convenient, but many people do it.

Others, like us at Chainstack, use managed databases, such as MySQL or PostgreSQL as a service. That is why our databases live somewhere in the cloud.

But a problem arises: we need to connect our service with the database, create the right flavor of database, pass the credentials around, and somehow manage all of that. All of this is usually done manually by a system administrator or developer. There is no problem when there are few applications; when there are many, you need a combine. There is such a combine: the Service Broker. It lets you use a special plugin for a public cloud cluster and order resources from the provider through the Broker, as if it were an API. You can use native Kubernetes tools for this.

It's very simple. You can request, for example, a managed MySQL on Azure at the basic tier (this is configurable). Using the Azure API, the database will be created and made ready for use. You do not need to intervene; the plugin is responsible for that. For example, OSBA (the Azure plugin) will return the credentials to the service and pass them to Helm. You will be able to use WordPress with cloud MySQL without dealing with managed databases at all, and without worrying about stateful services inside.
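As a rough sketch of how ordering a database through the broker looks with native Kubernetes objects: this assumes the Service Catalog and OSBA are installed in the cluster, and the class/plan names vary by broker version, so everything below is illustrative rather than copy-pasteable:

```yaml
# Ask the broker to provision a managed MySQL instance
apiVersion: servicecatalog.k8s.io/v1beta1
kind: ServiceInstance
metadata:
  name: wordpress-mysql
  namespace: web
spec:
  clusterServiceClassExternalName: azure-mysql-5-7   # example class name
  clusterServicePlanExternalName: basic              # example plan name
---
# Bind to the instance; credentials land in the named secret,
# which the WordPress chart can then consume
apiVersion: servicecatalog.k8s.io/v1beta1
kind: ServiceBinding
metadata:
  name: wordpress-mysql-binding
  namespace: web
spec:
  instanceRef:
    name: wordpress-mysql
  secretName: wordpress-mysql-secret
```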

We can say that Helm acts as a glue that, on the one hand, allows you to deploy services, and on the other hand, consumes the resources of cloud providers.

You can write your own plugin and use this whole story on-premises: you will then simply have a plugin for your corporate cloud provider. I advise trying this approach, especially if you operate at a large scale and want to quickly spin up dev, staging, or the entire infrastructure for a feature. This will make life easier for your operations or DevOps people.

Another find I have already mentioned is the helm-gcs plugin, which lets you use Google buckets (object storage) to store Helm charts.


You only need four commands to start using it:

  1. install the plugin;
  2. initialize it;
  3. set the path to the bucket in GCP;
  4. publish charts in the standard way.
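The four steps above map to four commands; the bucket and repo names are examples:

```shell
# 1. Install the plugin
helm plugin install https://github.com/hayorov/helm-gcs
# 2. Initialize the bucket as a chart repository
helm gcs init gs://my-charts-bucket
# 3. Register the bucket as a repo for the Helm client
helm repo add my-gcs-repo gs://my-charts-bucket
# 4. Publish a packaged chart to it
helm gcs push mychart-0.1.0.tgz my-gcs-repo
```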

The beauty is that the native GCP authorization method is used. You can use a service account, a developer account, whatever. It is very convenient and costs nothing to operate. If, like me, you promote an opsless philosophy, this will be very convenient, especially for small teams.

Alternatives

Helm is not the only service-management solution. There are many open questions about it, which is probably why the third version appeared so quickly. And of course there are alternatives.

These can be specialized solutions such as Ksonnet or Metaparticle. You can also use classic infrastructure-management tools (Ansible, Terraform, Chef, etc.) for the same purposes that I mentioned.

Finally, there is the Operator Framework, which is growing in popularity.

The Operator Framework is the main Helm alternative to look out for.

It is more native to CNCF and Kubernetes, but the barrier to entry is much higher, you need to code more and describe manifests less.

There are various add-ons, such as Draft and Skaffold. They make life much easier; for example, they simplify the cycle of sending and running Helm to deploy a test environment for developers. I would call them empowerers.

Here's a visual graph of where everything is.


The abscissa is the level of your personal control over what is happening; the ordinate is the level of Kubernetes nativeness. Helm version 2 sits somewhere in between. Version 3 is not a huge leap, but both the control and the level of nativeness improve. Solutions at the Ksonnet level are still inferior even to Helm 2, but they are worth a look to know what else is out there in this world. Your configuration manager, of course, will be fully under your control, but it is absolutely not native to Kubernetes.

The Operator Framework is absolutely native to Kubernetes and lets you manage it far more elegantly and scrupulously (but keep in mind the barrier to entry). It is better suited for a specialized application where you build dedicated management for it, rather than as a mass combine for packaging a huge number of applications, as Helm is.

Extenders just slightly improve control, complement workflow, or cut corners on CI/CD pipelines.

The Future of Helm

The good news is that Helm 3 is coming. The alpha version, Helm 3.0.0-alpha.2, has already been released, and you can try it. It is quite stable, but the functionality is still limited.

Why do you need Helm 3? First of all, it is the story of Tiller's disappearance as a component. This, as you already understand, is a huge step forward, because from the point of view of the security of the architecture, everything gets simpler.

When Helm 2 was created, during the days of Kubernetes 1.8 or even earlier, many of the concepts were immature. For example, the CRD concept is now being actively adopted, and Helm will use CRDs to store structures. It will become possible to use only the client and keep no server-side part, and accordingly to use native Kubernetes commands to work with structures and resources. That is a huge step forward.

Next comes support for native OCI (Open Container Initiative) repositories. This is a huge initiative, and Helm is interested primarily as a place to publish charts. It gets to the point that, for example, Docker Hub supports many OCI standards. I'm not guessing, but perhaps the classic Docker-repository providers will start letting you host your Helm charts.

A controversial point for me is Lua support as a templating engine for writing scripts. I'm not a big fan of Lua, but it will be completely optional. I checked three times: using Lua will not be mandatory. So whoever wants to will be able to use Lua; whoever likes Go, join our huge camp and keep using go-tmpl.

Finally, what I definitely missed is the appearance of a schema and data-type validation. There will be no more problems with int versus string, no need to wrap zero in double quotes. A JSON Schema will appear that lets you describe this explicitly for values.

The event-driven model will be heavily reworked. It has already been conceptualized: look at the Helm 3 branch and you will see how many events, hooks, and other things have been added, which greatly simplifies things and, on the other hand, adds control over deployment processes and reactions to them.

Helm 3 will be simpler, safer, and more interesting, not because we don't like Helm 2, but because Kubernetes is getting more advanced. Accordingly, Helm can use Kubernetes's developments and build excellent managers for Kubernetes on top of them.

One more piece of good news: at DevOpsConf, Alexander Khayorov will talk about whether containers can be safe. Recall that the conference on integrating development, testing, and operations processes will be held in Moscow on September 30 and October 1. Until August 20 you can still submit a talk and share your experience with one of the many challenges of the DevOps approach.

Follow the conference checkpoints and news in the mailing list and the telegram channel.

Source: habr.com
