Kubecost Overview: Saving Money on Kubernetes in the Cloud

Today, more and more companies are moving their infrastructure from bare-metal servers and their own virtual machines to the cloud. The decision is easy to explain: there is no need to take care of the hardware, the cluster is easily configured in many different ways and, most importantly, the available technologies (like Kubernetes) make it easy to scale computing power depending on the load.

The financial aspect is always important. The tool discussed in this article is designed to help reduce spending on cloud infrastructure with Kubernetes.

Introduction

Kubecost is a California-based startup founded by former Google employees. It builds a solution for calculating the cost of infrastructure in cloud services (inside a Kubernetes cluster, plus shared resources), finding bottlenecks in the cluster settings and sending the corresponding notifications to Slack.

We have clients running Kubernetes in the familiar AWS and GCP clouds and in Azure, which is rarer in the Linux community; in other words, on all the platforms Kubecost supports. For some of them we calculate the costs of in-cluster services ourselves (using a methodology similar to Kubecost's), and we also monitor infrastructure costs and try to optimize them. So it is only logical that we were interested in automating such tasks.

The source code of the main Kubecost module is open under an open source license (Apache License 2.0). It can be used freely, and the available features should be enough for small projects. However, business is business: the rest of the product is closed and available through paid subscriptions, which also include commercial support. In addition, the authors offer a free license for small clusters (1 cluster with 10 nodes; at the time of writing this limit had been raised to 20 nodes) or a one-month trial with full capabilities.

How it all works

So, the main part of Kubecost is the cost-model application, written in Go. The Helm chart that describes the entire system is called cost-analyzer; at its core it is a bundle of cost-model with Prometheus, Grafana and several dashboards.

Generally speaking, cost-model has its own web interface that shows charts and detailed cost statistics in tabular form and, of course, cost optimization tips. The Grafana dashboards come from an earlier stage of Kubecost's development and contain largely the same data as cost-model, supplemented with the usual statistics on CPU, memory, network and disk consumption in the cluster and its components.

How does Kubecost work?

  • Cost-model gets service prices via the cloud provider's API.
  • Next, the cost of each node is calculated from its hardware (instance) type and region.
  • Based on the cost of the running nodes, each individual pod is assigned a cost per hour of CPU usage, per GB of memory used and per GB of data stored, depending on the node it ran on or on its storage class.
  • Based on the cost of individual pods, the costs of namespaces, services, Deployments and StatefulSets are calculated.
  • The statistics are computed from the metrics provided by kube-state-metrics and node-exporter.
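
The steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Kubecost's actual code: the prices, the Pod structure and the function names are hypothetical, and the sketch allocates costs by requested CPU and memory only, ignoring storage, network and idle capacity.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical hourly prices for one node type in one region (assumed values;
# in a real setup they would come from the cloud provider's pricing API).
CPU_PRICE_PER_CORE_HOUR = 0.03   # USD per vCPU per hour
RAM_PRICE_PER_GB_HOUR = 0.004    # USD per GB of memory per hour


@dataclass
class Pod:
    name: str
    namespace: str
    cpu_cores: float   # CPU requested by the pod
    ram_gb: float      # memory requested by the pod
    hours: float       # how long the pod has been running


def pod_cost(pod: Pod) -> float:
    """Cost of a single pod: resources multiplied by price and run time."""
    hourly = (pod.cpu_cores * CPU_PRICE_PER_CORE_HOUR
              + pod.ram_gb * RAM_PRICE_PER_GB_HOUR)
    return hourly * pod.hours


def cost_by_namespace(pods: List[Pod]) -> Dict[str, float]:
    """Roll individual pod costs up to the namespace level."""
    totals = defaultdict(float)
    for pod in pods:
        totals[pod.namespace] += pod_cost(pod)
    return dict(totals)


pods = [
    Pod("api-1", "prod", cpu_cores=2.0, ram_gb=4.0, hours=24),
    Pod("api-2", "prod", cpu_cores=2.0, ram_gb=4.0, hours=24),
    Pod("ci-runner", "dev", cpu_cores=1.0, ram_gb=2.0, hours=8),
]

for namespace, cost in sorted(cost_by_namespace(pods).items()):
    print(f"{namespace}: ${cost:.2f}")
```

The same roll-up works for any other grouping key (a label, a Deployment name) by replacing the namespace field in the aggregation.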

It is important to keep in mind that by default Kubecost counts only the resources available in Kubernetes. External databases, GitLab servers, S3 storage and other services outside the cluster (even if they live in the same cloud) are invisible to it. For GCP and AWS, however, you can add your service account keys and have everything calculated together.

Installation

Kubecost requires:

  • Kubernetes version 1.8 and above;
  • kube-state-metrics;
  • Prometheus;
  • node-exporter.

In our clusters all these requirements were already met, so it was enough to specify the correct endpoint for accessing Prometheus. However, the official kubecost Helm chart contains everything you need to run on a bare cluster.

There are several ways to install Kubecost:

  1. The standard installation method, described in the instructions on the developer's website. Add the cost-analyzer repository to Helm, then install the chart. After that, all that remains is to forward the port to your machine and finish the configuration manually (via kubectl) and/or through the cost-model web interface.

    We didn’t even try this method, since we don’t use third-party ready-made configurations, but it looks like a good option for a quick trial. If some of the system components are already installed, or you want to fine-tune the setup, consider the second way.

  2. Use essentially the same chart, but configure and install it yourself in whatever way is convenient.

    As already mentioned, in addition to kubecost itself this chart contains the Grafana and Prometheus charts, which can also be customized as desired.

    The values.yaml provided with the cost-analyzer chart allows you to configure:

    • a list of cost-analyzer components that need to be deployed;
    • your endpoint for Prometheus (if you already have one);
    • domains and other ingress settings for cost-model and Grafana;
    • annotations for pods;
    • whether persistent storage is used, and its size.

    For a complete list of available configuration options with descriptions, see the documentation.

    Since the basic version of kubecost cannot restrict access, you will need to configure basic-auth for the web panel right away.

  3. Install only the core of the system: cost-model. To do this, you need to have Prometheus installed in the cluster and specify its address in the prometheusEndpoint variable for Helm. After that, apply the set of YAML configurations to the cluster.

    Again, you will have to add an Ingress with basic-auth manually. Finally, you need to add a section for collecting cost-model metrics to extraScrapeConfigs in the Prometheus config:

    - job_name: kubecost
      honor_labels: true
      scrape_interval: 1m
      scrape_timeout: 10s
      metrics_path: /metrics
      scheme: http
      dns_sd_configs:
      - names:
        - <address of your kubecost service>
        type: 'A'
        port: 9003

What do we get?

With a full installation, we get the kubecost web panel and Grafana with a set of dashboards.

The Total cost shown on the main screen is actually the estimated cost of resources for a month. It is a projected price: what the cluster will cost per month at the current level of resource consumption.
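
A projection of this kind is simple arithmetic. The sketch below assumes the monthly figure is the current hourly cost multiplied by an average month of 730 hours; both the function name and the exact formula are assumptions made for illustration, not Kubecost's documented behavior.

```python
HOURS_PER_MONTH = 730  # average month: 365 days * 24 hours / 12 months


def projected_monthly_cost(current_hourly_cost: float) -> float:
    """Extrapolate the current hourly spend to a full month."""
    return current_hourly_cost * HOURS_PER_MONTH


# A cluster that currently costs 1.25 USD per hour (hypothetical value).
print(f"${projected_monthly_cost(1.25):.2f}")
```

This is why the number changes whenever the load changes: it describes the present rate of spending, not the money already spent.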

This metric is meant more for cost analysis and optimization. Kubecost is not a convenient place to look up the total spend for a particular month, say July: for that you will have to go to your provider's billing. But you can see costs broken down by namespace, label and pod for 1/2/7/30/90 days, which billing will never show you.

Speaking of labels. You should go to the settings right away and set the names of the labels that will be used as additional categories for grouping costs. You can attach any labels to these categories, which is convenient if you already have your own labeling system.

You can also change the address of the API endpoint that cost-model connects to, set the size of your GCP discount, and set your own resource prices and the currency they are measured in (for some reason, this feature does not affect the Total cost).

Kubecost can display various problems in the cluster (and even alert in case of danger). Unfortunately, this option is not configurable, so if you have environments for developers and they are in use, you will constantly see these problem notices.

An important tool is Cluster Savings. It measures pod activity (resource consumption, including network) and estimates how much money can be saved and where.

The optimization tips may seem pretty obvious, but experience suggests that they are still worth a look. In particular, the tool monitors the network activity of pods (Kubecost suggests paying attention to inactive ones), compares requested and actual memory and CPU consumption, looks at the CPU used by the cluster nodes (and offers to consolidate several nodes into one), checks disk load and tracks a couple of dozen more parameters.
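
The comparison of requested and actual consumption is easy to reproduce by hand. The sketch below is a hypothetical illustration, not Kubecost's algorithm: the Workload structure, the 1.5x headroom factor and the sample numbers are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Workload:
    name: str
    cpu_requested: float  # cores requested in the pod spec
    cpu_used: float       # cores actually used, e.g. a peak over the period


def overprovisioned(workloads: List[Workload],
                    headroom: float = 1.5) -> List[Tuple[str, float]]:
    """Return (name, suggested request) for workloads that request
    more CPU than their observed usage plus the given headroom."""
    result = []
    for w in workloads:
        suggested = round(w.cpu_used * headroom, 2)
        if w.cpu_requested > suggested:
            result.append((w.name, suggested))
    return result


workloads = [
    Workload("frontend", cpu_requested=2.0, cpu_used=0.4),
    Workload("worker", cpu_requested=1.0, cpu_used=0.9),
]

print(overprovisioned(workloads))
```

Here only frontend is flagged: it requests 2 cores while using 0.4, whereas worker already runs close to its request.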

As with any optimization question, resource optimization based on Kubecost data should be treated with caution. For example, Cluster Savings suggests deleting nodes and claims that it is safe, but it does not take into account pods on those nodes that rely on node selectors and taints not available on other nodes. In general, even the authors of the product, in a recent article of theirs (which, by the way, can be very useful for anyone interested in the topic), recommend not rushing headlong into cost optimization but approaching the issue thoughtfully.
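
Before removing a node, it is worth checking by hand whether its pods can actually run elsewhere. The sketch below is a deliberately simplified model of that check: the dictionaries are hypothetical stand-ins for Kubernetes objects, a taint counts as tolerated if the pod simply lists its key, and capacity, affinity rules and taint effects are ignored.

```python
from typing import Dict, List


def can_run_on(pod: Dict, node: Dict) -> bool:
    """Simplified scheduling check: nodeSelector labels and taints only."""
    # Every label in the pod's nodeSelector must be present on the node.
    selector_ok = all(
        node.get("labels", {}).get(key) == value
        for key, value in pod.get("node_selector", {}).items()
    )
    # Simplification: a taint is tolerated if the pod lists its key.
    taints_ok = all(
        taint in pod.get("tolerations", []) for taint in node.get("taints", [])
    )
    return selector_ok and taints_ok


def blocking_pods(candidate: Dict, other_nodes: List[Dict],
                  pods: List[Dict]) -> List[str]:
    """Names of pods on the candidate node that fit on no other node."""
    return [
        pod["name"]
        for pod in pods
        if pod["node"] == candidate["name"]
        and not any(can_run_on(pod, node) for node in other_nodes)
    ]


node_a = {"name": "node-a", "labels": {"disk": "ssd"}, "taints": []}
node_b = {"name": "node-b", "labels": {}, "taints": []}
pods = [
    {"name": "db", "node": "node-a", "node_selector": {"disk": "ssd"}},
    {"name": "web", "node": "node-a"},
]

# Pods that would be left without a home if node-a were removed.
print(blocking_pods(node_a, [node_b], pods))
```

A non-empty result means the node cannot be removed safely without first changing the selectors, the taints or the labels of the remaining nodes.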

Results

After using kubecost for a month on a couple of projects, we can conclude that it is an interesting tool (and one that is easy to learn and install) for analyzing and optimizing the cost of cloud provider services used for Kubernetes clusters. The calculations are very accurate: in our experiments they matched what the providers actually charged.

There are some drawbacks: there are non-critical bugs, and in places the functionality does not cover the specific needs of some projects. However, if you need to quickly understand where the money is going and what can be cut to consistently reduce your cloud bill by 5-30% (which is what happened in our case), this is a great option.

Source: habr.com
