werf 1.1 release: builder improvements today and plans for the future

werf is our open source GitOps CLI utility for building and delivering applications to Kubernetes. As promised, the v1.0 release marked the start of adding new features to werf and revising the usual approaches. Now we are pleased to present the v1.1 release, a big step in the development of the werf builder and a groundwork for the future. The version is currently available in the 1.1 ea channel.

The basis of the release is a new stage storage architecture and optimization of both builders (for Stapel and Dockerfile). The new storage architecture opens up possibilities for distributed builds from multiple hosts and parallel builds on the same host.

The optimization work includes removing unnecessary computations when calculating stage signatures and switching to more efficient mechanisms for calculating file checksums. These optimizations reduce the average project build time with werf. Idle builds, when all stages already exist in the stages-storage cache, are now really fast: in most cases, re-running the build takes less than 1 second! This also applies to the stage verification performed by the werf deploy and werf run commands.

This release also introduces a strategy for tagging images by their content, content-based tagging, which is now enabled by default and is the only recommended one.

Let's take a closer look at the key innovations in werf v1.1, and at the same time talk about plans for the future.

What has changed in werf v1.1?

New stage naming format and algorithm for selecting stages from the cache

There is a new rule for generating stage names. Each stage build now produces a unique stage name that consists of two parts: a signature (as in v1.0) plus a unique time-based identifier.

For example, the full name of a stage image might look like this:

werf-stages-storage/myproject:d2c5ad3d2c9fcd9e57b50edd9cb26c32d156165eb355318cebc3412b-1582656767835

... or in general terms:

werf-stages-storage/PROJECT:SIGNATURE-TIMESTAMP_MILLISEC

Here:

  • SIGNATURE is the signature of the stage, which represents the ID of the stage's content and depends on the Git revision history that led to that content;
  • TIMESTAMP_MILLISEC is a guaranteed unique image identifier that is generated when a new image is built.
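
For illustration only, here is a minimal Go sketch (not werf's actual code; the function name is made up) of how such a name can be composed from the project name, the stage signature, and the current time in milliseconds:

package main

import (
	"fmt"
	"time"
)

// stageImageName composes werf-stages-storage/PROJECT:SIGNATURE-TIMESTAMP_MILLISEC.
func stageImageName(project, signature string) string {
	timestampMillisec := time.Now().UnixNano() / int64(time.Millisecond)
	return fmt.Sprintf("werf-stages-storage/%s:%s-%d", project, signature, timestampMillisec)
}

func main() {
	fmt.Println(stageImageName("myproject", "d2c5ad3d2c9fcd9e57b50edd9cb26c32d156165eb355318cebc3412b"))
}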

The algorithm for selecting stages from the cache is based on checking the relationship of Git commits:

  1. Werf calculates the signature of some stage.
  2. In stages-storage there may be several stages for a given signature. Werf selects all stages matching the signature.
  3. If the current stage is linked to Git (git-archive, a user stage with Git patches: install, beforeSetup, setup; or git-latest-patch), then werf selects only those stages associated with a commit that is an ancestor of the current commit (the commit for which the build was called).
  4. Of the remaining suitable stages, the oldest one by creation date is selected.

Stages for different Git branches can have the same signature. However, werf prevents the cache associated with one branch from being used for another branch, even if the signatures match.
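
The selection logic can be sketched roughly as follows (a simplified illustration in Go; the types and the isAncestor helper are assumptions, not werf's real API):

package stages

import "time"

// StageImage is an illustrative representation of a stage stored in stages-storage.
type StageImage struct {
	Signature string
	Commit    string // commit the stage is associated with (for Git-related stages)
	CreatedAt time.Time
}

// selectCachedStage implements the selection steps described above.
func selectCachedStage(all []StageImage, signature, currentCommit string, gitRelated bool,
	isAncestor func(ancestor, descendant string) bool) *StageImage {
	var best *StageImage
	for i := range all {
		s := &all[i]
		// Keep only stages with a matching signature.
		if s.Signature != signature {
			continue
		}
		// For Git-related stages, keep only stages whose commit is an ancestor
		// of the commit the build was called on.
		if gitRelated && !isAncestor(s.Commit, currentCommit) {
			continue
		}
		// Of the remaining stages, pick the oldest one by creation date.
		if best == nil || s.CreatedAt.Before(best.CreatedAt) {
			best = s
		}
	}
	return best
}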

→ Documentation.

New algorithm for creating and saving stages in the stages repository

If, during the selection of stages from the cache, werf does not find a suitable stage, then the process of building a new stage is initiated.

Note that multiple processes (on one or several hosts) can start building the same stage at roughly the same time. Werf uses an optimistic locking algorithm with stages-storage at the moment of saving a freshly built image to stages-storage. So when the build of a new stage is finished, werf locks stages-storage and saves the freshly built image there only if a suitable image does not already exist (by signature and other parameters; see the new algorithm for selecting stages from the cache).

A freshly built image is guaranteed to have a unique identifier thanks to TIMESTAMP_MILLISEC (see the new stage naming format). If a suitable image is found in stages-storage, werf discards the freshly built image and uses the cached one instead.

In other words: the first process to finish building the image (the fastest one) will get the right to store it in stages-storage (and then this single image will be used for all builds). A slow build process will never block a faster process from saving the build results of the current stage and moving on to the next build.
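
In rough pseudo-Go, the save procedure looks like this (the types and interface are assumptions for illustration, not werf's real code):

package stages

// BuiltImage is an illustrative stand-in for a freshly built stage image.
type BuiltImage struct {
	Signature string
	Name      string // PROJECT:SIGNATURE-TIMESTAMP_MILLISEC
}

// StagesStorage is a hypothetical view of the operations used below.
type StagesStorage interface {
	Lock(signature string) (unlock func(), err error)
	FindSuitableStage(signature string) (*BuiltImage, error)
	Store(image *BuiltImage) error
}

// saveBuiltStage stores a freshly built image only if no suitable image
// has appeared in the meantime; otherwise the cached image is reused.
func saveBuiltStage(storage StagesStorage, built *BuiltImage) (*BuiltImage, error) {
	unlock, err := storage.Lock(built.Signature)
	if err != nil {
		return nil, err
	}
	defer unlock()

	existing, err := storage.FindSuitableStage(built.Signature)
	if err != nil {
		return nil, err
	}
	if existing != nil {
		// Another process finished first: discard our image and reuse the cached one.
		return existing, nil
	}

	// We finished first: publish the freshly built image under its unique name.
	if err := storage.Store(built); err != nil {
		return nil, err
	}
	return built, nil
}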

→ Documentation.

Improved performance of the Dockerfile builder

At the moment, the stage pipeline of an image built from a Dockerfile consists of a single stage, dockerfile. When calculating the signature, werf computes a checksum of the context files that will be used during the build. Prior to this improvement, werf recursively walked through all the files and obtained a checksum by summing the content and mode of each file. Since v1.1, werf can use the checksums that Git has already calculated and stored.

The algorithm is based on git ls-tree. It takes the entries in .dockerignore into account and traverses the file tree recursively only when necessary. Thus, we got rid of reading the file system, and the algorithm's dependence on the size of the context is now insignificant.

The algorithm also checks untracked files and, if necessary, takes them into account in the checksum.
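
The idea can be illustrated with a small Go sketch (a simplification: real werf also honours .dockerignore and untracked files rather than simply hashing the raw git ls-tree output):

package build

import (
	"crypto/sha256"
	"fmt"
	"os/exec"
)

// contextChecksum derives a checksum of the Dockerfile build context from the
// blob hashes and file modes that Git already stores, instead of re-reading
// file contents from the filesystem.
func contextChecksum(repoDir, commit, contextPath string) (string, error) {
	// Each line of the output looks like: "<mode> blob <hash>\t<path>".
	out, err := exec.Command("git", "-C", repoDir, "ls-tree", "-r", commit, "--", contextPath).Output()
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(out)
	return fmt.Sprintf("%x", sum), nil
}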

Improved performance when importing files

werf v1.1 uses an rsync server when importing files from artifacts and images. Previously, the import was done in two steps by mounting a directory from the host system.

The performance of imports on macOS is no longer limited by Docker volumes, and imports run in the same time as on Linux and Windows.

Content-based tagging

Werf v1.1 supports tagging images by their content, so-called content-based tagging: the tags of the resulting Docker images depend on the contents of those images.

When running the command werf publish --tags-by-stages-signature or werf ci-env --tagging-strategy=stages-signature, the published images are tagged with the so-called stage signature of the image. Each image is tagged with the signature of its stages, which is calculated according to the same rules as the regular signature of each individual stage, but serves as a generalized identifier of the whole image.

The signature of image stages depends on:

  1. the content of this image;
  2. Git revision history that led to this content.

A Git repository always contains commits that do not change the contents of the image files: for example, commits containing only comments, merge commits, or commits that change files in Git which will not be imported into the image.

Content-based tagging solves the problem of unnecessary restarts of application pods in Kubernetes caused by a change of the image name even though the image contents have not changed. Incidentally, this problem is one of the obstacles to storing many microservices of a single application in one Git repository.

Content-based tagging is also a more reliable tagging method than tagging by Git branch, because the content of the resulting images does not depend on the order in which the CI system runs pipelines for multiple commits of the same branch.

Important: from now on, stages-signature is the only recommended tagging strategy. It is used by default by the werf ci-env command (unless you explicitly specify a different tagging scheme).

→ Documentation. This feature will also be covered in a separate post. UPDATED (April 3): the detailed article has been published.

Logging levels

Users can now control the output, set the logging level, and work with debugging information. The --log-quiet, --log-verbose, and --log-debug options have been added.

By default, the output contains a minimum of information:

(screenshot: werf output with default settings)

When using verbose output (--log-verbose) you can see how werf works:

(screenshot: werf output with --log-verbose)

Debug output (--log-debug), in addition to werf's own debugging information, also contains the logs of the libraries used. For example, you can see how the interaction with the Docker Registry takes place and pinpoint the places where a significant amount of time is spent:

(screenshot: werf output with --log-debug)

Future plans

Attention! The features described below that are marked v1.1 will become available in this very version, many of them in the near future. Updates will arrive via auto-updates when using multiwerf. These features do not affect the stable part of v1.1 functionality; their appearance will not require manual user intervention in existing configurations.

Full support for various Docker Registry implementations (NEW)

  • Version: v1.1
  • Dates: March
  • Issue

The goal is to let users use any Docker Registry implementation with werf without restrictions.

At the moment, we have identified the following set of solutions for which we are going to guarantee full support:

  • Default (library/registry)*,
  • AWS ECR
  • Azure*,
  • Docker Hub,
  • GCR*,
  • GitHub Packages
  • GitLab Registry*,
  • Harbor*,
  • Quay.

An asterisk marks solutions that are currently fully supported by werf. For the rest, there is support, but with limitations.

Two main problems can be identified:

  • Some solutions do not support deleting tags using the Docker Registry API, preventing users from using werf's automatic cleanup. This is true for AWS ECR, Docker Hub, and GitHub Packages.
  • Some solutions do not support the so-called nested repositories (Docker Hub, GitHub Packages and Quay) or they do, but the user must create them manually using the UI or API (AWS ECR).

We are going to solve these and other problems using the native APIs of these solutions. This task also includes covering the full werf workflow for each of them with tests.

Distributed image building (↑)

  • Version: v1.1 (previously v1.2; the priority of implementing this feature has been increased)
  • Dates: March (previously March-April)
  • Issue

At the moment, werf v1.0 and v1.1 can only be used on one dedicated host for building and publishing images and deploying an application to Kubernetes.

To enable distributed operation of werf, where builds and deployments to Kubernetes run on several arbitrary hosts that do not preserve their state between builds (ephemeral runners), werf needs to implement the ability to use the Docker Registry as stage storage.

Earlier, when the werf project was still called dapp, it had such a capability. However, we ran into a number of issues that need to be taken into account when implementing this feature in werf.

Note: this feature does not cover running the builder inside Kubernetes pods, because that requires getting rid of the dependence on the local Docker server (there is no access to a local Docker server from a Kubernetes pod, since the process itself runs in a container, and werf does not and will not support working with a Docker server over the network). Support for working inside Kubernetes will be implemented separately.

Official support for GitHub Actions (NEW)

  • Version: v1.1
  • Dates: March
  • Issue

Includes werf documentation (the reference section, among others), as well as the official GitHub Action for working with werf.

In addition, it will allow werf to work on ephemeral runners.

The mechanics of user interaction with the CI system will be based on placing labels on pull requests to trigger certain actions that build or roll out the application.

Local development and deployment of applications with werf (↓)

  • Version: v1.1
  • Dates: April (previously January-February)
  • Issue

The main goal is to achieve a single unified config for deploying applications both locally and in production, without complex actions, out of the box.

Werf also needs to work in such a way that it is convenient to edit the application code and get instant feedback from the running application for debugging.

New cleaning algorithm (NEW)

  • Version: v1.1
  • Dates: April
  • Issue

The cleanup procedure in the current werf v1.1 has no provision for cleaning up images tagged with the content-based tagging scheme, so these images will accumulate.

Also, the current versions of werf (v1.0 and v1.1) use different cleanup policies for images published under different tagging schemes: Git branch, Git tag, or Git commit.

A new image cleanup algorithm has been devised that is unified for all tagging schemes and based on the commit history in Git:

  • Keep no more than N1 images associated with N2 latest commits for each of the git HEADs (branches and tags).
  • Keep no more than N1 stage images associated with N2 latest commits for each of the git HEADs (branches and tags).
  • Keep all images that are used in any Kubernetes cluster resources (all kube contexts from the configuration file and all namespaces are scanned; this behavior can be limited with special options).
  • Keep all images that are used in resource manifests stored in Helm releases.
  • An image can be removed if it is not associated with any HEAD from git (for example, because the corresponding HEAD itself has been removed) and is not used in any of the manifests in the Kubernetes cluster and Helm releases.
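
As a rough illustration of the keep/remove decision (a hypothetical sketch of the planned policy, not an implementation; all names are made up):

package cleanup

// ImageInfo is a hypothetical summary of what the planned policy checks.
type ImageInfo struct {
	ReachableFromHEADs bool // within the N1/N2 limits of some git branch or tag
	UsedInCluster      bool // referenced by resources in a scanned kube context/namespace
	UsedInHelmRelease  bool // referenced by manifests stored in Helm releases
}

// canBeRemoved reports whether an image may be deleted: it is kept while any
// git HEAD, Kubernetes resource, or Helm release still references it.
func canBeRemoved(img ImageInfo) bool {
	return !img.ReachableFromHEADs && !img.UsedInCluster && !img.UsedInHelmRelease
}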

Parallel Image Build (↓)

  • Version: v1.1
  • Dates: April* (previously January-February)

The current version of werf builds the images and artifacts described in werf.yaml sequentially. The process of building independent image and artifact stages needs to be parallelized, along with providing convenient and informative output.

*Note: the deadline has been pushed back due to the increased focus on implementing distributed builds, which will add more horizontal scaling capabilities, as well as on using werf with GitHub Actions. Parallel building is the next optimization step, providing vertical scalability when building a single project.

Switching to Helm 3 (↓)

  • Version: v1.2
  • Dates: May* (previously February-March)

Includes migration to the new Helm 3 codebase and a proven, convenient way to migrate existing installations.

*Note: migrating to Helm 3 will not add significant features to werf, because all the key features of Helm 3 (3-way merge and the absence of tiller) are already implemented in werf. Moreover, werf has additional features beyond those. Nevertheless, this transition remains in our plans and will be implemented.

Jsonnet to describe Kubernetes configuration (↓)

  • Version: v1.2
  • Dates: April-May (previously January-February)

Werf will support the configuration description for Kubernetes in Jsonnet format. At the same time, werf will remain compatible with Helm and there will be a choice of description format.

The reason is that many people consider Go templates to have a high entry barrier, and the readability of template code also suffers.

The possibility of introducing other Kubernetes configuration description systems (for example, Kustomize) is also being considered.

Working inside Kubernetes (↓)

  • Version: v1.2
  • Dates: May-June (previously April-May)

Purpose: to enable building images and delivering the application using runners in Kubernetes. That is, new images can be built, published, cleaned up, and deployed directly from Kubernetes pods.

To implement this feature, the ability to do distributed builds is needed first (see the point above).

It also requires support for the build mode without a Docker server (i.e. Kaniko-like build or build in userspace).

Werf will support building in Kubernetes not only from a Dockerfile, but also with its Stapel builder with incremental rebuilds and Ansible.

Move towards open development

We love our community (GitHub, Telegram) and want more and more people to help make werf better, understand the direction we are moving in, and participate in the development.

Recently, we decided to switch to GitHub project boards in order to open up our team's workflow a little. There you can see our immediate plans, as well as the current work in the respective areas.

A lot of work has been done with issues:

  • Irrelevant ones have been removed.
  • Existing ones have been brought to a single format with a sufficient level of detail.
  • New issues with ideas and suggestions have been added.

How to enable version v1.1

The version is currently available in the 1.1 ea channel (releases will appear in the stable and rock-solid channels as stabilization proceeds; however, ea itself is already stable enough to use, since it has passed through the alpha and beta channels). It is activated via multiwerf as follows:

source $(multiwerf use 1.1 ea)
werf COMMAND ...

Conclusion

The new stage storage architecture and the optimization of the Stapel and Dockerfile builders open up the possibility of implementing distributed and parallel builds in werf. These features will soon appear in the same v1.1 release and will become available automatically through the auto-update mechanism (for multiwerf users).

This release adds a strategy for tagging images based on their content, content-based tagging, which has become the default strategy. The logging of the main commands has also been reworked: werf build, werf publish, werf deploy, werf dismiss, werf cleanup.

The next essential step is adding distributed builds. They have received a higher priority than the parallel builds planned since v1.0, because they add more value to werf: horizontal scaling of builders, support for ephemeral builders on various CI/CD systems, and the possibility of official GitHub Actions support. Therefore, the implementation timeline for parallel builds has been shifted. However, we are working to implement both options as soon as possible.

Follow the news! And don't forget to visit us on GitHub to create an issue, find an existing one and upvote it, create a PR, or just watch the development of the project.

Source: habr.com
