DevOps tools are not just for DevOps. The process of building a test automation infrastructure from scratch

Part 1: Web / Android

Note: this article is a translation of the original article DevOps tools are not only for DevOps. Building test automation infrastructure from scratch. All illustrations, references, quotations and terms are kept in the original language to avoid distorting the meaning. I wish you a pleasant read!


Currently, DevOps is one of the most in-demand specialties in the IT industry. If you open popular job search sites and filter by salary, you will see that DevOps-related jobs are at the top of the list. However, it is important to understand that these are mostly 'Senior' positions, which imply a high level of skill and deep knowledge of technologies and tools. They also come with a high degree of responsibility for the smooth operation of production. Along the way, however, we have begun to forget what DevOps actually is. Initially, it was not a particular person or department. If we look for definitions of the term, we find many fine and correct nouns: methodology, practices, cultural philosophy, a group of concepts, and so on.

My specialization is QA automation, but I believe it should not be limited to writing auto-tests or designing test framework architecture. In 2020, knowledge of automation infrastructure is also essential: it allows you to organize the automation process yourself, from running tests to delivering results to all interested parties, in accordance with your goals. As a result, DevOps skills are a must for this job. All of this is good, but, unfortunately, there is a problem (spoiler: this article attempts to simplify it): DevOps is hard. And this is obvious, because companies will not pay much for something that is easy to do... The DevOps world is full of tools, terms and practices that need to be mastered. This is especially difficult at the beginning of a career, and depends on the technical experience you have accumulated.

Source: http://maximelanciauxbi.blogspot.com/2017/04/devops-tools.html

At this point, let's conclude the introduction and focus on the purpose of this article.

What is this article about

In this article, I am going to share my experience of building a test automation infrastructure. There are many sources of information on the Internet about various tools and how to use them, but I would like to look at them exclusively in the context of automation. I believe many automation engineers are familiar with the situation where no one except you runs the tests you have developed or cares about maintaining them. As a result, the tests become outdated and you have to spend time updating them. Again, at the beginning of a career this can be quite a challenge: deciding correctly which tools should help eliminate a given problem, and how to select, configure and maintain them. Some testers turn to DevOps (the people) for help and, let's be honest, this approach works. In many cases it may be the only option, since we do not have visibility into all the dependencies. But, as we know, DevOps engineers are very busy: they have to think about the infrastructure of the entire company, deployment, monitoring, microservices and other similar tasks, depending on the organization/team. As is usually the case, automation is not a priority. In that case, we should try to do our best ourselves from start to finish. This reduces dependencies, speeds up our workflow, improves our skills, and lets us see the bigger picture.

The article presents the most popular and in-demand tools and shows, step by step, how to use them to build an automation infrastructure. Each group is represented by tools I have tested in personal practice. But that does not mean you must use the same ones. The tools themselves are not what matters: they appear and become obsolete. Our engineering task is to understand the underlying principles: why we need each group of tools and which work tasks we can solve with their help. Therefore, at the end of each section I leave links to similar tools that may be used in your organization.

What is not in this article

I repeat once again that the article is not about specific tools, so there will be no inserts of code from the documentation and descriptions of specific commands. But at the end of each section, I leave links for detailed study.

This is done because: 

  • this material is very easy to find in various sources (documentation, books, video courses);
  • if we start to go deeper, we will have to write 10, 20, 30 parts of this article (whereas 2-3 are planned);
  • I just don't want to waste your time as you may want to use other tools to achieve the same goals.

Practice

I would really like this material to be useful for every reader, and not just be read and forgotten. In any study, practice is a very important component. For this I prepared GitHub repository with a step-by-step guide on how to do everything from scratch. There is also some homework waiting for you to make sure you didn't mindlessly copy the lines of commands you run.

Plan

| Step | Technology | Tools |
|------|------------|-------|
| 1 | Local running (prepare web / android demo tests and run them locally) | Node.js, Selenium, Appium |
| 2 | Version control systems | Git |
| 3 | Containerization | Docker, Selenium grid, Selenoid (Web, Android) |
| 4 | CI/CD | GitLab CI |
| 5 | Cloud platforms | Google Cloud Platform |
| 6 | Orchestration | Kubernetes |
| 7 | Infrastructure as code (IaC) | Terraform, Ansible |

Structure of each section

To keep the narrative clear, each section is described as follows:

  • a brief description of the technology,
  • value for automation infrastructure,
  • illustration of the current state of the infrastructure,
  • links to study
  • similar tools.

1. Run tests locally

Brief description of technology

This is just a preparatory step: run the demo tests locally and verify that they pass. The practical part uses Node.js, but the programming language and platform are not important, and you can use whatever is used in your company.

However, as automation tools I recommend Selenium WebDriver for the web platform and Appium for the Android platform, since in the next steps we will use Docker images tailored to work specifically with these tools. Moreover, judging by job requirements, these tools are the most in demand on the market.
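To give a feel for the Android side, a typical set of Appium desired capabilities for a local run might look like the sketch below. All values here are illustrative, not taken from the demo project:

```json
{
  "platformName": "Android",
  "deviceName": "emulator-5554",
  "automationName": "UiAutomator2",
  "app": "/path/to/demo.apk"
}
```

The web tests work the same way conceptually: they point at a local browser driver now, and at a remote Selenium URL in the later steps.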

As you may have noticed, we only cover web and Android tests. Unfortunately, iOS is a completely different story (thanks, Apple). I plan to demonstrate iOS-related solutions and practices in the following parts.

Value for automation infrastructure

From an infrastructure point of view, running locally has no value. You only check that the tests work on the local machine in local browsers and simulators. But in any case, it is a necessary starting point.

Illustration of the current state of the infrastructure


Links to study

Similar tools

  • any programming language you like for writing Selenium / Appium tests;
  • any tests;
  • any test runner.

2. Version control systems (Git)

Brief description of technology

It will not be a big discovery to anyone if I say that a version control system is an extremely important part of development, both in a team and individually. Based on various sources, it is safe to say that Git is its most popular representative. A version control system provides many benefits, such as code sharing, version storage, rollback to previous versions, project history, and backups. We will not discuss each item in detail, as I am sure you are very familiar with them and use them in your daily work. But if suddenly not, then I recommend pausing this article and filling that gap as soon as possible.

Value for automation infrastructure

And here you can ask a reasonable question: "Why is he telling us about Git? Everyone knows this and uses it both for development code and for auto-test code." You would be absolutely right, but in this article we are talking about infrastructure, and this section acts as a preview of section 7, "Infrastructure as Code (IaC)". For us this means that the entire infrastructure, including the test infrastructure, is described as code, so we can apply versioning to it as well and get the same benefits as for development and automation code.

We'll cover IaC in more detail in step 7, but even now you can start using Git locally by creating a local repository. The overall picture will be expanded when we add a remote repository to the infrastructure.
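Getting started locally is just a few commands; a minimal sketch (the repository and file names are illustrative):

```shell
# Create a local repository for the demo tests (names are illustrative)
mkdir demo-tests
git -C demo-tests init
echo "node_modules/" > demo-tests/.gitignore
git -C demo-tests add .gitignore
git -C demo-tests -c user.name="QA" -c user.email="qa@example.com" \
    commit -m "Initial commit: demo tests skeleton"
git -C demo-tests log --oneline
```

Later, when we reach the CI/CD step, the only missing piece will be a remote to push this repository to.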

Illustration of the current state of the infrastructure


Links to study

Similar tools

3. Containerization (Docker)

Brief description of technology

To demonstrate how containerization changed the game, let's go back a few decades. In those days, people purchased and used server machines to run applications. But in most cases, the resources required for launch were not known in advance. As a result, companies spent money on expensive, powerful servers, some of whose capacity was never fully utilized.

The next stage of evolution was virtual machines (VMs), which solved the problem of paying for unused resources. This technology made it possible to run applications independently of each other on the same server, in completely isolated spaces. But, unfortunately, any technology has its drawbacks: running a VM requires a full operating system that consumes CPU, RAM and storage, and, depending on the OS, license costs must be considered. These factors affect boot speed and make portability more difficult.

And now we come to containerization. Once again, this technology solves the previous problem, as containers do not use a full OS, which frees up a lot of resources and provides a fast and flexible portability solution.

Of course, containerization technology is nothing new: it was first introduced in the late 70s, and a lot of research, development and experimentation followed. But it was Docker that adapted this technology and made it easily accessible to the masses. Nowadays, when we talk about containers, in most cases we mean Docker, and when we talk about Docker containers, we mean Linux containers. We can run containers on Windows and macOS, but it is important to understand that in this case there is an additional layer: for example, Docker on Mac silently runs containers inside a lightweight Linux VM. We will return to this topic when we discuss running Android emulators inside containers, as there is a very important nuance here that needs to be analyzed in more detail.

Value for automation infrastructure

We found that containerization and Docker are cool. Let's look at this in the context of automation, because every tool or technology must solve a problem. Let's denote the obvious problems of test automation in the context of UI tests:

  • a huge number of dependencies when installing Selenium and especially Appium;
  • compatibility issues between versions of browsers, simulators and drivers;
  • lack of isolated space for browsers / simulators, which is especially critical for parallel launch;
  • hard to manage and maintain if you need to run 10, 50, 100 or even 1000 browsers at the same time.

But since Selenium is the most popular automation tool and Docker is the most popular containerization tool, it shouldn't come as a surprise to anyone that someone has tried to combine the two to get a powerful tool to solve the aforementioned problems. Let's consider these solutions in more detail. 

Selenium grid in Docker

This is the most popular tool in the Selenium world for running multiple browsers on multiple machines and managing them from a central host. To start, you need to run at least two components: a Hub and one or more Nodes. The Hub is the central node that receives all requests from tests and distributes them to the appropriate Nodes. For each Node we can set up a specific configuration, for example by specifying the desired browser and its version. However, we still have to take care of compatible browser drivers ourselves and install them on the correct Nodes. For this reason, Selenium grid is rarely used in its pure form, except when we need browsers that cannot be installed on Linux. For all other cases, using Docker images to run the Selenium grid Hub and Nodes is a much more flexible and correct solution. This approach greatly simplifies node management, since we can choose an image with compatible versions of browsers and drivers already installed.
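As an illustration, a minimal Hub plus one Chrome Node can be described in a docker-compose file like the sketch below (the image tag is an example; pick versions that match your tests):

```yaml
version: "3"
services:
  selenium-hub:
    image: selenium/hub:3.141.59
    ports:
      - "4444:4444"          # tests connect to http://localhost:4444/wd/hub
  chrome-node:
    image: selenium/node-chrome:3.141.59
    depends_on:
      - selenium-hub
    environment:
      - HUB_HOST=selenium-hub
      - HUB_PORT=4444
```

Scaling to more browsers is then a matter of adding node services or increasing their replica count.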

Despite negative reviews about its stability, especially when running a large number of Nodes in parallel, Selenium grid is still the most popular tool for running Selenium tests in parallel. It is important to note that various open-source improvements and modifications of this tool keep appearing, addressing various bottlenecks.

Selenoid for Web

This tool is a breakthrough in the Selenium world, as it works out of the box and has made the lives of many automation engineers much easier. First of all, it is not just another modification of Selenium grid. Instead, the developers created a completely new version of Selenium Hub in Golang, which, together with lightweight Docker images for various browsers, gave an impetus to the development of test automation. Moreover, in the case of Selenium grid we must determine all required browsers and their versions in advance, which is not a problem when working with only one browser. But when it comes to multiple supported browsers, Selenoid is the number one choice thanks to its 'browser on demand' feature. All that is required of us is to upload the necessary browser images in advance and keep up to date the configuration file that Selenoid interacts with. After Selenoid receives a request from the tests, it automatically launches a container with the desired browser. When the test completes, Selenoid removes the container, freeing up resources for the next requests. This approach completely eliminates the well-known 'node degradation' problem that we often see in Selenium grid.
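For reference, the configuration file mentioned above (browsers.json) might look like the following sketch; the image tags are illustrative:

```json
{
  "chrome": {
    "default": "84.0",
    "versions": {
      "84.0": {
        "image": "selenoid/chrome:84.0",
        "port": "4444",
        "path": "/"
      },
      "83.0": {
        "image": "selenoid/chrome:83.0",
        "port": "4444",
        "path": "/"
      }
    }
  }
}
```

Selenoid reads this file and starts a container from the matching image only when a test actually requests that browser and version.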

But alas, Selenoid is still not a silver bullet. We got the 'browser on demand' feature, but a 'resources on demand' feature is still not available. To use Selenoid, we must deploy it on physical hardware or a VM, which means we must know in advance how many resources to allocate. I suppose this is not a problem for small projects that run 10, 20 or even 30 browsers in parallel. But what if we need 100, 500, 1000 or more? It does not make any sense to maintain and pay for so many resources all the time. In sections 5 and 6 of this article, we will discuss solutions that allow you to scale, thereby significantly reducing the company's costs.

Selenoid for Android

After the success of Selenoid as a web automation tool, people wanted something similar for Android. And it happened - Selenoid was released with Android support. From a high-level user point of view, the principle of operation is similar to web automation. The only difference is that instead of containers with browsers, Selenoid runs containers with Android emulators. In my opinion, it is currently the most powerful free tool for running Android tests in parallel.

I would really prefer not to talk about the negative aspects of this tool, because I like it very much. But there are still the same scaling-related drawbacks that apply to web automation. In addition, there is another limitation worth mentioning, which may come as a surprise when setting the tool up for the first time: to run Android images, we need a physical machine or a VM with nested virtualization support. In the practical guide, I demonstrate how to activate this on a Linux VM. However, if you are a macOS user and want to deploy Selenoid locally, you will not be able to run Android tests. But you can always run a Linux VM locally with nested virtualization configured and deploy Selenoid inside it.
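Based on the Selenoid documentation, an Android entry in browsers.json might look like the sketch below (the image tag is illustrative). Note the privileged mode, which is tied to the nested-virtualization requirement discussed above:

```json
{
  "android": {
    "default": "10.0",
    "versions": {
      "10.0": {
        "image": "selenoid/android:10.0",
        "port": "4444",
        "path": "/wd/hub",
        "privileged": true
      }
    }
  }
}
```

From the tests' point of view, nothing changes: they still send requests to the same Selenoid endpoint, only with Android capabilities instead of browser ones.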

Illustration of the current state of the infrastructure

In the context of this article, we will add 2 tools to illustrate the infrastructure. They are Selenium grid for web tests and Selenoid for Android tests. In the GitHub tutorial, I'll also show you how to use Selenoid to run web tests. 


Links to study

Similar tools

  • There are other containerization tools, but Docker is the most popular. If you want to try something else, then please note that the tools that we looked at for running Selenium tests in parallel will not work out of the box.  
  • As already mentioned, there are many modifications of Selenium grid, for example, Zalenium.

4. CI/CD

Brief description of technology

The practice of continuous integration is quite popular in development, on a par with version control systems. Nevertheless, I feel there is confusion in the terminology. In this section, I would like to describe three modifications of this practice from my point of view. On the Internet you can find many articles with different interpretations, and it is absolutely normal if your opinion differs; the most important thing is that you are on the same wavelength as your colleagues.

So, there are 3 terms: CI (Continuous Integration), CD (Continuous Delivery) and, again, CD (Continuous Deployment). (In what follows, I will use these terms in English.) Each modification adds a few extra steps to the development pipeline, but the word continuous is the most important: in this context, it means something that happens from start to finish, without interruption or manual intervention. Let's look at the three practices in this light.

  • Continuous Integration is the initial step of the evolution. After pushing new code to the server, we expect quick feedback that our changes are fine. Typically, CI includes static code analysis and unit/internal API tests, which gives us information about our code within seconds or minutes.
  • Continuous Delivery is a more advanced step, where we also run integration/UI tests. However, at this stage results do not come as quickly as with CI. First, these types of tests take longer to run. Second, before launching them we must deploy our changes to a test/staging environment. And if we are talking about mobile development, there is an additional step of building the application.
  • Continuous Deployment assumes that we automatically release our changes to production if all acceptance tests passed in the previous stages. After the release stage, you can add further stages, such as running smoke tests in production and collecting metrics of interest. Continuous Deployment is only possible with good automated test coverage. If any manual intervention is required, including manual testing, the process is no longer continuous; then we can only say that our pipeline follows the practice of Continuous Delivery.

Value for automation infrastructure

In this section, I should clarify that when we talk about end-to-end UI tests, it is implied that we deploy our changes and related services to test environments. Continuous Integration is not applicable for this task, so we must implement at least Continuous Delivery practices. Continuous Deployment also makes sense in the context of UI tests if we are going to run them in production.

And before we look at the illustration of the architecture change, I want to say a few words about GitLab CI. Unlike other CI/CD tools, GitLab provides a remote repository and many other advanced features. Thus, GitLab is more than just CI: it includes, out of the box, source code management, Agile management, CI/CD pipelines, logging tools and metrics collection. The GitLab architecture consists of GitLab CI/CD and GitLab Runner. Here is a short description from the official site:

GitLab CI/CD is a web application with an API that stores its state in a database, manages projects/builds and provides a user interface. GitLab Runner is an application which processes builds. It can be deployed separately and works with GitLab CI/CD through an API. To run tests, you need both a GitLab instance and a Runner.
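To make this concrete, a pipeline for our demo tests could be described in a .gitlab-ci.yml along these lines (the job name and npm scripts are assumptions, not part of an official template):

```yaml
stages:
  - test

web-tests:
  stage: test
  image: node:12            # matches the Node.js practical part
  script:
    - npm ci
    - npm run test:web      # assumed script name in package.json
  artifacts:
    when: always            # keep reports even when tests fail
    paths:
      - reports/
```

A Runner picks this job up on every push, which is exactly the feedback loop the CI/CD practices above describe.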

Illustration of the current state of the infrastructure


Links to study

Similar tools

5. Cloud platforms

Brief description of technology

In this section, we will talk about a popular trend called 'public clouds'. Despite the huge benefits that the virtualization and containerization technologies described above provide, we still need computing resources. Companies buy expensive servers or rent data centers, but in that case it is necessary to estimate (sometimes unrealistically) how many resources we will need, whether we will use them 24/7, and for what purposes. For example, production requires a server running around the clock, but do we need comparable resources for testing outside working hours? It also depends on the type of testing being performed. An example would be load/stress tests that we plan to run during off-hours to get results the next day. But 24/7 server availability is definitely not required for end-to-end auto-tests, and especially not for manual testing environments. For such situations, it would be good to obtain as many resources as needed on demand, use them, and stop paying when they are no longer needed. Moreover, it would be great to get them instantly with a few mouse clicks or a couple of scripts. This is what public clouds are for. Let's look at the definition:

β€œThe public cloud is defined as computing services offered by third-party providers over the public Internet, making them available to anyone who wants to use or purchase them. They may be free or sold on-demand, allowing customers to pay only per usage for the CPU cycles, storage, or bandwidth they consume."

There is an opinion that public clouds are expensive. But their key idea is to reduce the company's costs. As mentioned earlier, public clouds allow you to get resources on demand and pay only for the time of their use. Also, sometimes we forget that employees are paid, and specialists are also an expensive resource. Keep in mind that public clouds make it much easier to maintain infrastructure, which allows engineers to focus on more important tasks. 

Value for automation infrastructure

What specific resources do we need for end-to-end UI tests? These are mostly virtual machines or clusters (we'll talk about Kubernetes in the next section) to run browsers and emulators. The more browsers and emulators we want to run at the same time, the more CPU and memory is required and the more money we have to pay for it. Thus, public clouds in the context of test automation allow us to run a large number (100, 200, 1000 ...) of browsers / emulators on demand, get test results as quickly as possible and stop paying for such insanely resource-intensive capacities. 

The most popular cloud providers are Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP). The how-to guide provides examples of using GCP, but in general it does not matter which one you use for your automation tasks: they all provide roughly the same functionality. Typically, the choice of provider is driven by the company's overall infrastructure and business requirements, which is beyond the scope of this article. For automation engineers, it will be more interesting to compare cloud providers with cloud platforms built specifically for testing, such as Sauce Labs, BrowserStack, BitBar and so on. So let's do that! In my opinion, Sauce Labs is the most famous cloud testing farm, which is why I took it for comparison.

GCP vs Sauce Labs for automation purposes:

Imagine that we need to run 8 web tests and 8 Android tests at the same time. To do this, we will use GCP and run 2 virtual machines with Selenoid. On the first one, we will raise 8 containers with browsers. On the second - 8 containers with emulators. Let's take a look at the prices:  

[screenshot: GCP Compute Engine pricing]
To run a single container with Chrome, we need an n1-standard-1 machine; in the case of Android, it is n1-standard-4 for one emulator. In fact, a more flexible and cheaper way is to set custom values for CPU/memory, but at the moment this is not important for the comparison with Sauce Labs.

And here are the tariffs for using Sauce Labs:

[screenshot: Sauce Labs pricing]
I believe you have already noticed the difference, but still I will give a table with the calculations for our task:

| Required resources | Monthly | Working hours (8 am - 8 pm) | Working hours + Preemptible |
|--------------------|---------|------------------------------|------------------------------|
| GCP for Web: n1-standard-1 x 8 = n1-standard-8 | $194.18 | 23 days * 12h * $0.38 = $104.88 | 23 days * 12h * $0.08 = $22.08 |
| Sauce Labs for Web: Virtual Cloud, 8 parallel tests | $1,559 | — | — |
| GCP for Android: n1-standard-4 x 8 = n1-standard-16 | $776.72 | 23 days * 12h * $1.52 = $419.52 | 23 days * 12h * $0.32 = $88.32 |
| Sauce Labs for Android: Real Device Cloud, 8 parallel tests | $1,999 | — | — |

As you can see, the difference in cost is huge, especially if you run tests only during a twelve-hour working day. But you can cut costs even further by using preemptible machines. What are they?

A preemptible VM is an instance that you can create and run at a much lower price than normal instances. However, Compute Engine might terminate (preempt) these instances if it requires access to those resources for other tasks. Preemptible instances are excess Compute Engine capacity, so their availability varies with usage.

If your apps are fault-tolerant and can withstand possible instance preemptions, then preemptible instances can reduce your Compute Engine costs significantly. For example, batch processing jobs can run on preemptible instances. If some of those instances terminate during processing, the job slows but does not completely stop. Preemptible instances complete your batch processing tasks without placing additional workload on your existing instances and without requiring you to pay full price for additional normal instances.

And it's still not over! In reality, I'm sure no one runs tests for 12 hours without a break. If so, you can automatically start and stop virtual machines when they are not needed. Actual usage time may drop to 6 hours per day, and then the payment, in the context of our task, falls to around $11 per month for 8 browsers. Isn't that wonderful? But with preemptible machines we must be careful and prepared for interruptions and instability, although these situations can be anticipated and handled in software. It's worth it!
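The figures above are easy to re-derive. A quick sketch (the rates are the illustrative per-hour prices from the table, not current GCP pricing):

```javascript
// Monthly cost = per-hour rate * working days * hours per day,
// using the illustrative rates from the table above
function monthlyCost(ratePerHour, days, hoursPerDay) {
  return +(ratePerHour * days * hoursPerDay).toFixed(2);
}

console.log(monthlyCost(0.38, 23, 12)); // n1-standard-8, working hours
console.log(monthlyCost(0.08, 23, 12)); // the same, but preemptible
console.log(monthlyCost(0.08, 23, 6));  // preemptible + automatic start/stop
```

The last line is exactly where the "around $11 per month" estimate comes from.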

But by no means am I saying 'never use cloud test farms'. They have a number of advantages. First of all, a test farm is not just a virtual machine but a complete test automation solution with a set of functionality out of the box: remote access, logs, screenshots, video recording, various browsers and physical mobile devices. In many situations it can be an indispensable alternative. Test platforms are especially useful for iOS automation, where public clouds can only offer Linux/Windows systems. But the conversation about iOS will come in the following articles. I recommend always looking at the situation and starting from the task at hand: in some cases it is cheaper and more efficient to use public clouds, while in others the test platforms are definitely worth the money.

Illustration of the current state of the infrastructure


Links to study

Similar tools:

6. Orchestration

Brief description of technology

I have good news: we have almost reached the end of the article! At the moment, our automation infrastructure consists of web and Android tests, which we run in parallel through GitLab CI using Docker-based tools: Selenium grid and Selenoid. Moreover, we use virtual machines created in GCP to raise containers with browsers and emulators. To reduce costs, we start these virtual machines only on demand and stop them when there is no testing. Is there anything else that can improve our infrastructure? The answer is yes! Meet Kubernetes (K8s)!

First, let's look at how the words orchestration, cluster and Kubernetes relate to each other. At a high level, orchestration is a system that deploys and manages applications. For test automation, the containerized applications in question are Selenium grid and Selenoid. Docker and K8s complement each other: the first is used for application deployment, the second for orchestration. In turn, K8s is a cluster. The cluster's job is to use VMs as Nodes, which lets you install various functionality, programs and services within a single server (cluster). If any Node fails, other Nodes pick up the load, which keeps our application running smoothly. In addition, K8s has important scaling functionality, thanks to which we automatically get the optimal amount of resources based on the load and the configured limits.

In truth, manually deploying Kubernetes from scratch is not a trivial task at all. I'll leave a link to the famous "Kubernetes The Hard Way" how-to guide, and if you're interested, you can practice. But, fortunately, there are alternative ways and tools. The easiest of these is to use the Google Kubernetes Engine (GKE) in GCP, which will allow you to get a ready-made cluster in a few clicks. I recommend using this approach to start learning, as it will allow you to focus on learning how to use K8s for your tasks instead of learning how internal components should be integrated with each other. 
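For reference, creating such a cluster with the gcloud CLI boils down to a couple of commands (a sketch that assumes a configured GCP project; the cluster name, zone and sizes are illustrative):

```shell
# Create a small GKE cluster and fetch kubectl credentials for it
gcloud container clusters create test-automation \
    --zone europe-west1-b \
    --num-nodes 2 \
    --machine-type n1-standard-4
gcloud container clusters get-credentials test-automation \
    --zone europe-west1-b
```

After the second command, kubectl talks to the new cluster, and you can start deploying manifests to it.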

Value for automation infrastructure

Let's take a look at a few notable features that K8s provides:

  • application deployment: using multi-nodes cluster instead of VMs;
  • dynamic scaling: reduces the cost of resources that are used only on demand;
  • self-healing: automatic recovery of pods (as a result of which containers are also restored);
  • rolling updates and rollbacks without downtime: updating tools, browsers and emulators does not interrupt current users.

But K8s is still not a silver bullet. To understand all the advantages and limitations in the context of the tools we are considering (Selenium grid, Selenoid), let's briefly discuss the structure of K8s. A cluster contains two types of Nodes: Master Nodes and Worker Nodes. Master Nodes are responsible for management, deployment and scheduling decisions. Worker Nodes are where applications run. Nodes also contain a container runtime; in our case this is Docker, which is responsible for container-related operations, although there are alternatives such as containerd. It is important to understand that scaling and self-healing are not applied directly to containers: they are implemented by adding or removing pods, which in turn contain the containers (usually one container per pod, though depending on the task there may be more). The high-level hierarchy is: worker nodes, inside of which there are pods, inside of which containers are raised.

The scaling feature is key and can be applied both to nodes within a cluster node pool and to pods within a node. There are two types of scaling that apply to both nodes and pods. The first type is horizontal: scaling occurs by increasing the number of nodes/pods. This type is preferred. The second type is vertical: scaling is done by increasing the size of nodes/pods rather than their number.
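As an illustration of the preferred horizontal type at the node level, a GKE node pool can be given autoscaling bounds; the cluster then adds or removes worker nodes depending on pending pods. A sketch with hypothetical names and sizes:

```hcl
# Horizontal node scaling: GKE keeps the pool between the given bounds,
# growing it when pods cannot be scheduled and shrinking it when idle.
resource "google_container_node_pool" "browser_nodes" {
  name    = "browser-pool"             # hypothetical
  cluster = "test-automation-cluster"  # hypothetical

  autoscaling {
    min_node_count = 1
    max_node_count = 10
  }

  node_config {
    machine_type = "n1-standard-4"
  }
}
```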

Now consider our tools in the context of the above terms.

Selenium grid:

As mentioned earlier, Selenium grid is a very popular tool, so it's no surprise that it has been containerized, nor that it can be deployed in K8s. An example of how to do this can be found in the official K8s repository. As usual, I attach links at the end of the section. In addition, this practical guide shows how to do the same with Terraform. There is also an instruction on how to scale the number of pods that contain containers with browsers. But autoscaling in the context of K8s is still not an entirely obvious task. When I started studying, I did not find any practical guide or recommendations. After some research and experimentation with the support of the DevOps team, we chose the approach of raising containers with the necessary browsers inside one pod, which is located inside one worker node. This method allows us to apply the horizontal scaling strategy for nodes by increasing their number. I hope that in the future the situation will change and we will see more and more descriptions of the best approaches and ready-made solutions, especially after the release of Selenium grid 4 with its changed internal architecture.
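To make the idea of pods with browser containers more tangible, here is a rough sketch (via the Terraform Kubernetes provider) of a Deployment of Chrome browser nodes that register with a hub service. The service name, image tag and replica count are my assumptions, not values taken from the official example:

```hcl
# A Deployment of Selenium Chrome node pods; scaling the grid then amounts
# to changing "replicas" (or attaching an autoscaler to this Deployment).
resource "kubernetes_deployment" "chrome_nodes" {
  metadata {
    name = "selenium-node-chrome" # hypothetical
  }

  spec {
    replicas = 4

    selector {
      match_labels = { app = "selenium-node-chrome" }
    }

    template {
      metadata {
        labels = { app = "selenium-node-chrome" }
      }

      spec {
        container {
          name  = "chrome"
          image = "selenium/node-chrome:3.141.59" # assumed grid 3 image tag

          env {
            name  = "HUB_HOST"
            value = "selenium-hub" # assumed name of the hub's Service
          }
        }
      }
    }
  }
}
```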

Selenoid:

Currently, the deployment of Selenoid in K8s is the biggest disappointment: they are not compatible. Theoretically, we can run a Selenoid container inside a pod, but when Selenoid starts launching containers with browsers, they will still be inside the same pod. This makes scaling impossible and, as a result, Selenoid operation inside a cluster will not differ from operation inside a virtual machine. End of story.

Moon:

Knowing this bottleneck when working with Selenoid, the developers released a more powerful tool called Moon. This tool was originally conceived to work with Kubernetes and, as a result, the autoscaling feature can and should be used. Moreover, I would say that at the moment it is the only tool in the Selenium world with native K8s cluster support out of the box (not anymore, see the next tool). The key feature of Moon that provides this support is:

Completely stateless. Selenoid stores in memory information about currently running browser sessions. If for some reason its process crashes - then all running sessions are lost. Moon contrarily has no internal state and can be replicated across data centers. Browser sessions remain alive even if one or more replicas go down.

So, Moon is a great solution, but with one problem: it's not free. The price depends on the number of sessions. Only 0-4 sessions can be run for free, which is not very useful; starting from the fifth session, you will have to pay $5 for each. The situation may differ from company to company, but in our case, using Moon is pointless. As I described above, we can start VMs with Selenium Grid on demand or increase the number of Nodes in the cluster. In one pipeline we launch approximately 500 browsers and stop all resources after the tests are completed. If we used Moon, we would have to pay an additional 500 x 5 = $2500 per month, no matter how often we run tests. And again, I'm not saying "don't use Moon". For your tasks, this can be an indispensable solution, for example, if you have many projects/teams in your organization and you need a huge common cluster for everyone. As always, I leave a link at the end and recommend that you do all the necessary calculations in the context of your task.

Callisto: (Attention! This is not in the original article and is contained only in the Russian translation)

As I said, Selenium is a very popular tool, and the IT industry is growing very fast. While I was working on the translation, a promising new tool called Callisto appeared on the web (hello Cypress and other Selenium killers). It works natively with K8s and allows you to run Selenoid containers in pods distributed across Nodes. Everything works out of the box, including autoscaling. Fantastic, but it needs to be tested. I have already managed to deploy this tool and run some experiments. But it's too early to draw conclusions; once I have results over a longer period, I will perhaps make a review in future articles. For now, I leave only links for independent research.

Illustration of the current state of the infrastructure


Links to study

Similar tools

7. Infrastructure as code (IaC)

Brief description of technology

And now we come to the last section. Typically, this technology and the related tasks are not the responsibility of automation engineers. And there are reasons for this. First, in many organizations infrastructure issues are under the control of the DevOps department, and development teams do not really care about what makes the pipeline work and how everything related to it needs to be supported. Second, let's be honest: Infrastructure as Code (IaC) practices are still not adopted in many companies. But it has definitely become a popular trend, and it is important to try to get involved in the related processes, approaches and tools. Or at least stay up to date.

Let's start with the motivation for using this approach. We have already discussed that in order to run tests in GitlabCI, we will need at least the resources to run a Gitlab Runner. And to run containers with browsers/emulators, we need to reserve a VM or a cluster. In addition to testing resources, we need a significant amount of capacity to support development, staging and production environments, which also includes databases, automatic schedules, network configurations, load balancers, user rights, and so on. The key issue is the effort required to support it all. There are several ways we can make changes and roll out updates. For example, in the context of GCP, we can use the UI console in the browser and perform all actions by clicking buttons. An alternative is to use API calls to interact with cloud entities, or the gcloud command-line utility to perform the necessary manipulations. But with a really large number of different entities and infrastructure elements, it becomes difficult or even impossible to perform all operations manually. Moreover, all these manual actions are uncontrollable: we cannot submit them for review before execution, use a version control system, or quickly roll back the edits that led to an incident. To solve such problems, engineers have written, and continue to write, automatic bash/shell scripts, which is not much better than the previous methods, since procedural scripts are not easy to quickly read, understand, maintain and modify.

In this article and practical guide, I use 2 tools related to the practice of IaC. These are Terraform and Ansible. Some believe that it does not make sense to use them at the same time, since their functionality is similar and they are interchangeable. But the fact is that initially they have completely different tasks. And the fact that these tools should complement each other was confirmed at a joint presentation by developers representing HashiCorp and RedHat. The conceptual difference is that Terraform is a provisioning tool for managing the servers themselves. Whereas Ansible is a configuration management tool whose task is to install, configure and manage software on these servers.

Another key distinguishing feature of these tools is the coding style. Unlike bash and Ansible, Terraform uses a declarative style based on a description of the desired end state to be reached as a result of execution. For example, if we are going to create 10 VMs and apply the changes through Terraform, we will get 10 VMs. If we apply the configuration again, nothing will happen, since we already have 10 VMs, and Terraform knows about this because it stores the current state of the infrastructure in a state file. Ansible, on the other hand, uses a procedural approach: if you ask it to create 10 VMs, on the first run we will get 10 VMs, just as with Terraform. But after re-running, we will have 20 VMs. Therein lies the important difference. In the procedural style, we do not store the current state and simply describe the sequence of steps to be performed. Of course, we can handle different situations and add checks for the existence of resources and their current state, but there is no point in wasting our time and effort on controlling this logic, and it also increases the risk of making mistakes.
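A short sketch of what this declarative style looks like in practice. With hypothetical resource names and sizes, the "10 VMs" example above is just a count on a resource; re-applying it changes nothing as long as the state file already records 10 instances:

```hcl
# Declarative: "there must be exactly 10 such VMs".
# terraform apply creates them once; a second apply is a no-op.
resource "google_compute_instance" "test_vm" {
  count        = 10
  name         = "test-vm-${count.index}" # hypothetical naming scheme
  machine_type = "e2-medium"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  network_interface {
    network = "default"
  }
}
```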

Summarizing all of the above, we can conclude that Terraform and declarative notation are a more suitable tool for provisioning servers. But it is better to delegate the work of managing configurations to Ansible. With that out of the way, let's look at use cases in the context of automation.

Value for automation infrastructure

Here it is only important to understand that the test automation infrastructure should be considered part of the entire infrastructure of the company. This means that all IaC practices must be applied globally to the resources of the entire organization. Who is responsible for this depends on your processes. The DevOps team is more experienced in these issues and sees the whole picture of what is happening. However, QA engineers are more involved in the process of building automation and the structure of the pipeline, which allows them to better see all the required changes and opportunities for improvement. The best option is to work together, sharing knowledge and ideas to achieve the expected result.

Here are a few examples of using Terraform and Ansible in the context of test automation and the tools we discussed before:

1. Describe the necessary characteristics and parameters of VMs and clusters through Terraform.

2. Using Ansible, install the tools necessary for testing: Docker, Selenoid, Selenium Grid, and download the required versions of browsers/emulators.

3. Describe via Terraform the characteristics of the VM in which the GitLab Runner will be launched.

4. Using Ansible, install GitLab Runner and the necessary related tools, and apply settings and configurations.
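As a sketch of how steps 3 and 4 fit together, the runner VM itself can be described in Terraform (all names and sizes below are assumptions), after which Ansible connects to it over SSH to install GitLab Runner and the related tools:

```hcl
# Step 3 as code: the VM that will host the GitLab Runner.
resource "google_compute_instance" "gitlab_runner" {
  name         = "gitlab-runner" # hypothetical
  machine_type = "n1-standard-4"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2004-lts"
    }
  }

  network_interface {
    network = "default"
    access_config {} # external IP so Ansible can reach the VM over SSH
  }
}
```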

Illustration of the current state of the infrastructure


Links to study:

Similar tools

Let's summarize!

Step
Technology
Tools
Value for automation infrastructure

1
Local running
Node.js, Selenium, Appium

  • The most popular tools for web and mobile
  • Support for many languages and platforms (including Node.js)

2
Version control systems 
Git

  • The same benefits as for the development code

3
Containerization
Docker, Selenium grid, Selenoid (Web, Android)

  • Running Tests in Parallel
  • Isolated environments
  • Easy, flexible version upgrades
  • Dynamic stop of unused resources
  • Easy to set up

4
CI/CD
Gitlab CI

  • Tests part of the pipeline
  • Fast feedback
  • Visibility to the entire company/team

5
Cloud platforms
Google Cloud Platform

  • Resources on demand (pay only when needed)
  • Easy to manage and update
  • Visibility and control of all resources

6
Orchestration
Kubernetes
In the context of containers with browsers/emulators inside pods:

  • Scaling / autoscaling
  • Self-healing
  • Updates and rollbacks without interruption

7
Infrastructure as code (IaC)
Terraform, Ansible

  • The same benefits as for the development infrastructure
  • All the benefits of code versioning
  • Easy to change and maintain
  • Fully automated

Mind map diagrams: the evolution of infrastructure

step1: Local
step2: VCS
step3: Containerization
step4: CI/CD
step5: Cloud Platforms
step6: Orchestration
step7: IaC

What's next?

So, this is the end of the article. But in conclusion, I would like to agree on a few things with you.

From your side
As I said at the beginning, I would like this article to be of practical use and help you apply the acquired knowledge in real work. Once again, I am attaching a link to the practical guide.

But even after that, do not stop, practice, study the relevant links and books, find out how it works in your company, find places that can be improved and take part in this. Good luck!

From my side

As you can see from the title, this was only the first part. Although it turned out to be quite large, important topics are still not covered here. In the second part, I plan to look at automation infrastructure in the context of iOS. Because Apple only allows running iOS simulators on macOS systems, our set of solutions is narrowed. For example, we are unable to use Docker to run a simulator, or public clouds to run virtual machines. But that doesn't mean there are no other alternatives. I will try to keep you up to date with cutting-edge solutions and modern tools!

Also, I didn't mention the rather large topics related to monitoring. In Part 3, I'm going to look at the most popular infrastructure monitoring tools and what data and metrics to consider.

And finally: in the future, I plan to release a video course on building a test infrastructure and popular tools. There are quite a lot of courses and lectures on DevOps on the Internet at present, but all the materials are presented in the context of development rather than test automation. In this matter, I really need feedback on whether such a course would be interesting and valuable for the community of testers and automation engineers. Thank you in advance!

Source: habr.com
