Back to microservices with Istio. Part 2

Transl. note: The first part of this series was devoted to getting acquainted with Istio's capabilities and demonstrating them in action. Now we will cover more advanced aspects of configuring and using this service mesh, in particular fine-grained routing and network traffic management.

As a reminder, the article uses configurations (manifests for Kubernetes and Istio) from the istio-mastery repository.

Traffic Management

With Istio, new features are added to the cluster to provide:

  • Dynamic request routing: canary rollouts, A/B testing;
  • Load balancing: simple and consistent, hash-based;
  • Failure recovery: timeouts, retries, circuit breakers;
  • Fault injection: delays, aborted requests, etc. (see the sketch right after this list).
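
Fault injection is not demonstrated later in the article, so here is a minimal sketch, purely for illustration (it is not part of the istio-mastery manifests), of a VirtualService that would inject a 5-second delay into 10% of requests to sa-logic:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sa-logic-delay-example   # hypothetical name, for illustration only
spec:
  hosts:
    - sa-logic
  http:
  - fault:
      delay:
        percentage:
          value: 10      # inject the delay into 10% of requests
        fixedDelay: 5s   # hold each affected request for 5 seconds
    route:
    - destination:
        host: sa-logic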

As the article continues, these capabilities will be demonstrated on the selected application, and new concepts will be introduced along the way. The first such concept is DestinationRules (i.e., rules about the recipient of traffic/requests - approx. transl.), with which we will activate A/B testing.

A/B Testing: DestinationRules in Practice

A/B testing is used when there are two versions of an application (usually they are visually different) and we are not 100% sure which one will improve the user experience. Therefore, we simultaneously launch both versions and collect metrics.

To deploy the second version of the frontend, which is required for the A/B testing demo, run the following command:

$ kubectl apply -f resource-manifests/kube/ab-testing/sa-frontend-green-deployment.yaml
deployment.extensions/sa-frontend-green created

The deployment manifest for the "green version" differs in two places:

  1. The image is based on a different tag: istio-green;
  2. The pods have the label version: green (a sketch of the relevant fragment follows this list).
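
For illustration, the relevant fragment of the green deployment might look roughly like this (the image repository/name shown here is hypothetical; check the actual manifest in the istio-mastery repository):

kind: Deployment
metadata:
  name: sa-frontend-green
spec:
  template:
    metadata:
      labels:
        app: sa-frontend        # same label as the first version
        version: green          # difference 2: the version label
    spec:
      containers:
      - name: sa-frontend
        # difference 1: the image tag istio-green (registry/name below is hypothetical)
        image: <your-registry>/sentiment-analysis-frontend:istio-green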

Because both deployments have the label app: sa-frontend, requests routed by the virtual service sa-external-services to the service sa-frontend will be redirected to all of its instances, with the load distributed round-robin, which leads to the following situation:

Requested files not found

These files were not found due to the fact that they are named differently in different versions of the application. Let's check it out:

$ curl --silent http://$EXTERNAL_IP/ | tr '"' '\n' | grep main
/static/css/main.c7071b22.css
/static/js/main.059f8e9c.js
$ curl --silent http://$EXTERNAL_IP/ | tr '"' '\n' | grep main
/static/css/main.f87cd8c9.css
/static/js/main.f7659dbb.js

This means that index.html requesting one version of the static files can be sent by the load balancer to pods running the other version, where, for obvious reasons, such files do not exist. Therefore, for the application to work, we need to impose a restriction: "the same version of the application that returned index.html must serve the subsequent requests".

We will achieve this with consistent hash-based load balancing (Consistent Hash Load Balancing). In this case, requests from the same client are sent to the same backend instance, based on a predefined property - for example, an HTTP header. This is implemented with DestinationRules.

DestinationRules

After a VirtualService has sent a request to the desired service, with DestinationRules we can define the policies that will be applied to traffic destined for instances of that service:

Traffic management with Istio resources

Note: The impact of Istio resources on network traffic is presented here in a simplified form, for ease of understanding. To be precise, the decision about which instance to send the request to is made by the Envoy in the Ingress Gateway, which is configured via CRDs.

With Destination Rules, we can set up load balancing to use consistent hashes and ensure that the same service instance responds to the same user. The following configuration achieves this (destinationrule-sa-frontend.yaml):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sa-frontend
spec:
  host: sa-frontend
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: version   # 1

1 - the hash will be generated based on the contents of the HTTP header version.

Apply the configuration with the following command:

$ kubectl apply -f resource-manifests/istio/ab-testing/destinationrule-sa-frontend.yaml
destinationrule.networking.istio.io/sa-frontend created

Now run the command below and make sure you get the right files when you specify the version header:

$ curl --silent -H "version: yogo" http://$EXTERNAL_IP/ | tr '"' '\n' | grep main

Note: To add different header values and test the results directly in the browser, you can use this extension for Chrome (or this one for Firefox - approx. transl.).
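
To double-check the consistency, you can repeat the request a few times with the same header value (yogo above is just an arbitrary example); every iteration should print the same pair of main.* files, because the hash of the header always points to the same version:

$ for i in 1 2 3; do curl --silent -H "version: yogo" http://$EXTERNAL_IP/ | tr '"' '\n' | grep main; done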

In general, DestinationRules offer more load-balancing capabilities - see the official documentation for details.
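
For instance, a minimal sketch (not used in this article) of the same DestinationRule switched from consistent hashing to a simple least-connections policy would look like this:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sa-frontend
spec:
  host: sa-frontend
  trafficPolicy:
    loadBalancer:
      simple: LEAST_CONN   # other simple policies: ROUND_ROBIN, RANDOM, PASSTHROUGH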

Before exploring VirtualServices further, let's remove the "green version" of the application and the corresponding traffic rule by executing the following commands:

$ kubectl delete -f resource-manifests/kube/ab-testing/sa-frontend-green-deployment.yaml
deployment.extensions "sa-frontend-green" deleted
$ kubectl delete -f resource-manifests/istio/ab-testing/destinationrule-sa-frontend.yaml
destinationrule.networking.istio.io "sa-frontend" deleted

Mirroring: Virtual Services in Practice

Shadowing, or mirroring, is used when we want to test a change in production without affecting end users: we duplicate ("mirror") requests to a second instance where the change has been made and look at the consequences. Responses to the mirrored requests are discarded, so users never see them. To put it simply, it's for when your colleague picks the most critical issue and opens a pull request in the form of such a huge ball of mud that no one can actually review it.

To test this scenario in action, let's create a second, buggy instance of sa-logic by running the following command:

$ kubectl apply -f resource-manifests/kube/shadowing/sa-logic-service-buggy.yaml
deployment.extensions/sa-logic-buggy created

Now let's run a command to make sure that all pods with app=sa-logic also carry labels with the corresponding versions:

$ kubectl get pods -l app=sa-logic --show-labels
NAME                              READY   LABELS
sa-logic-568498cb4d-2sjwj         2/2     app=sa-logic,version=v1
sa-logic-568498cb4d-p4f8c         2/2     app=sa-logic,version=v1
sa-logic-buggy-76dff55847-2fl66   2/2     app=sa-logic,version=v2
sa-logic-buggy-76dff55847-kx8zz   2/2     app=sa-logic,version=v2

The sa-logic service targets pods labeled app=sa-logic, so all requests are distributed across all of its instances, both v1 and v2 (the Service selector, sketched below, matches only the app label):
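
A rough sketch of that selector (the port numbers here are illustrative; see the actual manifest from the first part of the series):

apiVersion: v1
kind: Service
metadata:
  name: sa-logic
spec:
  selector:
    app: sa-logic     # no version label, so both v1 and v2 pods receive traffic
  ports:
  - port: 80          # illustrative values
    protocol: TCP
    targetPort: 8080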

…but we want requests to be routed to the v1 instances and mirrored to the v2 instances.

We will achieve this with a VirtualService in combination with a DestinationRule, where the DestinationRule defines the subsets and the VirtualService routes requests to a particular subset.

Defining subsets in Destination Rules

The subsets are defined by the following configuration (sa-logic-subsets-destinationrule.yaml):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sa-logic
spec:
  host: sa-logic    # 1
  subsets:
  - name: v1        # 2
    labels:
      version: v1   # 3
  - name: v2
    labels:
      version: v2

  1. The host defines that this rule applies only when the route points to the sa-logic service;
  2. The subset names (name) are used when routing to subset instances;
  3. The labels define the key-value pairs that instances must match to become part of the subset.

Apply the configuration with the following command:

$ kubectl apply -f resource-manifests/istio/shadowing/sa-logic-subsets-destinationrule.yaml
destinationrule.networking.istio.io/sa-logic created

Now that the subsets are defined, we can move on and configure a VirtualService that applies the following rules to requests to sa-logic:

  1. Route them to the subset v1;
  2. Mirror them to the subset v2.

The following manifest achieves this (sa-logic-subsets-shadowing-vs.yaml):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sa-logic
spec:
  hosts:
    - sa-logic          # the rule applies to requests addressed to the sa-logic host
  http:
  - route:
    - destination:
        host: sa-logic  # route all traffic...
        subset: v1      # ...to the v1 subset
    mirror:             # and mirror a copy of each request...
      host: sa-logic    # ...to the v2 subset of the same host
      subset: v2

No explanation needed here, so let's just see it in action:

$ kubectl apply -f resource-manifests/istio/shadowing/sa-logic-subsets-shadowing-vs.yaml
virtualservice.networking.istio.io/sa-logic created

Let's add some load by running the following command:

$ while true; do curl -v http://$EXTERNAL_IP/sentiment \
    -H "Content-type: application/json" \
    -d '{"sentence": "I love yogobella"}'; \
    sleep .8; done
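
Besides Grafana, a quick way to see that mirrored traffic really reaches the buggy pods is to tail their logs (the container name sa-logic is an assumption here; adjust it to the actual pod spec):

$ kubectl logs deployment/sa-logic-buggy -c sa-logic -f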

Let's look at the results in Grafana, where we can see that the buggy version fails for ~60% of requests, but none of these failures affect end users, since they are answered by the working service.

Success of responses of different versions of the sa-logic service

This is where we first saw a VirtualService being applied to our services' Envoys: when sa-web-app makes a request to sa-logic, the request passes through the sidecar Envoy, which - via the VirtualService - is configured to route the request to the v1 subset and mirror it to the v2 subset of the sa-logic service.

I know: by now you've probably concluded that VirtualServices are simple. In the next section we'll expand on that opinion: they are also truly great.

Canary Rollouts

Canary Deployment is the process of rolling out a new version of an application to a small number of users. It is used to make sure there are no problems with the release, and only after that, being confident in its quality, roll it out to a wider audience.

To demonstrate canary rollouts, we will continue working with the buggy subset of sa-logic.

Let's not waste time on trifles: we'll immediately send 20% of users to the buggy version (it will represent our canary rollout) and the remaining 80% to the normal service. To do this, apply the following VirtualService (sa-logic-subsets-canary-vs.yaml):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sa-logic
spec:
  hosts:
    - sa-logic    
  http:
  - route: 
    - destination: 
        host: sa-logic
        subset: v1
      weight: 80         # 1
    - destination: 
        host: sa-logic
        subset: v2
      weight: 20         # 1

1 - the weight specifies the percentage of requests that will be directed to a destination or to a subset of the destination.

Update the previous VirtualService configuration for sa-logic with the following command:

$ kubectl apply -f resource-manifests/istio/canary/sa-logic-subsets-canary-vs.yaml
virtualservice.networking.istio.io/sa-logic configured

... and we will immediately see that some of the requests lead to failures:

$ while true; do \
   curl -i http://$EXTERNAL_IP/sentiment \
   -H "Content-type: application/json" \
   -d '{"sentence": "I love yogobella"}' \
   --silent -w "Time: %{time_total}s \t Status: %{http_code}\n" \
   -o /dev/null; sleep .1; done
Time: 0.153075s Status: 200
Time: 0.137581s Status: 200
Time: 0.139345s Status: 200
Time: 30.291806s Status: 500

VirtualServices enable canary rollouts: in this case we've narrowed the potential impact of the problems down to 20% of the user base. Wonderful! Now, in every case when we are not sure about our code (in other words, always...), we can use mirroring and canary rollouts.

Timeouts and retries

But problems are not always caused by bugs in the code. In the list of "the 8 fallacies of distributed computing", first place belongs to the mistaken belief that "the network is reliable". In reality the network is not reliable, and that is why we need timeouts and retries.

For demonstration, we will continue to use the same problem version sa-logic (buggy), and the unreliability of the network will be simulated by random failures.

Let our buggy service have a 1/3 chance of responding too slowly, a 1/3 chance of ending with an Internal Server Error, and a 1/3 chance of responding successfully.

To mitigate the impact of these issues and improve the lives of our users, we can:

  1. time out if the service takes longer than 8 seconds to respond;
  2. retry if a request fails.

For implementation, we use the following resource definition (sa-logic-retries-timeouts-vs.yaml):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sa-logic
spec:
  hosts:
    - sa-logic
  http:
  - route: 
    - destination: 
        host: sa-logic
        subset: v1
      weight: 50
    - destination: 
        host: sa-logic
        subset: v2
      weight: 50
    timeout: 8s           # 1
    retries:
      attempts: 3         # 2
      perTryTimeout: 3s # 3

  1. The timeout for the request is set to 8 seconds;
  2. Requests are retried 3 times;
  3. And each attempt is considered unsuccessful if the response time exceeds 3 seconds.

This is an optimization: the user will not have to wait longer than 8 seconds, and in case of failures we make up to three new attempts to get a response, increasing the chance of a successful answer (with the 50/50 split above, roughly a third of attempts fail, so the probability that every attempt fails drops sharply with each retry).

Apply the updated configuration with the following command:

$ kubectl apply -f resource-manifests/istio/retries/sa-logic-retries-timeouts-vs.yaml
virtualservice.networking.istio.io/sa-logic configured

And check on the Grafana charts that the number of successful responses has increased:

Improvements in success statistics after adding timeouts and retries

Before proceeding to the next section (more precisely, to the next part of the article, since there will be no more practical experiments in this one - approx. transl.), delete sa-logic-buggy and the VirtualService by running the following commands:

$ kubectl delete deployment sa-logic-buggy
deployment.extensions "sa-logic-buggy" deleted
$ kubectl delete virtualservice sa-logic
virtualservice.networking.istio.io "sa-logic" deleted

Circuit Breaker and Bulkhead Patterns

These are two important patterns in microservice architecture that enable self-healing services.

Circuit Breaker ("circuit breaker") is used to stop requests to an instance of a service that is considered unhealthy and to let it recover, while client requests are redirected to healthy instances of that service (which increases the success rate). (Transl. note: A more detailed description of the pattern can be found, for example, here.)

Bulkhead ("partition") isolates failures in services so they do not take down the entire system. For example: service B is broken, and another service (a client of service B) keeps making requests to it, exhausting its own thread pool so that it cannot serve other requests (even ones unrelated to service B). (Transl. note: A more detailed description of the pattern can be found, for example, here.)

I will omit the implementation details of these patterns, because they are easy to find in the official documentation, and because I really want to show authentication and authorization, which will be the topic of the next part of the article.
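
For orientation only - the details really are in the documentation - both patterns are configured through the trafficPolicy of a DestinationRule; a rough sketch (illustrative values, not part of the istio-mastery manifests) might look like this:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sa-logic-resilience-example   # hypothetical name, for illustration only
spec:
  host: sa-logic
  trafficPolicy:
    connectionPool:                 # bulkhead: cap how much load one client can generate
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 10
        maxRequestsPerConnection: 10
    outlierDetection:               # circuit breaker: eject instances that keep failing
      consecutiveErrors: 5
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 50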

Source: habr.com
