Saving cloud costs with Kubernetes on AWS

This translation was prepared ahead of the start of the course "Infrastructure platform based on Kubernetes".


How do you save cloud costs when running Kubernetes? There is no single right solution, but this article describes several tools that can help you manage your resources more effectively and reduce your cloud bill.

I wrote this article with Kubernetes on AWS in mind, but it applies in (almost) exactly the same way to other cloud providers. I assume your cluster(s) already have autoscaling configured (cluster-autoscaler). Removing resources and scaling down your deployments will only save you money if it also shrinks your fleet of worker nodes (e.g. EC2 instances).

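This article covers:

  • Cleaning up unused resources
  • Scaling down during non-working hours
  • Using horizontal autoscaling
  • Reducing resource slack
  • Using EC2 Spot instances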

Clean up unused resources

Working in a fast-paced environment is great. We want tech organizations to accelerate. Faster software delivery also means more PR deployments, preview environments, prototypes, and spike solutions. All of it gets deployed on Kubernetes. Who has time to clean up test deployments by hand? It is easy to forget to delete last week's experiment. The cloud bill ends up growing because of something we forgot to shut down:


(Henning Jacobs:
So true:
Corey Quinn:
Myth: Your AWS bill is a function of the number of users you have.
Fact: Your AWS bill is a function of the number of engineers you have.

Ivan Kurnosov (in reply):
The real fact: Your AWS bill is a function of the number of things you forgot to turn off or delete.)

Kubernetes Janitor (kube-janitor) helps clean up your cluster. The janitor configuration is flexible enough for both global and local use:

  • Cluster-wide rules can define a maximum time-to-live (TTL) for PR/test deployments.
  • Individual resources can be annotated with janitor/ttl, e.g. to automatically remove a spike/prototype after 7 days.

General rules are defined in a YAML file. Its path is passed to kube-janitor via the --rules-file parameter. Here is an example rule that removes all namespaces with -pr- in their name after two days:

- id: cleanup-resources-from-pull-requests
  resources:
    - namespaces
  jmespath: "contains(metadata.name, '-pr-')"
  ttl: 2d
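To make the semantics of such a rule concrete, here is a minimal Python sketch of how a "name contains -pr- and is older than the TTL" check could be evaluated. It is not kube-janitor's implementation: the helper names `parse_ttl` and `is_expired` are hypothetical, the JMESPath expression is replaced by a plain substring test, and the set of unit suffixes is an assumption (check the kube-janitor README for the supported ones).

```python
import re
from datetime import datetime, timedelta, timezone

# Unit suffixes assumed for this sketch; the real list is in the kube-janitor README.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_ttl(ttl: str) -> timedelta:
    """Turn a string such as '2d' or '30m' into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([smhdw])", ttl)
    if match is None:
        raise ValueError(f"unsupported TTL: {ttl!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def is_expired(namespace: dict, ttl: str, now: datetime) -> bool:
    """Mimic the rule above: the name contains '-pr-' and the TTL has elapsed."""
    name = namespace["metadata"]["name"]
    created = datetime.fromisoformat(
        namespace["metadata"]["creationTimestamp"].replace("Z", "+00:00")
    )
    return "-pr-" in name and now - created > parse_ttl(ttl)


now = datetime(2020, 5, 10, 12, 0, tzinfo=timezone.utc)
old_pr = {"metadata": {"name": "myapp-pr-42", "creationTimestamp": "2020-05-07T12:00:00Z"}}
fresh_pr = {"metadata": {"name": "myapp-pr-43", "creationTimestamp": "2020-05-10T08:00:00Z"}}
print(is_expired(old_pr, "2d", now))    # True: three days old
print(is_expired(fresh_pr, "2d", now))  # False: four hours old
```

The namespace names and timestamps are made up for illustration.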

The next example enforces the application label on Deployment and StatefulSet pods for all Deployments/StatefulSets created in 2020 or later, while still allowing tests to run without this label for one week:

- id: require-application-label
  # delete deployments and statefulsets without the "application" label
  resources:
    - deployments
    - statefulsets
  # see http://jmespath.org/specification.html
  jmespath: "!(spec.template.metadata.labels.application) && metadata.creationTimestamp > '2020-01-01'"
  ttl: 7d

Running a time-limited demo for 30 minutes on a cluster where kube-janitor is running:

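# kubectl run creates a bare Pod since kubectl 1.18, so create the Deployment explicitly
kubectl create deployment nginx-demo --image=nginx
kubectl annotate deploy nginx-demo janitor/ttl=30m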

Another source of growing costs is persistent volumes (AWS EBS). Deleting a Kubernetes StatefulSet does not delete its persistent volumes (PVC, PersistentVolumeClaim). Unused EBS volumes can easily cost hundreds of dollars per month. Kubernetes Janitor has a feature for cleaning up unused PVCs. For example, this rule removes all PVCs that are not mounted by a pod and not referenced by a StatefulSet or CronJob:

# delete all PVCs that are not mounted and not referenced by StatefulSets
- id: remove-unused-pvcs
  resources:
  - persistentvolumeclaims
  jmespath: "_context.pvc_is_not_mounted && _context.pvc_is_not_referenced"
  ttl: 24h

Kubernetes Janitor can help you keep your cluster clean and prevent cloud costs from slowly piling up. For deployment and configuration instructions, follow the kube-janitor README.

Scale down during non-working hours

Test and staging systems usually need to run only during business hours. Some production applications, such as back office/admin tools, also need only limited availability and can be shut down overnight.

Kubernetes Downscaler (kube-downscaler) lets users and operators scale down systems during non-working hours. Deployments and StatefulSets can be scaled to zero replicas. CronJobs can be suspended. Kubernetes Downscaler can be configured for the whole cluster, for one or more namespaces, or for individual resources. You can set either a "downtime" or, conversely, an "uptime". For example, to scale down as much as possible at night and on weekends:

image: hjacobs/kube-downscaler:20.4.3
args:
  - --interval=30
  # do not scale down infrastructure components
  - --exclude-namespaces=kube-system,infra
  # do not scale down kube-downscaler itself, and keep the Postgres Operator so that excluded databases can still be managed
  - --exclude-deployments=kube-downscaler,postgres-operator
  - --default-uptime=Mon-Fri 08:00-20:00 Europe/Berlin
  - --include-resources=deployments,statefulsets,stacks,cronjobs
  - --deployment-time-annotation=deployment-time
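As an illustration of what a time specification like "Mon-Fri 08:00-20:00 Europe/Berlin" means, the following Python sketch checks whether a given moment falls inside the uptime window. It is a simplified assumption of the semantics, not kube-downscaler's code: the function name `is_uptime` is hypothetical, only a single weekday range and a single time range are handled, and the caller is expected to pass a datetime already expressed in the named time zone.

```python
import re
from datetime import datetime, time

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def is_uptime(spec: str, now_local: datetime) -> bool:
    """Check a 'Mon-Fri 08:00-20:00 Europe/Berlin' style window (hypothetical helper).

    now_local must already be in the time zone named in spec; the conversion
    is left out to keep the sketch free of time zone database dependencies.
    """
    match = re.fullmatch(
        r"(\w{3})-(\w{3}) (\d{2}):(\d{2})-(\d{2}):(\d{2}) (\S+)", spec
    )
    if match is None:
        raise ValueError(f"unsupported time spec: {spec!r}")
    first_day, last_day = DAYS.index(match[1]), DAYS.index(match[2])
    start = time(int(match[3]), int(match[4]))
    end = time(int(match[5]), int(match[6]))
    day_ok = first_day <= now_local.weekday() <= last_day
    time_ok = start <= now_local.time() < end
    return day_ok and time_ok


spec = "Mon-Fri 08:00-20:00 Europe/Berlin"
print(is_uptime(spec, datetime(2020, 5, 6, 10, 0)))  # Wednesday morning: True
print(is_uptime(spec, datetime(2020, 5, 6, 22, 0)))  # Wednesday night: False
print(is_uptime(spec, datetime(2020, 5, 9, 10, 0)))  # Saturday morning: False
```

Outside the window, workloads that are not excluded would be candidates for scaling down to zero replicas.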

Here is a graph of the cluster's worker nodes scaling down over the weekend:

[Figure: number of cluster worker nodes scaling down over the weekend]

Scaling down from ~13 to 4 worker nodes certainly makes a noticeable difference in the AWS bill.

But what if I need to work during the cluster's "downtime"? Specific deployments can be permanently excluded from scaling by adding the downscaler/exclude: true annotation. Deployments can be excluded temporarily with the downscaler/exclude-until annotation and an absolute timestamp in the format YYYY-MM-DD HH:MM (UTC). If necessary, the whole cluster can be scaled back up by deploying a pod with the downscaler/force-uptime annotation, e.g. by launching a dummy nginx deployment:

# kubectl run creates a bare Pod since kubectl 1.18, so create the Deployment explicitly
kubectl create deployment scale-up --image=nginx
kubectl annotate deploy scale-up janitor/ttl=1h # delete the deployment after one hour
kubectl annotate pod $(kubectl get pod -l app=scale-up -o jsonpath="{.items[0].metadata.name}") downscaler/force-uptime=true
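For the temporary exclusion, the timestamp has to be written in UTC in the YYYY-MM-DD HH:MM format. A small sketch of how such a value could be computed; the helper name `exclude_until` is hypothetical and the four-hour window is just an example:

```python
from datetime import datetime, timedelta, timezone


def exclude_until(now_utc: datetime, hours: int) -> str:
    """Format a UTC timestamp `hours` from now as 'YYYY-MM-DD HH:MM' (hypothetical helper)."""
    return (now_utc + timedelta(hours=hours)).strftime("%Y-%m-%d %H:%M")


now = datetime(2020, 5, 8, 18, 30, tzinfo=timezone.utc)
print(exclude_until(now, 4))  # 2020-05-08 22:30
```

The resulting string would then be used as the value of the downscaler/exclude-until annotation on the deployment in question.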

See the kube-downscaler README if you are interested in deployment instructions and additional options.

Use horizontal autoscaling

Many applications/services deal with a dynamic load pattern: sometimes their pods sit idle, and sometimes they run at full capacity. Running a permanent fleet of pods sized for the maximum peak load is not economical. Kubernetes supports horizontal autoscaling through the HorizontalPodAutoscaler (HPA) resource. CPU usage is often a good metric to scale on:

apiVersion: autoscaling/v2  # autoscaling/v2beta2 was removed in Kubernetes 1.26
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        averageUtilization: 100
        type: Utilization
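The scaling decision behind such a manifest follows the formula given in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), bounded by minReplicas and maxReplicas. The sketch below applies it to the values from the manifest above (target 100%, 3 to 10 replicas). The function name is hypothetical, the utilization samples are made up, and refinements such as the tolerance band and stabilization window are deliberately left out.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Core HPA formula, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# Utilization is expressed in percent of the pods' CPU requests.
print(desired_replicas(4, 180, 100, 3, 10))  # 8: scale up under load
print(desired_replicas(4, 30, 100, 3, 10))   # 3: scale down, but not below minReplicas
print(desired_replicas(6, 250, 100, 3, 10))  # 10: capped at maxReplicas
```

Because utilization is relative to the requested CPU, the resource requests discussed later in the article directly influence when the autoscaler reacts.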

Zalando built a component that makes it easy to plug in custom metrics for scaling: Kube Metrics Adapter (kube-metrics-adapter) is a general-purpose metrics adapter for Kubernetes that can collect and serve custom and external metrics for horizontal pod autoscaling. It supports scaling based on Prometheus metrics, SQS queues, and other sources. For example, to scale your deployment on a custom metric that the application itself exposes as JSON at /metrics:

apiVersion: autoscaling/v2  # autoscaling/v2beta2 was removed in Kubernetes 1.26
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # metric-config.<metricType>.<metricName>.<collectorName>/<configKey>
    metric-config.pods.requests-per-second.json-path/json-key: "$.http_server.rps"
    metric-config.pods.requests-per-second.json-path/path: /metrics
    metric-config.pods.requests-per-second.json-path/port: "9090"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests-per-second
      target:
        averageValue: 1k
        type: AverageValue

Configuring horizontal autoscaling with HPA should be one of the default steps for improving the efficiency of stateless services. Spotify has a presentation with their experience and recommendations for HPA: scale your deployments, not your wallet.

Reduce resource slack

Kubernetes workloads declare their CPU/memory needs through "resource requests". CPU resources are measured in virtual cores or, more commonly, in "millicores", e.g. 500m means 50% of a vCPU. Memory resources are measured in bytes, and the usual suffixes can be used, e.g. 500Mi means 500 mebibytes. Resource requests "lock" capacity on worker nodes, meaning that a pod with a 1000m CPU request on a node with 4 vCPUs leaves only 3 vCPUs for other pods. [1]
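The units above can be checked with a few lines of Python. This is a minimal sketch: `parse_cpu` and `parse_memory` are hypothetical helpers that handle only the millicore suffix and the binary Ki/Mi/Gi suffixes, not the full Kubernetes quantity syntax.

```python
def parse_cpu(quantity: str) -> int:
    """Return millicores, e.g. '500m' -> 500 and '4' -> 4000."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


# Only the binary suffixes are covered in this sketch.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory(quantity: str) -> int:
    """Return bytes, e.g. '500Mi' -> 524288000."""
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)


node_cpu = parse_cpu("4")         # node with 4 vCPUs
pod_request = parse_cpu("1000m")  # pod requesting one full vCPU
print(node_cpu - pod_request)     # 3000 millicores left for other pods
print(parse_memory("500Mi"))      # 524288000 bytes
```

As footnote [1] points out, the capacity that is really schedulable is somewhat lower than the raw node size.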

Slack (excess reserve) is the difference between the requested resources and the actual usage. For example, a pod that requests 2 GiB of memory but uses only 200 MiB has ~1.8 GiB of memory "slack". Slack costs money. As a rough estimate, 1 GiB of memory slack costs ~$10 per month. [2]
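The arithmetic behind this example, as a short sketch. The price of $10 per GiB per month is the rough estimate from the text; the function name `memory_slack_gib` is hypothetical.

```python
GIB = 2**30
MIB = 2**20

# Rough estimate from the text: 1 GiB of memory slack costs about $10 per month.
COST_PER_GIB_MONTH = 10.0


def memory_slack_gib(requested_bytes: int, used_bytes: int) -> float:
    """Slack is the requested amount minus the actual usage, here in GiB."""
    return (requested_bytes - used_bytes) / GIB


slack = memory_slack_gib(2 * GIB, 200 * MIB)
print(f"{slack:.2f} GiB slack")                       # 1.80 GiB slack
print(f"${slack * COST_PER_GIB_MONTH:.2f} per month")  # $18.05 per month
```

Multiplied across dozens of replicas and services, a seemingly small per-pod slack quickly adds up to a visible share of the bill.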

Kubernetes Resource Report (kube-resource-report) displays the slack and can help you determine the savings potential:

[Figure: kube-resource-report view of slack and potential savings]

Kubernetes Resource Report shows the slack aggregated by application and team. This lets you find places where resource requests can be lowered. The generated HTML report only provides a snapshot of resource usage. You should look at CPU/memory usage over time to determine adequate resource requests. Here is a Grafana chart for a "typical" CPU-heavy service: all pods use significantly less than the 3 requested CPU cores:

[Figure: Grafana chart of CPU usage versus the requested 3 CPU cores]

Reducing the CPU request from 3000m to ~400m frees up resources for other workloads and allows the cluster to be smaller.

"The average CPU utilization of EC2 instances often hovers in the single-digit percentage range," writes Corey Quinn. While right-sizing EC2 instances can be a bad decision, changing a few Kubernetes resource requests in a YAML file is easy and can yield big savings.

But do we really want humans to change values in YAML files? No, machines can do it much better! The Kubernetes Vertical Pod Autoscaler (VPA) does exactly that: it adapts resource requests and limits to the workload. Here is an example graph of Prometheus CPU requests (thin blue line) adapted by VPA over time:

[Figure: Prometheus CPU requests adapted by VPA over time]

Zalando uses VPA in all of its clusters for infrastructure components. Non-critical applications can also use VPA.

Goldilocks from Fairwinds is a tool that creates a VPA for each deployment in a namespace and then shows the VPA recommendation on its dashboard. It can help developers set the right CPU/memory requests for their applications:

[Figure: Goldilocks dashboard with VPA recommendations]

I wrote a short blog post about VPA in 2019, and the CNCF End User Community recently discussed VPA as well.

Use EC2 Spot instances

Last but not least, AWS EC2 costs can be reduced by using Spot instances as Kubernetes worker nodes [3]. Spot instances are available at a discount of up to 90% compared to On-Demand prices. Running Kubernetes on EC2 Spot is a good combination: you need to specify several different instance types for higher availability, which means you may get a bigger node for the same or a lower price, and the extra capacity can be used by containerized Kubernetes workloads.

How do you run Kubernetes on EC2 Spot? There are several options: use a third-party service such as SpotInst (now called "Spot", don't ask me why), or simply add a Spot AutoScalingGroup (ASG) to your cluster. For example, here is a CloudFormation snippet for a "capacity-optimized" Spot ASG with multiple instance types:

MySpotAutoScalingGroup:
 Properties:
   HealthCheckGracePeriod: 300
   HealthCheckType: EC2
   MixedInstancesPolicy:
     InstancesDistribution:
       OnDemandPercentageAboveBaseCapacity: 0
       SpotAllocationStrategy: capacity-optimized
     LaunchTemplate:
       LaunchTemplateSpecification:
         LaunchTemplateId: !Ref LaunchTemplate
         Version: !GetAtt LaunchTemplate.LatestVersionNumber
       Overrides:
         - InstanceType: "m4.2xlarge"
         - InstanceType: "m4.4xlarge"
         - InstanceType: "m5.2xlarge"
         - InstanceType: "m5.4xlarge"
         - InstanceType: "r4.2xlarge"
         - InstanceType: "r4.4xlarge"
   LaunchTemplate:
     LaunchTemplateId: !Ref LaunchTemplate
     Version: !GetAtt LaunchTemplate.LatestVersionNumber
   MinSize: 0
   MaxSize: 100
   Tags:
   - Key: k8s.io/cluster-autoscaler/node-template/label/aws.amazon.com/spot
     PropagateAtLaunch: true
     Value: "true"

Some notes on using Spot with Kubernetes:

  • You need to handle Spot terminations, e.g. by draining the node when the instance is interrupted
  • Zalando uses a fork of the official cluster autoscaler with node pool priorities
  • Spot nodes can be set up so that workloads have to explicitly "opt in" to run on Spot

Summary

I hope you find some of the tools presented here useful for reducing your cloud bill. Most of this article's content is also covered in my talk at DevOps Gathering 2019, available on YouTube and as slides.

What are your best practices for saving cloud costs on Kubernetes? Please let me know on Twitter (@try_except_).

[1] In fact, fewer than 3 vCPUs will remain usable, because the node's capacity is reduced by reserved system resources. Kubernetes distinguishes between the physical node capacity and "allocatable" resources (Node Allocatable).

[2] Example calculation: one m5.large instance with 8 GiB of memory costs ~$84 per month (eu-central-1, On-Demand), i.e. blocking 1/8 of the node costs roughly $10 per month.

[3] There are many more ways to reduce your EC2 bill, such as Reserved Instances, Savings Plans, and so on. I won't cover those topics here, but you should definitely look into them!

Learn more about the course.

source: www.habr.com
