Prometheus: HTTP Monitoring with Blackbox Exporter

Hello everyone. In May, OTUS launches a workshop on monitoring and logging of both infrastructure and applications using Zabbix, Prometheus, Grafana, and ELK. As usual, we are sharing useful material on the topic ahead of it.

Blackbox exporter for Prometheus lets you monitor external services over HTTP, HTTPS, DNS, TCP, and ICMP. In this article I will show how to set up HTTP/HTTPS monitoring with Blackbox exporter. We will run Blackbox exporter in Kubernetes.

Environment

We will need the following:

  • Kubernetes
  • Prometheus Operator

Blackbox exporter configuration

We configure Blackbox through a ConfigMap that defines the http module for monitoring web services.

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-blackbox-exporter
  labels:
    app: prometheus-blackbox-exporter
data:
  blackbox.yaml: |
    modules:
      http_2xx:
        http:
          no_follow_redirects: false
          preferred_ip_protocol: ip4
          valid_http_versions:
          - HTTP/1.1
          - HTTP/2.0  # the exporter compares against Go's protocol string, which is "HTTP/2.0"
          valid_status_codes: []
        prober: http
        timeout: 5s

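The http_2xx module checks that a web service returns a 2xx HTTP status code. An empty valid_status_codes list means the default, 2xx, is accepted. The Blackbox exporter configuration is described in more detail in the documentation.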

Deploying Blackbox exporter to a Kubernetes cluster

Define a Deployment and a Service to deploy it in Kubernetes.

---
kind: Service
apiVersion: v1
metadata:
  name: prometheus-blackbox-exporter
  labels:
    app: prometheus-blackbox-exporter
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 9115
      protocol: TCP
  selector:
    app: prometheus-blackbox-exporter

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-blackbox-exporter
  labels:
    app: prometheus-blackbox-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-blackbox-exporter
  template:
    metadata:
      labels:
        app: prometheus-blackbox-exporter
    spec:
      restartPolicy: Always
      containers:
        - name: blackbox-exporter
          image: "prom/blackbox-exporter:v0.15.1"
          imagePullPolicy: IfNotPresent
          securityContext:
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          args:
            - "--config.file=/config/blackbox.yaml"
          resources:
            {}
          ports:
            - containerPort: 9115
              name: http
          livenessProbe:
            httpGet:
              path: /health
              port: http
          readinessProbe:
            httpGet:
              path: /health
              port: http
          volumeMounts:
            - mountPath: /config
              name: config
        - name: configmap-reload
          image: "jimmidyson/configmap-reload:v0.2.2"
          imagePullPolicy: "IfNotPresent"
          securityContext:
            runAsNonRoot: true
            runAsUser: 65534
          args:
            - --volume-dir=/etc/config
            - --webhook-url=http://localhost:9115/-/reload
          resources:
            {}
          volumeMounts:
            - mountPath: /etc/config
              name: config
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: prometheus-blackbox-exporter

Blackbox exporter can be deployed with the following command. The monitoring namespace is the one used by Prometheus Operator.

kubectl --namespace=monitoring apply -f blackbox-exporter.yaml

Check that all resources are running with the following command:

kubectl --namespace=monitoring get all --selector=app=prometheus-blackbox-exporter

Checking Blackbox

You can reach the Blackbox exporter web interface with port-forward:

kubectl --namespace=monitoring port-forward svc/prometheus-blackbox-exporter 9115:9115

Open the Blackbox exporter web interface in a browser at localhost:9115.

If you go to http://localhost:9115/probe?module=http_2xx&target=https://www.google.com, you will see the result of probing the specified URL (https://www.google.com).

A probe_success value of 1 means the check succeeded. A value of 0 means it failed.
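
The same check can be scripted. The sketch below is only an illustration and not part of the original setup: the probe_ok helper and the sample metrics text are hypothetical, and only the probe_success metric name comes from the exporter output.

```shell
#!/bin/sh
# probe_ok: read Prometheus text-format metrics on stdin and
# exit 0 only if a "probe_success 1" sample is present.
probe_ok() {
  awk '$1 == "probe_success" { found = 1; ok = ($2 == 1) }
       END { if (found && ok) exit 0; exit 1 }'
}

# Sample exporter output (hypothetical), used here instead of a live request.
sample='# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1'

if printf '%s\n' "$sample" | probe_ok; then
  echo "probe succeeded"
else
  echo "probe failed"
fi
```

With the port-forward from above running, the helper could be fed from a real probe instead: curl -s 'http://localhost:9115/probe?module=http_2xx&target=https://www.google.com' | probe_ok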

Configuring Prometheus

After deploying Blackbox exporter, add the Prometheus scrape configuration to prometheus-additional.yaml.

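- job_name: 'blackbox-http'  # hypothetical name; must differ from the 'kube-api-blackbox' job added later
  # scrape_interval is left at the global default: the alert rules use 1m windows and "for: 5m"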
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
   - targets:
      - https://www.google.com
      - http://www.example.com
      - https://prometheus.io
  relabel_configs:
   - source_labels: [__address__]
     target_label: __param_target
   - source_labels: [__param_target]
     target_label: instance
   - target_label: __address__
     replacement: prometheus-blackbox-exporter:9115 # The blackbox exporter.
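
The relabeling rules above move each target URL into the target query parameter and redirect the scrape to the exporter. As a rough sketch, the request Prometheus ends up making for each target looks like the output below. The probe_url helper is hypothetical; the addresses and module name are taken from the configuration above.

```shell
#!/bin/sh
# Address of the exporter Service, as set by the last relabel rule.
EXPORTER="prometheus-blackbox-exporter:9115"
MODULE="http_2xx"

# probe_url: build the scrape URL that results from relabeling one target.
# Prometheus URL-encodes the target value; it is left unencoded here for readability.
probe_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$EXPORTER" "$MODULE" "$1"
}

for t in https://www.google.com http://www.example.com https://prometheus.io; do
  probe_url "$t"
done
```

The instance label keeps the original target URL, so metrics and alerts refer to the probed site and not to the exporter.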

Generate a Secret with the following command.

# Strip line wraps from the base64 output so the value stays on one YAML line.
PROMETHEUS_ADD_CONFIG=$(base64 < prometheus-additional.yaml | tr -d '\n')
cat << EOF | kubectl --namespace=monitoring apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
type: Opaque
data:
  prometheus-additional.yaml: $PROMETHEUS_ADD_CONFIG
EOF
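
GNU base64 wraps its output at 76 characters by default, and a wrapped value would break the single-line data field in the Secret manifest. The sketch below shows one way to encode without wraps and to verify the round trip. It works on a throwaway file with hypothetical content, and it assumes a base64 that supports the -d flag.

```shell
#!/bin/sh
# Work on a throwaway file so the example does not depend on a real config.
tmp=$(mktemp)
printf '%s\n' "- job_name: 'example'" "  metrics_path: /probe" > "$tmp"

# Encode and remove any line wraps so the value fits on one YAML line.
encoded=$(base64 < "$tmp" | tr -d '\n')

# Decode again and compare with the original file.
if printf '%s' "$encoded" | base64 -d | cmp -s - "$tmp"; then
  echo "round trip ok"
fi
rm -f "$tmp"
```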

Reference the additional-scrape-configs Secret in Prometheus Operator through additionalScrapeConfigs.

kubectl --namespace=monitoring edit prometheuses k8s
...
spec:
  additionalScrapeConfigs:
    key: prometheus-additional.yaml
    name: additional-scrape-configs

Open the Prometheus web interface and check the metrics and targets.

kubectl --namespace=monitoring port-forward svc/prometheus-k8s 9090:9090

We can see the Blackbox metrics and targets.

Adding alerting rules

To receive alerts from Blackbox exporter, let's add rules to Prometheus Operator.

kubectl --namespace=monitoring edit prometheusrules prometheus-k8s-rules
...
  - name: blackbox-exporter
    rules:
    - alert: ProbeFailed
      expr: probe_success == 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Probe failed (instance {{ $labels.instance }})"
        description: "Probe failed\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
    - alert: SlowProbe
      expr: avg_over_time(probe_duration_seconds[1m]) > 1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Slow probe (instance {{ $labels.instance }})"
        description: "Blackbox probe took more than 1s to complete\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
    - alert: HttpStatusCode
      expr: probe_http_status_code <= 199 OR probe_http_status_code >= 400
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "HTTP Status Code (instance {{ $labels.instance }})"
        description: "HTTP status code is not 200-399\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
    - alert: SslCertificateWillExpireSoon
      expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "SSL certificate will expire soon (instance {{ $labels.instance }})"
        description: "SSL certificate expires in 30 days\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
    - alert: SslCertificateHasExpired
      expr: probe_ssl_earliest_cert_expiry - time()  <= 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "SSL certificate has expired (instance {{ $labels.instance }})"
        description: "SSL certificate has expired already\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
    - alert: HttpSlowRequests
      expr: avg_over_time(probe_http_duration_seconds[1m]) > 1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "HTTP slow requests (instance {{ $labels.instance }})"
        description: "HTTP request took more than 1s\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
    - alert: SlowPing
      expr: avg_over_time(probe_icmp_duration_seconds[1m]) > 1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Slow ping (instance {{ $labels.instance }})"
        description: "Blackbox ping took more than 1s\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
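
The SSL rules above subtract the current time from the certificate's expiry timestamp, both in seconds, and compare the result with a threshold. The arithmetic can be sketched as follows; the days_left helper and the timestamps are hypothetical.

```shell
#!/bin/sh
# days_left: whole days between "now" and a certificate expiry,
# both given as Unix timestamps (expiry first).
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# 30 days in seconds, as in the SslCertificateWillExpireSoon expression.
threshold=$((86400 * 30))
echo "$threshold"            # 2592000

# Hypothetical values: the certificate expires 10 days after "now".
now=1700000000
expiry=$((now + 10 * 86400))
days_left "$expiry" "$now"   # 10

# The alert condition: remaining seconds below the threshold.
if [ $((expiry - now)) -lt "$threshold" ]; then
  echo "certificate expires soon"
fi
```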

In the Prometheus web interface, go to Status => Rules and find the alerting rules for blackbox-exporter.

Configuring notifications for Kubernetes API server SSL certificate expiry

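Let's set up monitoring of the Kubernetes API server SSL certificate expiry. With the rule shown further down, an alert fires once the certificate has been within 90 days of expiry for a week.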

Add a Blackbox exporter module for checking the Kubernetes API server.

kubectl --namespace=monitoring edit configmap prometheus-blackbox-exporter
...
      kube-api:
        http:
          method: GET
          no_follow_redirects: false
          preferred_ip_protocol: ip4
          tls_config:
            insecure_skip_verify: false
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          valid_http_versions:
          - HTTP/1.1
          - HTTP/2.0  # the exporter compares against Go's protocol string, which is "HTTP/2.0"
          valid_status_codes: []
        prober: http
        timeout: 5s

Add the Prometheus scrape configuration

- job_name: 'kube-api-blackbox'
  metrics_path: /probe
  params:
    module: [kube-api]
  static_configs:
   - targets:
      - https://kubernetes.default.svc/api
  relabel_configs:
   - source_labels: [__address__]
     target_label: __param_target
   - source_labels: [__param_target]
     target_label: instance
   - target_label: __address__
     replacement: prometheus-blackbox-exporter:9115 # The blackbox exporter.

Apply the Prometheus Secret

# Strip line wraps from the base64 output so the value stays on one YAML line.
PROMETHEUS_ADD_CONFIG=$(base64 < prometheus-additional.yaml | tr -d '\n')
cat << EOF | kubectl --namespace=monitoring apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
type: Opaque
data:
  prometheus-additional.yaml: $PROMETHEUS_ADD_CONFIG
EOF

Add the alerting rules

kubectl --namespace=monitoring edit prometheusrules prometheus-k8s-rules
...
  - name: k8s-api-server-cert-expiry
    rules:
    - alert: K8sAPIServerSSLCertExpiringAfterThreeMonths
      expr: probe_ssl_earliest_cert_expiry{job="kube-api-blackbox"} - time() < 86400 * 90 
      for: 1w
      labels:
        severity: warning
      annotations:
        summary: "Kubernetes API Server SSL certificate will expire after three months (instance {{ $labels.instance }})"
        description: "Kubernetes API Server SSL certificate expires in 90 days\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

Useful links

Monitoring and logging in Docker

Source: hab.com