Kumusta tanan. Sa Mayo gilusad ang OTUS
Ang kalikopan
Kinahanglan namon ang mosunod:
- Kubernetes
- Operator sa Prometheus
Pag-configure sa blackbox sa exporter
Pag-configure sa Blackbox pinaagi sa ConfigMap
alang sa mga setting http
web services monitoring module.
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-blackbox-exporter
labels:
app: prometheus-blackbox-exporter
data:
blackbox.yaml: |
modules:
http_2xx:
http:
no_follow_redirects: false
preferred_ip_protocol: ip4
valid_http_versions:
- HTTP/1.1
- HTTP/2
valid_status_codes: []
prober: http
timeout: 5s
Modyul http_2xx
gigamit sa pagsusi nga ang serbisyo sa web nagbalik ug HTTP 2xx status code. Ang configuration sa blackbox exporter gihulagway sa mas detalyado sa
Pag-deploy og blackbox exporter ngadto sa Kubernetes cluster
Ihulagway Deployment
ΠΈ Service
alang sa deployment sa Kubernetes.
---
kind: Service
apiVersion: v1
metadata:
name: prometheus-blackbox-exporter
labels:
app: prometheus-blackbox-exporter
spec:
type: ClusterIP
ports:
- name: http
port: 9115
protocol: TCP
selector:
app: prometheus-blackbox-exporter
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: prometheus-blackbox-exporter
labels:
app: prometheus-blackbox-exporter
spec:
replicas: 1
selector:
matchLabels:
app: prometheus-blackbox-exporter
template:
metadata:
labels:
app: prometheus-blackbox-exporter
spec:
restartPolicy: Always
containers:
- name: blackbox-exporter
image: "prom/blackbox-exporter:v0.15.1"
imagePullPolicy: IfNotPresent
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
args:
- "--config.file=/config/blackbox.yaml"
resources:
{}
ports:
- containerPort: 9115
name: http
livenessProbe:
httpGet:
path: /health
port: http
readinessProbe:
httpGet:
path: /health
port: http
volumeMounts:
- mountPath: /config
name: config
- name: configmap-reload
image: "jimmidyson/configmap-reload:v0.2.2"
imagePullPolicy: "IfNotPresent"
securityContext:
runAsNonRoot: true
runAsUser: 65534
args:
- --volume-dir=/etc/config
- --webhook-url=http://localhost:9115/-/reload
resources:
{}
volumeMounts:
- mountPath: /etc/config
name: config
readOnly: true
volumes:
- name: config
configMap:
name: prometheus-blackbox-exporter
Ang blackbox exporter mahimong ma-deploy gamit ang mosunod nga command. Namespace monitoring
nagtumong sa Prometheus Operator.
kubectl --namespace=monitoring apply -f blackbox-exporter.yaml
Siguroha nga ang tanan nga mga serbisyo nagdagan gamit ang mosunod nga sugo:
kubectl --namespace=monitoring get all --selector=app=prometheus-blackbox-exporter
Pagsusi sa blackbox
Mahimo nimong ma-access ang Blackbox exporter web interface gamit ang port-forward
:
kubectl --namespace=monitoring port-forward svc/prometheus-blackbox-exporter 9115:9115
Sumpaysumpaya ang Blackbox exporter web interface pinaagi sa web browser sa
Kung moadto ka sa adres
Sukatan nga bili probe_success
katumbas sa 1 nagpasabut nga malampuson nga pagsusi. Ang kantidad nga 0 nagpaila sa usa ka sayup.
Pagpahimutang sa Prometheus
Pagkahuman sa pag-deploy sa BlackBox exporter, among gi-configure ang Prometheus sa prometheus-additional.yaml
.
- job_name: 'kube-api-blackbox'
scrape_interval: 1w
metrics_path: /probe
params:
module: [http_2xx]
static_configs:
- targets:
- https://www.google.com
- http://www.example.com
- https://prometheus.io
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: prometheus-blackbox-exporter:9115 # The blackbox exporter.
Kami nagmugna Secret
gamit ang mosunod nga sugo.
PROMETHEUS_ADD_CONFIG=$(cat prometheus-additional.yaml | base64)
cat << EOF | kubectl --namespace=monitoring apply -f -
apiVersion: v1
kind: Secret
metadata:
name: additional-scrape-configs
type: Opaque
data:
prometheus-additional.yaml: $PROMETHEUS_ADD_CONFIG
EOF
Ipasabut additional-scrape-configs
para sa Prometheus Operator gamit additionalScrapeConfigs
.
kubectl --namespace=monitoring edit prometheuses k8s
...
spec:
additionalScrapeConfigs:
key: prometheus-additional.yaml
name: additional-scrape-configs
Moadto kami sa Prometheus web interface ug susihon ang mga sukatan ug mga katuyoan.
kubectl --namespace=monitoring port-forward svc/prometheus-k8s 9090:9090
Nakita namon ang mga sukatan ug katuyoan sa Blackbox.
Pagdugang mga lagda alang sa mga pahibalo (alerto)
Aron makadawat mga pahibalo gikan sa Blackbox exporter, kami magdugang mga lagda sa Prometheus Operator.
kubectl --namespace=monitoring edit prometheusrules prometheus-k8s-rules
...
- name: blackbox-exporter
rules:
- alert: ProbeFailed
expr: probe_success == 0
for: 5m
labels:
severity: error
annotations:
summary: "Probe failed (instance {{ $labels.instance }})"
description: "Probe failedn VALUE = {{ $value }}n LABELS: {{ $labels }}"
- alert: SlowProbe
expr: avg_over_time(probe_duration_seconds[1m]) > 1
for: 5m
labels:
severity: warning
annotations:
summary: "Slow probe (instance {{ $labels.instance }})"
description: "Blackbox probe took more than 1s to completen VALUE = {{ $value }}n LABELS: {{ $labels }}"
- alert: HttpStatusCode
expr: probe_http_status_code <= 199 OR probe_http_status_code >= 400
for: 5m
labels:
severity: error
annotations:
summary: "HTTP Status Code (instance {{ $labels.instance }})"
description: "HTTP status code is not 200-399n VALUE = {{ $value }}n LABELS: {{ $labels }}"
- alert: SslCertificateWillExpireSoon
expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
for: 5m
labels:
severity: warning
annotations:
summary: "SSL certificate will expire soon (instance {{ $labels.instance }})"
description: "SSL certificate expires in 30 daysn VALUE = {{ $value }}n LABELS: {{ $labels }}"
- alert: SslCertificateHasExpired
expr: probe_ssl_earliest_cert_expiry - time() <= 0
for: 5m
labels:
severity: error
annotations:
summary: "SSL certificate has expired (instance {{ $labels.instance }})"
description: "SSL certificate has expired alreadyn VALUE = {{ $value }}n LABELS: {{ $labels }}"
- alert: HttpSlowRequests
expr: avg_over_time(probe_http_duration_seconds[1m]) > 1
for: 5m
labels:
severity: warning
annotations:
summary: "HTTP slow requests (instance {{ $labels.instance }})"
description: "HTTP request took more than 1sn VALUE = {{ $value }}n LABELS: {{ $labels }}"
- alert: SlowPing
expr: avg_over_time(probe_icmp_duration_seconds[1m]) > 1
for: 5m
labels:
severity: warning
annotations:
summary: "Slow ping (instance {{ $labels.instance }})"
description: "Blackbox ping took more than 1sn VALUE = {{ $value }}n LABELS: {{ $labels }}"
Sa Prometheus web interface, adto sa Status => Rules ug pangitaa ang alert rules para sa blackbox-exporter.
Pag-configure sa Kubernetes API Server SSL Certificate Expiration Notifications
Atong i-configure ang Kubernetes API Server SSL certificate expiration monitoring. Magpadala kini og mga pahibalo kausa sa usa ka semana.
Pagdugang sa Blackbox exporter module para sa Kubernetes API Server Authentication.
kubectl --namespace=monitoring edit configmap prometheus-blackbox-exporter
...
kube-api:
http:
method: GET
no_follow_redirects: false
preferred_ip_protocol: ip4
tls_config:
insecure_skip_verify: false
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
valid_http_versions:
- HTTP/1.1
- HTTP/2
valid_status_codes: []
prober: http
timeout: 5s
Pagdugang Prometheus scrape configuration
- job_name: 'kube-api-blackbox'
metrics_path: /probe
params:
module: [kube-api]
static_configs:
- targets:
- https://kubernetes.default.svc/api
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: prometheus-blackbox-exporter:9115 # The blackbox exporter.
Paggamit sa Prometheus Secret
PROMETHEUS_ADD_CONFIG=$(cat prometheus-additional.yaml | base64)
cat << EOF | kubectl --namespace=monitoring apply -f -
apiVersion: v1
kind: Secret
metadata:
name: additional-scrape-configs
type: Opaque
data:
prometheus-additional.yaml: $PROMETHEUS_ADD_CONFIG
EOF
Pagdugang mga lagda sa alerto
kubectl --namespace=monitoring edit prometheusrules prometheus-k8s-rules
...
- name: k8s-api-server-cert-expiry
rules:
- alert: K8sAPIServerSSLCertExpiringAfterThreeMonths
expr: probe_ssl_earliest_cert_expiry{job="kube-api-blackbox"} - time() < 86400 * 90
for: 1w
labels:
severity: warning
annotations:
summary: "Kubernetes API Server SSL certificate will expire after three months (instance {{ $labels.instance }})"
description: "Kubernetes API Server SSL certificate expires in 90 daysn VALUE = {{ $value }}n LABELS: {{ $labels }}"
Mapuslanon nga mga link
Source: www.habr.com