Prometheus K8S

kubernetes_sd_config

prometheus-operator

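  • prometheus-operator/prometheus-operator
  • Features
    • Deploys and manages Prometheus, Alertmanager, and related components through CRDs
    • Simplified configuration: versions, persistence, retention policies, replicas
    • Prometheus target configuration: scrape targets are generated automatically from Kubernetes label selectors (annotation-based discovery needs a hand-written scrape config, see Pod Annotations)
  • Formerly coreos/prometheus-operator; as of 0.41 it dropped the coreos name and moved to the independent prometheus-operator organization
  • CRD
    • Prometheus - deploys Prometheus
    • Alertmanager - deploys Alertmanager
    • ThanosRuler - deploys Thanos Ruler
    • ServiceMonitor - configures Service monitoring
    • PodMonitor - configures Pod monitoring
    • Probe - configures static monitoring targets
      • blackbox_exporter
    • PrometheusRule - configures alerting and recording rules
  • To monitor external targets, use Service/externalName + ServiceMonitor, or a static configuration via additionalScrapeConfigs
  • References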
kubectl api-resources --api-group monitoring.coreos.com
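
To make the CRD list above more concrete, here is a minimal sketch of a ServiceMonitor and a PrometheusRule. All names, namespaces, labels, and the port name `http` are hypothetical. The alert expression assumes the monitored Service is itself named `example-app`, because the operator uses the Service name as the `job` label by default.

```yaml
# Sketch only: names, labels, and the port name are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
spec:
  # Namespaces in which to look for matching Services
  namespaceSelector:
    matchNames:
      - default
  # Services are selected by label, not by annotation
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app
  endpoints:
    - port: http # name of the Service port, not a port number
      path: /metrics
      interval: 30s
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-app-rules
  namespace: monitoring
spec:
  groups:
    - name: example-app
      rules:
        # Fires when a scraped target has been down for 5 minutes
        - alert: ExampleAppDown
          expr: up{job="example-app"} == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: example-app target is down
```

With `serviceMonitorSelector: {}` and `ruleSelector: {}` on the Prometheus resource, objects like these should be picked up without additional labels.
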
Pod Annotations
annotations:
  # Enables scraping; without prometheus.io/port, every declared container port is scraped
  prometheus.io/scrape: 'true'
  prometheus.io/path: '/metrics'
  prometheus.io/port: '80'
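
Pod discovery reads the metadata of the Pod itself, so in a Deployment the annotations belong on the pod template rather than on the Deployment's own metadata. A minimal sketch, where the app name and image are hypothetical:

```yaml
# Sketch only: the app name and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
      # Annotations on the pod template end up on every Pod
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/path: '/metrics'
        prometheus.io/port: '80'
    spec:
      containers:
        - name: app
          image: example.com/example-app:latest # hypothetical image
          ports:
            - containerPort: 80
```
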

These annotations take effect because Prometheus is configured with a scrape job like the following:

- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    # prometheus.io/scrape
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    # prometheus.io/path
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    # prometheus.io/port
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: kubernetes_pod_name
---
# Defines a Prometheus deployment
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: kube-prometheus-prometheus
  namespace: monitoring
spec:
  # Additional scrape configuration, read from a key of a Secret
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: prometheus-additional.yaml
  affinity: {} # scheduling affinity
  alerting:
    alertmanagers:
      - name: kube-prometheus-alertmanager
        namespace: monitoring
        pathPrefix: /
        port: http
  enableAdminAPI: false
  # Extra labels, useful for tagging clusters or tenants
  externalLabels:
    cluster: wener
  externalUrl: 'http://kube-prometheus-prometheus.monitoring:9090/'
  image: 'docker.io/bitnami/prometheus:2.20.1-debian-10-r12'
  listenLocal: false
  logFormat: logfmt
  logLevel: info
  paused: false
  podMetadata:
    labels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: kube-prometheus
      app.kubernetes.io/name: kube-prometheus
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  probeNamespaceSelector: {}
  probeSelector: {}
  # Remote write, configured much like remote_write in Prometheus
  remoteWrite:
    - name: my-remote
      remoteTimeout: 120s
      url: 'https://receive.example.com/api/v1/receive'
      # proxyUrl: ''
      # tlsConfig: {}
      # writeRelabelConfigs: []
      # Secret holding the basic auth credentials
      basicAuth:
        password:
          key: password
          name: prometheus-basic-auth
          optional: false
        username:
          key: username
          name: prometheus-basic-auth
          optional: false
      # Queue configuration, for tuning
      queueConfig:
        # default 5s
        batchSendDeadline: 10s
        # default 500
        capacity: 2500
        # currently not implemented by Prometheus
        maxRetries: 0
        # default 100
        maxSamplesPerSend: 5000
        maxShards: 1000
        minShards: 1
        minBackoff: 30ms
        maxBackoff: 100ms
  remoteRead: []
  replicas: 1
  resources: {}
  retention: 10d
  retentionSize: 6GB
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector: {}
  securityContext:
    fsGroup: 1001
    runAsUser: 1001
  serviceAccountName: kube-prometheus-prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  # Prometheus local storage
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 8Gi
        storageClassName: local-path
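
The `basicAuth` block under `remoteWrite` points at a Secret named `prometheus-basic-auth` with the keys `username` and `password`. A minimal sketch of that Secret follows. The credential values are placeholders, and the Secret is assumed to live in the same namespace as the Prometheus resource.

```yaml
# Sketch only: the credential values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-basic-auth
  namespace: monitoring # same namespace as the Prometheus resource
type: Opaque
stringData:
  username: remote-writer
  password: change-me
```
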

additionalScrapeConfigs

- job_name: 'prometheus'
  static_configs:
    - targets: ['localhost:9090']
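
The Prometheus resource reads this file from a Secret, referenced by `name: additional-scrape-configs` and `key: prometheus-additional.yaml`. One way to provide it is a Secret manifest such as the sketch below. The namespace is assumed to match the Prometheus resource.

```yaml
# Sketch only: wraps the scrape config above in the Secret that
# spec.additionalScrapeConfigs refers to.
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
  namespace: monitoring
type: Opaque
stringData:
  # The key name must match spec.additionalScrapeConfigs.key
  prometheus-additional.yaml: |
    - job_name: 'prometheus'
      static_configs:
        - targets: ['localhost:9090']
```
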

Charts

prometheus-operator/kube-prometheus

  • Customized and installed with jsonnet
  • prometheus-operator/kube-prometheus
    • Components
      • Prometheus Operator
      • HA Prometheus
      • HA Alertmanager
      • node-exporter
      • Kubernetes Metrics APIs Prometheus Adapter
      • kube-state-metrics
      • Grafana

kube-prometheus-stack

  • Includes Grafana, but not recommended

bitnami/kube-prometheus

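# This repo is not reachable from mainland China; use https://charts.wener.tech or https://wenerme.github.io/charts instead
helm repo add bitnami https://charts.bitnami.com/bitnami
# --create-namespace (Helm 3.2+) creates the namespace if it does not exist yet
helm install kube-prometheus -n monitoring --create-namespace bitnami/kube-prometheus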

kubectl -n monitoring describe svc/kube-prometheus-prometheus

# http://127.0.0.1:9090
kubectl -n monitoring port-forward svc/kube-prometheus-prometheus 9090

stable/prometheus-operator

  • helm stable/prometheus-operator
  • Similar to kube-prometheus, but installed with helm
  • Slow to receive updates and maintenance
    • Currently still based on coreos/prometheus-operator 0.38
  • Contents
    • stable/kube-state-metrics
    • stable/prometheus-node-exporter
    • stable/grafana
    • prometheus-operator
    • prometheus
    • alertmanager
    • node-exporter
    • kube-state-metrics
    • service monitors
      • Monitor the kube components
      • kube-apiserver, kube-scheduler, kube-controller-manager, etcd, kube-dns/coredns, kube-proxy
    • Configures dashboards and alerts
  • Imports the kubernetes-monitoring/kubernetes-mixin dashboards by default
  • Compared with stable/prometheus
    • Adds Grafana
      • Dashboard configuration
    • Adds monitoring of the kube components
    • Adds the operator, used to deploy
      • Prometheus
      • Alertmanager
      • ThanosRuler
      • ServiceMonitor
      • PodMonitor
      • Probe
      • PrometheusRule

stable/prometheus

  • Deploys Prometheus only
  • Includes
  • Pod annotations
    • prometheus.io/scrape: "true"
    • prometheus.io/path: /metrics
    • prometheus.io/port: "8080"
  • Prometheus defaults to --storage.tsdb.retention.time=15d
server:
  persistentVolume:
    enabled: false
  global:
    scrape_interval: 10s

alertmanager:
  enabled: false
pushgateway:
  enabled: false

FAQ

CustomResourceDefinition.apiextensions.k8s.io "prometheuses.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

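Client-side apply stores the whole object in the kubectl.kubernetes.io/last-applied-configuration annotation, and the Prometheus CRD is too large to fit. With Argo CD, enable server-side apply in the Application's sync options:

syncPolicy:
  syncOptions:
    - ServerSideApply=true
    - CreateNamespace=true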
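
For context, the snippet above sits under `spec` of an Argo CD Application. A minimal sketch follows, assuming the chart is kube-prometheus-stack from the prometheus-community Helm repository and an Argo CD version that supports the `ServerSideApply` sync option (2.5 or later, to the best of my knowledge). The Application name, project, and `targetRevision` are placeholders.

```yaml
# Sketch only: name, project, and targetRevision are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kube-prometheus-stack
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: kube-prometheus-stack
    targetRevision: CHART_VERSION # placeholder, pin a real chart version
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    syncOptions:
      # Avoids the oversized last-applied-configuration annotation
      - ServerSideApply=true
      - CreateNamespace=true
```
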

spec.scrapeConfigSelector: field not declared in schema

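# The installed CRDs are older than the operator/chart version (Helm does not upgrade CRDs), so update them by hand.
# Create new CRDs, e.g.: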
kubectl create -f crd-prometheusagents.yaml
kubectl create -f crd-scrapeconfigs.yaml

# Patch existing CRDs, e.g.:
kubectl patch crd alertmanagerconfigs.monitoring.coreos.com --patch-file crd-alertmanagerconfigs.yaml
kubectl patch crd alertmanagers.monitoring.coreos.com --patch-file crd-alertmanagers.yaml
kubectl patch crd podmonitors.monitoring.coreos.com --patch-file crd-podmonitors.yaml
kubectl patch crd probes.monitoring.coreos.com --patch-file crd-probes.yaml
kubectl patch crd prometheuses.monitoring.coreos.com --patch-file crd-prometheuses.yaml
kubectl patch crd prometheusrules.monitoring.coreos.com --patch-file crd-prometheusrules.yaml
kubectl patch crd servicemonitors.monitoring.coreos.com --patch-file crd-servicemonitors.yaml
kubectl patch crd thanosrulers.monitoring.coreos.com --patch-file crd-thanosrulers.yaml
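
Once the ScrapeConfig CRD is installed and `spec.scrapeConfigSelector` is accepted, static targets can be declared as a resource instead of going through additionalScrapeConfigs. A minimal sketch, assuming the `monitoring.coreos.com/v1alpha1` API. The resource name, label, and target are hypothetical, and the Prometheus resource is assumed to select it with a matching `scrapeConfigSelector`.

```yaml
# Sketch only: name, label, and target are placeholders.
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: static-example
  namespace: monitoring
  labels:
    prometheus: kube-prometheus # must match spec.scrapeConfigSelector
spec:
  metricsPath: /metrics
  staticConfigs:
    - labels:
        job: static-example
      targets:
        - 'localhost:9090'
```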