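Contents
1. Create the monitor namespace
2. Deployment
2.1 Deploy cAdvisor
2.2 Deploy node_exporter
2.3 Deploy Prometheus
2.4 Deploy RBAC permissions
2.5 Deploy metrics (kube-state-metrics)
2.6 Deploy Grafana
3. Test the monitoring results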
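Reference article:
"Deploying a cadvisor + node-exporter + prometheus + grafana monitoring system on a k8s cluster", cyh00001, cnblogs (博客園)
Preparation:
Cluster nodes:
master: 192.168.136.21 (all of the steps below are run on this node)
worker: 192.168.136.22
worker: 192.168.136.23
Note: if vim scrambles the indentation when you paste YAML, type :set paste in command-line mode before pasting, and :set nopaste afterwards to return to the default mode.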
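1. Create the monitor namespace
kubectl create ns monitor
Pull the cAdvisor image. The official image is hosted on Google's registry, which cannot be reached from mainland China, so I use a third-party mirror here. Note that the image name is lagoudocker/cadvisor:v0.37.0.
docker pull lagoudocker/cadvisor:v0.37.0
2. Deployment
Create the directory /opt/cadvisor_prome_gra. There are quite a few configuration files, so they get a directory of their own.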
2.1 Deploy cAdvisor
Deploy cAdvisor as a DaemonSet. A DaemonSet ensures that every node in the cluster runs one copy of the same pod, and a pod is created automatically on any node that joins later.
vim case1-daemonset-deploy-cadvisor.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: monitor
spec:
  selector:
    matchLabels:
      app: cAdvisor
  template:
    metadata:
      labels:
        app: cAdvisor
    spec:
      tolerations: # tolerate the master's NoSchedule taint
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      hostNetwork: true
      restartPolicy: Always # restart policy
      containers:
      - name: cadvisor
        image: lagoudocker/cadvisor:v0.37.0
        imagePullPolicy: IfNotPresent # image pull policy
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: root
          mountPath: /rootfs
        - name: run
          mountPath: /var/run
        - name: sys
          mountPath: /sys
        - name: docker
          mountPath: /var/lib/containerd
      volumes:
      - name: root
        hostPath:
          path: /
      - name: run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/containerd
kubectl apply -f case1-daemonset-deploy-cadvisor.yaml
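Check the result:
kubectl get pod -n monitor -owide
The cluster has three nodes, so there will be three pods. If worker nodes are added later, the DaemonSet creates a pod on each of them automatically.
Test cAdvisor by opening http://<masterIP>:8080 in a browser.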
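The same check can be done from the command line. The sketch below assumes cAdvisor serves Prometheus-format metrics at /metrics on port 8080 (its default); the helper names cadvisor_metrics_url and count_samples are made up for illustration and are not part of any tool.

```shell
# Hypothetical helpers (names are illustrative, not from the article).

# Build the metrics URL for a node; port defaults to 8080.
cadvisor_metrics_url() {
  local host="$1" port="${2:-8080}"
  printf 'http://%s:%s/metrics\n' "$host" "$port"
}

# Count sample lines in Prometheus text format read from stdin,
# skipping "# HELP" / "# TYPE" comment lines and blank lines.
count_samples() {
  grep -cvE '^(#|[[:space:]]*$)'
}

# Against a live node (needs network access to the node):
#   curl -s "$(cadvisor_metrics_url 192.168.136.21)" | count_samples
```

A non-zero count suggests the pod on that node is up and exporting container metrics.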
2.2 Deploy node_exporter
Deploy node-exporter as a DaemonSet together with a Service.
vim case2-daemonset-deploy-node-exporter.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitor
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      containers:
      - image: prom/node-exporter:v1.3.1
        imagePullPolicy: IfNotPresent
        name: prometheus-node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          protocol: TCP
          name: metrics
        volumeMounts:
        - mountPath: /host/proc
          name: proc
        - mountPath: /host/sys
          name: sys
        - mountPath: /host
          name: rootfs
        args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: rootfs
        hostPath:
          path: /
      hostNetwork: true
      hostPID: true
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: monitor
spec:
  type: NodePort
  ports:
  - name: http
    port: 9100
    nodePort: 39100 # outside the default NodePort range (30000-32767); requires an extended --service-node-port-range on kube-apiserver
    protocol: TCP
  selector:
    k8s-app: node-exporter
kubectl apply -f case2-daemonset-deploy-node-exporter.yaml
kubectl get pod -n monitor
Verify the node-exporter data. Note that it listens on port 9100: http://<nodeIP>:9100
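As a quick sanity check, one value can be pulled out of the node-exporter output instead of reading the whole page. This is a minimal sketch: metric_value is a hypothetical helper name, and it only handles metrics without labels, such as node_load1.

```shell
# Hypothetical helper: print the value of an unlabeled metric
# (for example node_load1) from Prometheus text format on stdin.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Against a live node (needs network access to the node):
#   curl -s http://192.168.136.22:9100/metrics | metric_value node_load1
```

Comment lines start with "#" in the first field, so they never match the metric name.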
2.3 Deploy Prometheus
The Prometheus resources consist of a ConfigMap, a Deployment, and a Service.
vim case3-1-prometheus-cfg.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: monitor
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-node-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_service_name
kubectl apply -f case3-1-prometheus-cfg.yaml
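The relabel_configs rules in this ConfigMap are easier to follow with concrete inputs. The sketch below imitates three of them with sed and printf; it is only an emulation for illustration, not how Prometheus evaluates them. The function names are hypothetical, and the regexes are rewritten in POSIX ERE because sed has no (?:...) or \d. Prometheus anchors its relabel regexes at both ends, which the explicit ^ and $ reproduce.

```shell
# Hypothetical emulation of three relabel rules; names are illustrative only.

# 'kubernetes-node' job: kubelet port 10250 becomes node-exporter port 9100.
# An address that does not match is left unchanged, as with action: replace.
relabel_node_address() {
  printf '%s\n' "$1" | sed -E 's/^(.*):10250$/\1:9100/'
}

# 'kubernetes-node-cadvisor' job: node name becomes an API-server proxy path.
cadvisor_proxy_path() {
  printf '/api/v1/nodes/%s/proxy/metrics/cadvisor\n' "$1"
}

# 'kubernetes-service-endpoints' job: the input is "<address>;<annotated port>"
# because Prometheus joins source_labels with ';' before matching.
relabel_annotated_port() {
  printf '%s\n' "$1" | sed -E 's/^([^:]+)(:[0-9]+)?;([0-9]+)$/\1:\3/'
}

relabel_node_address 192.168.136.22:10250
cadvisor_proxy_path k8s-master
relabel_annotated_port '10.244.1.5:8080;9100'
```

The pod address 10.244.1.5 in the last call is an invented example value.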
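Note: remember to change nodeName: k8s-master in the case3-2 file to match your own cluster. The value has to be the node's name as shown by kubectl get nodes; the host's IP address does not work unless the node is actually registered under that name.
Prometheus stores its data on the 192.168.136.21 (k8s-master) node, under /data/prometheusdata. Because the hostPath volume uses type: Directory, this directory must already exist on that node before the Deployment is applied.
vim case3-2-prometheus-deployment.yaml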
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: k8s-master
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.31.2
        imagePullPolicy: IfNotPresent
        command:
        - prometheus
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        - --storage.tsdb.retention=720h # 30 days; deprecated flag, newer releases prefer --storage.tsdb.retention.time
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus/prometheus.yml
          name: prometheus-config
          subPath: prometheus.yml
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
            mode: 0644
      - name: prometheus-storage-volume
        hostPath:
          path: /data/prometheusdata
          type: Directory
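With type: Directory, the pod cannot start if /data/prometheusdata is missing on the node, so it is worth creating the directory first. The sketch below is one way to do that. prepare_data_dir is a hypothetical helper name, and the ownership step rests on an assumption: that the prom/prometheus image runs as the unprivileged "nobody" user (UID 65534) and therefore needs write access to the directory.

```shell
# Hypothetical helper: create the hostPath directory and try to hand it
# to UID 65534 (assumed to be the user the Prometheus container runs as).
prepare_data_dir() {
  local dir="$1"
  mkdir -p "$dir" || return 1
  # chown needs root; if it fails, only warn so the caller can fix permissions by hand.
  chown 65534:65534 "$dir" 2>/dev/null ||
    echo "warning: could not chown $dir; make sure UID 65534 can write to it" >&2
}

# On the node that will run Prometheus (k8s-master in this article):
#   prepare_data_dir /data/prometheusdata
```

If the pod later logs permission errors on /prometheus, the directory ownership is the first thing to check.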
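Create the ServiceAccount and the ClusterRoleBinding:
kubectl create serviceaccount monitor -n monitor
kubectl create clusterrolebinding monitor-clusterrolebinding -n monitor --clusterrole=cluster-admin --serviceaccount=monitor:monitor
kubectl apply -f case3-2-prometheus-deployment.yaml
The case3-2 step hides a big pitfall: nodeName works with "k8s-master" but not with "192.168.136.21". With the IP, the Deployment and its pod never came up, and the pod's log said that host "192.168.136.21" could not be found. That part is expected, since nodeName must be a node name known to the cluster rather than an IP. In my case, though, switching back to "k8s-master" did not help right away either. It suddenly started working a few days later, after the machines had been shut down and restarted in between, and I never found out why.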
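To avoid guessing, the nodeName value can be compared against the names the cluster actually knows. This is a minimal sketch with a hypothetical helper, is_known_node; the kubectl jsonpath query in the comment prints the registered node names separated by spaces.

```shell
# Hypothetical helper: succeed if the first argument equals one of the
# remaining arguments (the cluster's node names).
is_known_node() {
  local wanted="$1"
  shift
  local n
  for n in "$@"; do
    [ "$n" = "$wanted" ] && return 0
  done
  return 1
}

# With a live cluster, pass in the real node names:
#   is_known_node k8s-master $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') \
#     && echo "ok to use as nodeName"
```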
vim case3-3-prometheus-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitor
  labels:
    app: prometheus
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30090
    protocol: TCP
  selector:
    app: prometheus
    component: server
kubectl apply -f case3-3-prometheus-svc.yaml
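If a Service with a fixed nodePort is rejected, the port may simply be outside the allowed range. By default kube-apiserver accepts NodePorts from 30000 to 32767, unless --service-node-port-range has been changed. The sketch below checks the two ports used so far; nodeport_in_default_range is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the port lies in the default NodePort range.
nodeport_in_default_range() {
  local port="$1"
  [ "$port" -ge 30000 ] && [ "$port" -le 32767 ]
}

# 39100 is the node-exporter Service, 30090 the Prometheus Service.
for p in 39100 30090; do
  if nodeport_in_default_range "$p"; then
    echo "$p is inside the default range"
  else
    echo "$p is outside 30000-32767"
  fi
done
```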
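2.4 Deploy RBAC permissions
This step covers a Secret, a ServiceAccount, a ClusterRole, and a ClusterRoleBinding. The ServiceAccount is the service account, the ClusterRole holds the permission rules, and the ClusterRoleBinding binds the ServiceAccount to the ClusterRole.
The credentials a pod uses to authenticate to the apiserver are defined in a Secret. Because they are sensitive, they are stored in a Secret resource and mounted into the pod as a volume, so that the application running in the pod can use them to connect to the apiserver and authenticate. Note that the Deployment in 2.3 runs under the "monitor" ServiceAccount created there, so the "prometheus" ServiceAccount defined below only takes effect if serviceAccountName is changed to it.
RBAC is the authorization mechanism of k8s. The above is only a brief explanation; for more depth see "k8s APIserver 安全機制之 rbac 授權" on the CSDN blog of 笨小孩@GF 知行合一.
vim case4-prom-rbac.yaml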
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitor
---
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: monitor-token
  namespace: monitor
  annotations:
    kubernetes.io/service-account.name: "prometheus"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
#apiVersion: rbac.authorization.k8s.io/v1beta1
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitor
kubectl apply -f case4-prom-rbac.yaml
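Whether the binding took effect can be checked by impersonating the ServiceAccount with kubectl auth can-i. The sketch below only builds the subject string that kubectl's --as flag expects; sa_subject is a hypothetical helper name, and the kubectl calls in the comments need a live cluster and a user allowed to impersonate.

```shell
# Hypothetical helper: build the user name under which a ServiceAccount
# is known to the API server. Arguments: namespace, account name.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

sa_subject monitor prometheus

# With a live cluster:
#   kubectl auth can-i list nodes --as="$(sa_subject monitor prometheus)"
#   kubectl auth can-i get nodes --subresource=proxy --as="$(sa_subject monitor prometheus)"
```

The same check works for the "monitor" account by swapping the second argument.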
2.5 Deploy metrics (kube-state-metrics)
This step covers a Deployment, a Service, a ServiceAccount, a ClusterRole, and a ClusterRoleBinding.
Note that these are deployed in the kube-system namespace!
vim case5-kube-state-metrics-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/kube-state-metrics:v2.6.0
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "resourcequotas", "replicationcontrollers", "limitranges", "persistentvolumeclaims", "persistentvolumes", "namespaces", "endpoints"]
  verbs: ["list", "watch"]
- apiGroups: ["extensions", "apps"] # current clusters serve these resources from the apps group
  resources: ["daemonsets", "deployments", "replicasets"]
  verbs: ["list", "watch"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["list", "watch"]
- apiGroups: ["batch"]
  resources: ["cronjobs", "jobs"]
  verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  name: kube-state-metrics
  namespace: kube-system
  labels:
    app: kube-state-metrics
spec:
  type: NodePort
  ports:
  - name: kube-state-metrics
    port: 8080
    targetPort: 8080
    nodePort: 31666
    protocol: TCP
  selector:
    app: kube-state-metrics
kubectl apply -f case5-kube-state-metrics-deploy.yaml
2.6 Deploy Grafana
Grafana provides the graphical interface on top of the Prometheus data source. It consists of a Deployment and a Service.
vim grafana-enterprise.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-enterprise
  namespace: monitor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana-enterprise
  template:
    metadata:
      labels:
        app: grafana-enterprise
    spec:
      containers:
      - image: grafana/grafana
        imagePullPolicy: Always
        #command:
        #  - "tail"
        #  - "-f"
        #  - "/dev/null"
        securityContext:
          allowPrivilegeEscalation: false
          runAsUser: 0
        name: grafana
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: "/var/lib/grafana"
          name: data
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 500m
            memory: 2500Mi
      volumes:
      - name: data
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitor
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 3000
    nodePort: 31000
  selector:
    app: grafana-enterprise
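kubectl apply -f grafana-enterprise.yaml
Open Grafana at http://<nodeIP>:31000 and log in with user admin, password admin.
Add a data source under Data sources and name it prometheus. Note that the Prometheus port is 30090.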
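The data source URL can take two forms, sketched below. The NodePort form matches the port mentioned above. The in-cluster form is an alternative that assumes cluster DNS is working; since Grafana runs as a pod, it should then be able to reach the prometheus Service in the monitor namespace by name. Both helper names are hypothetical.

```shell
# Hypothetical helpers for the two URL forms.

# From outside the cluster: any node IP plus the Service's nodePort.
nodeport_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# From inside the cluster: <service>.<namespace>.svc plus the Service port.
service_url() {
  printf 'http://%s.%s.svc:%s\n' "$1" "$2" "$3"
}

nodeport_url 192.168.136.21 30090
service_url prometheus monitor 9090
```

The in-cluster form keeps working even if the nodePort is changed later.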
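Add dashboard template 13332. Other templates can be added as well, for example 14981, 13824, and 14518.
To import a template, click the "+" icon on the left and choose "Import".
The cAdvisor template ID is 14282. It has a bug I have not solved yet: it can show the resource usage of all containers in the cluster, but as soon as a single container is selected no data is displayed. (This should be fixable.)
At the moment the dashboard shows pod IDs, which is inconvenient for an administrator to read. To show pod names instead, click the settings icon at the right of the dashboard, choose "Variables", select the second variable, and change "name" to "pod".
Every panel of the dashboard needs the same change: click the panel title, choose "Edit", and change "name" to "pod".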
3. Test the monitoring results
Create a Deployment named nginx01 to test the monitoring.
vim nginx01.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx01
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx01
  template:
    metadata:
      labels:
        app: nginx01
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
kubectl apply -f nginx01.yaml
Two nginx01 pods appear, because the Deployment is set to 2 replicas.
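The replica count can also be read back from Prometheus itself, which confirms the whole chain from kube-state-metrics to the Prometheus server. The sketch below builds a PromQL selector for the kube-state-metrics metric kube_deployment_status_replicas_available. replicas_query is a hypothetical helper name, and the namespace "default" is an assumption based on nginx01.yaml not setting one.

```shell
# Hypothetical helper: PromQL selector for a Deployment's available replicas.
# Arguments: namespace, deployment name.
replicas_query() {
  printf 'kube_deployment_status_replicas_available{namespace="%s",deployment="%s"}\n' "$1" "$2"
}

replicas_query default nginx01

# Against the Prometheus NodePort from 2.3 (needs network access):
#   curl -sG http://192.168.136.21:30090/api/v1/query \
#     --data-urlencode "query=$(replicas_query default nginx01)"
```

An empty result usually means the kube-state-metrics target is not being scraped yet; the Targets page of the Prometheus UI shows its state.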
With that, the cadvisor + prometheus + grafana cluster monitoring deployment is complete.
Article source: http://www.zghlxwxcb.cn/news/detail-795405.html