Overview
From the earlier foundation chapters we know that the basic components on a Kubernetes master node roughly consist of kube-apiserver, kube-controller-manager and kube-scheduler, plus the all-important data store, etcd. These components are critical master-node services, and in most cases we need to keep them highly available. Below I share some common approaches used in real-world work.
kube-apiserver high availability
On private clouds we usually use haproxy + keepalived for high availability (note: the VIP address must be added to the certificate SANs when the certificates are generated).
[root@k8s-m1 k8s-resource]# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text -noout
......
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication
X509v3 Subject Alternative Name:
DNS:k8s-m1, DNS:k8s-m2, DNS:k8s-m3, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:localhost, IP Address:10.96.0.1, IP Address:192.168.2.140, IP Address:192.168.2.250, IP Address:127.0.0.1, IP Address:192.168.2.141, IP Address:192.168.2.142
......
## you can see the VIP 192.168.2.250 in the SAN list
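If the VIP does not appear in the SAN list, the serving certificate has to be regenerated with the VIP added. Below is a minimal sketch for a kubeadm-built cluster; the paths and the --apiserver-cert-extra-sans flag are kubeadm defaults, so verify them against your own setup (and pass your original kubeadm config via --config if you used one, so the existing SANs are preserved), then restart kube-apiserver afterwards.
[root@k8s-m1 ]# mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.crt.bak
[root@k8s-m1 ]# mv /etc/kubernetes/pki/apiserver.key /etc/kubernetes/pki/apiserver.key.bak
[root@k8s-m1 ]# kubeadm init phase certs apiserver --apiserver-cert-extra-sans 192.168.2.250
[root@k8s-m1 ]# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text -noout | grep -A1 'Subject Alternative Name'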
haproxy + keepalived installation and configuration
[root@k8s-m1 ]# yum install haproxy keepalived -y
# Using my existing test cluster as an example: the nodes are 192.168.2.140/141/142 and the VIP is 192.168.2.250. Edit the configuration:
[root@k8s-m1 ]# cat <<EOF > /etc/haproxy/haproxy.cfg
global
maxconn 2000
ulimit-n 16384
log 127.0.0.1 local0 err
stats timeout 30s
defaults
log global
mode http
option httplog
timeout connect 5000
timeout client 50000
timeout server 50000
timeout http-request 15s
timeout http-keep-alive 15s
frontend monitor-in
bind *:33305
mode http
option httplog
monitor-uri /monitor
listen stats
bind *:8006
mode http
stats enable
stats hide-version
stats uri /stats
stats refresh 30s
stats realm Haproxy\ Statistics
stats auth admin:admin
frontend k8s-api
bind 0.0.0.0:8443
bind 127.0.0.1:8443
mode tcp
option tcplog
tcp-request inspect-delay 5s
default_backend k8s-api
backend k8s-api
mode tcp
option tcplog
option httpchk GET /healthz
http-check expect string ok
balance roundrobin
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server api1 192.168.2.140:6443 check check-ssl verify none
server api2 192.168.2.141:6443 check check-ssl verify none
server api3 192.168.2.142:6443 check check-ssl verify none
EOF
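Before restarting haproxy it is worth validating the file; the -c flag only checks the configuration syntax and exits (it prints a message confirming the configuration is valid when everything parses):
[root@k8s-m1 ]# haproxy -c -f /etc/haproxy/haproxy.cfg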
[root@k8s-m1 ]# cat <<EOF > /etc/keepalived/keepalived.conf
global_defs {
enable_script_security
}
vrrp_script haproxy-check {
user root
script "/bin/bash /etc/keepalived/check_haproxy.sh"
interval 3
weight -2
fall 10
rise 2
}
vrrp_instance haproxy-vip {
state MASTER ## note: set MASTER on the primary node and BACKUP on the others
priority 100 ## note: the higher the number, the higher the priority; primary > backup
interface eth0 # note the actual NIC name in your environment, some are ens*, etc.
virtual_router_id 48 # must be the same on all nodes
advert_int 3
unicast_src_ip 192.168.2.140 # local IP
unicast_peer {
192.168.2.141 # peer IP
192.168.2.142 # peer IP
}
virtual_ipaddress {
192.168.2.250/24 # VIP address
}
track_script {
haproxy-check
}
}
EOF
[root@k8s-m1 ]# cat <<'EOF'> /etc/keepalived/check_haproxy.sh
#!/bin/bash
VIRTUAL_IP=192.168.2.250
errorExit() {
echo "*** $*" 1>&2
exit 1
}
if ip addr | grep -q $VIRTUAL_IP ; then
curl -s --max-time 2 --insecure https://${VIRTUAL_IP}:8443/healthz -o /dev/null || errorExit "Error GET https://${VIRTUAL_IP}:8443/healthz"
fi
EOF
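The check script can be run by hand first to make sure it behaves as expected: on a node that does not currently hold the VIP it should exit 0 without doing anything, and on the VIP holder it only fails if /healthz through haproxy is unreachable.
[root@k8s-m1 ]# bash /etc/keepalived/check_haproxy.sh; echo "exit code: $?"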
[root@k8s-m1 ]# systemctl enable haproxy
[root@k8s-m1 ]# systemctl enable keepalived
[root@k8s-m1 ]# systemctl restart haproxy
[root@k8s-m1 ]# systemctl restart keepalived
# configure the other nodes similarly
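After keepalived is running on all nodes, it is a good idea to confirm that the VIP landed on the MASTER node and that the apiserver is reachable through haproxy on port 8443 (eth0 here is just my test environment's NIC name):
[root@k8s-m1 ]# ip addr show eth0 | grep 192.168.2.250   # the VIP should show up on the MASTER node
[root@k8s-m1 ]# curl -k --max-time 2 https://192.168.2.250:8443/healthz   # should return ok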
At this point the kube-apiserver high availability setup is complete. Every address that connects to the apiserver should now point at this VIP, especially the cluster address used by the kubectl client to reach the Kubernetes cluster.
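For example, for a kubeconfig whose cluster entry uses the kubeadm default name kubernetes (check yours with kubectl config get-clusters), the server address can be switched to the VIP like this; adjust the cluster name and kubeconfig path to your environment:
[root@k8s-m1 ]# kubectl config get-clusters
[root@k8s-m1 ]# kubectl config set-cluster kubernetes --server=https://192.168.2.250:8443
The kubeconfigs used by the kubelet and the other control-plane components can be adjusted the same way if they were generated with a node-local apiserver address.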
kube-scheduler and kube-controller-manager high availability
kube-scheduler and kube-controller-manager are deployed with a one-leader, multiple-standby high availability scheme: at any moment only one instance is allowed to process the actual work. Kubernetes implements a simple leader-election mechanism on top of etcd to elect a leader for the scheduler and the controller-manager.
The scheduler and controller-manager are started with the --leader-elect=true flag. After startup they compete in an election, and only the instance that wins the leader identity executes the actual business logic. Each creates an endpoint named kube-scheduler or kube-controller-manager; the endpoint records the current leader and the time of its last renewal. The leader periodically refreshes this record to maintain its leadership, and every standby instance periodically checks it: if the record has not been renewed within the expected window, the standby tries to write itself in as the new leader. The scheduler and controller-manager instances never talk to each other directly; by relying on etcd's strong consistency, the leader is guaranteed to be globally unique even under distributed, highly concurrent conditions.
When the leader instance fails, the other instances try to promote themselves to leader. If several nodes update the endpoint at the same time, etcd guarantees that only one update request succeeds. With this mechanism the scheduler and controller-manager can elect a new leader promptly after the old leader goes down, so the service recovers quickly from failures.
A network failure in the cluster has little impact on leader election, because the scheduler and controller-manager elect through etcd: after the failure, the hosts that can still reach etcd keep electing as before, and even if the cluster is partitioned, etcd ensures that only one instance holds the leader role at any given time.
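Note that this test cluster still records the leader in endpoint annotations (shown below), but on newer Kubernetes versions the election record moved to Lease objects in the coordination.k8s.io API; if the endpoint annotation is missing on your version, check the leases instead (the HOLDER column shows the current leader):
[root@k8s-m1 k8s-resource]# kubectl -n kube-system get lease kube-controller-manager kube-scheduler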
Check the current leader of kube-controller-manager
[root@k8s-m1 k8s-resource]# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-m2_433b35d2-7454-46ab-9743-9d9547421c95","leaseDurationSeconds":15,"acquireTime":"2023-07-03T02:06:11Z","renewTime":"2023-07-20T02:29:26Z","leaderTransitions":134}'
endpoints.kubernetes.io/last-change-trigger-time: "2023-07-03T10:07:05+08:00"
creationTimestamp: "2022-11-07T08:00:04Z"
labels:
k8s-app: kube-controller-manager
service.kubernetes.io/headless: ""
As you can see, the current leader of the kube-controller-manager component is the k8s-m2 node.
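If you only want the leader record instead of the full object, a jsonpath query against the annotation works too (the annotation key contains dots, so they have to be escaped):
[root@k8s-m1 k8s-resource]# kubectl -n kube-system get endpoints kube-controller-manager \
-o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}{"\n"}'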
Check the current leader of kube-scheduler
[root@k8s-m1 k8s-resource]# kubectl get endpoints kube-scheduler --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-m2_6d296080-bbed-4cca-88d2-cb9ccb3f73b5","leaseDurationSeconds":15,"acquireTime":"2023-07-03T02:05:48Z","renewTime":"2023-07-20T02:29:41Z","leaderTransitions":127}'
endpoints.kubernetes.io/last-change-trigger-time: "2023-07-03T10:07:23+08:00"
creationTimestamp: "2022-11-07T08:00:04Z"
labels:
k8s-app: kube-scheduler
service.kubernetes.io/headless: ""
As you can see, the current leader of the kube-scheduler component is the k8s-m2 node.
In a test environment you can pick one or two master nodes, stop the kube-scheduler or kube-controller-manager service, and watch (via the systemd journal) whether another node takes over the leadership, as sketched below.
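A rough sketch of such a test. The systemd unit name kube-controller-manager assumes a binary/systemd deployment; for static-pod deployments you would move the manifest out of /etc/kubernetes/manifests instead. The new leader logs a message along the lines of "successfully acquired lease".
# on the current leader (k8s-m2 in this example)
[root@k8s-m2 ]# systemctl stop kube-controller-manager
# on another master, watch the journal for the election message
[root@k8s-m1 ]# journalctl -u kube-controller-manager -f | grep -i lease
# confirm via the endpoint annotation that holderIdentity has changed
[root@k8s-m1 ]# kubectl -n kube-system get endpoints kube-controller-manager -o yaml | grep holderIdentity
# don't forget to start the service again afterwards
[root@k8s-m2 ]# systemctl start kube-controller-manager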
etcd service high availability
In production it is generally recommended to deploy a multi-node etcd cluster (an odd number of members, since a cluster of 2n+1 members tolerates n member failures) via systemd, which makes troubleshooting and backups easier. An etcd cluster is made up of several members; each member is an independently running etcd instance, and a single machine can run more than one member.
In normal operation the cluster has one leader and the remaining members are followers. The leader replicates the log to the followers so that every member holds a copy of the data, and it also sends periodic heartbeats to all members; if a follower does not receive a heartbeat within the expected interval, a new election is triggered.
All client requests are first sent to the leader. The leader replicates the log entry to all followers and, once more than half of them have acknowledged it, persists the entry to disk and returns a response to the client. etcd uses the Raft algorithm to provide availability and consistency for distributed data.
Each etcd server consists of three main parts: the Raft implementation, WAL log storage, and the data store with its indexes. The WAL keeps the log entries (wal files) and snapshots on local disk (under the --data-dir mentioned earlier).
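On disk this looks roughly as follows (the path /var/lib/etcd is just a common default; substitute whatever --data-dir your etcd units actually use):
[root@k8s-m1 ]# ls /var/lib/etcd/member/
snap  wal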
Check the current etcd cluster status
[root@k8s-m1 ~]# cat <<EOF > /etc/profile.d/etcd.sh
alias etcd_v2='etcdctl --cert-file /etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key-file /etc/kubernetes/pki/etcd/healthcheck-client.key \
--ca-file /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://192.168.2.140:2379,https://192.168.2.141:2379,https://192.168.2.142:2379'
alias etcd_v3='ETCDCTL_API=3 \
etcdctl \
--cert /etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key /etc/kubernetes/pki/etcd/healthcheck-client.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://192.168.2.140:2379,https://192.168.2.141:2379,https://192.168.2.142:2379'
EOF
[root@k8s-m1 ~]# source /etc/profile.d/etcd.sh
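With the aliases loaded, a quick first sanity check is to list the members (output omitted here; it simply shows the three member IDs, names, and peer/client URLs):
[root@k8s-m1 ~]# etcd_v3 --write-out=table member list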
[root@k8s-m1 pki]# etcd_v3 --write-out=table endpoint status
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://192.168.2.140:2379 | 19df3a9852e0345a | 3.4.13 | 24 MB | true | 267803 | 107799988 |
| https://192.168.2.141:2379 | 66d402f1ef2c996e | 3.4.13 | 24 MB | false | 267803 | 107799988 |
| https://192.168.2.142:2379 | 3bb3629d60bef3f6 | 3.4.13 | 24 MB | false | 267803 | 107799989 |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
[root@k8s-m1 pki]#
As you can see, 192.168.2.140 is currently the leader of the etcd cluster. After stopping etcd on the 192.168.2.140 node, the cluster automatically elects a new leader.
[root@k8s-m1 pki]# systemctl stop etcd
[root@k8s-m1 pki]# etcd_v3 --write-out=table endpoint status
Failed to get the status of endpoint https://192.168.2.140:2379 (context deadline exceeded)
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://192.168.2.141:2379 | 66d402f1ef2c996e | 3.4.13 | 24 MB | true | 267804 | 107800390 |
| https://192.168.2.142:2379 | 3bb3629d60bef3f6 | 3.4.13 | 24 MB | false | 267804 | 107800390 |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
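After starting etcd on 192.168.2.140 again, the member rejoins the cluster as a follower and catches up from the current leader. endpoint health is a quick way to verify all members, and since failover only protects against node loss, periodic snapshots are still worth taking. A minimal sketch follows; the backup path is just an example, and snapshot save has to be pointed at a single endpoint rather than the three-endpoint alias:
[root@k8s-m1 pki]# systemctl start etcd
[root@k8s-m1 pki]# etcd_v3 endpoint health
[root@k8s-m1 pki]# ETCDCTL_API=3 etcdctl \
--cert /etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key /etc/kubernetes/pki/etcd/healthcheck-client.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://192.168.2.140:2379 \
snapshot save /tmp/etcd-snapshot-$(date +%F).db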
For more Kubernetes knowledge sharing, please visit the blog homepage. Mistakes are inevitable in writing; corrections are welcome.