Author: 運維有術
Preface
Key points
- Level: beginner
- Installing and deploying ARM builds of KubeSphere and Kubernetes with KubeKey
- Common issues with ARM builds of KubeSphere and Kubernetes
Lab server configuration (personal cloud test servers)
Hostname | IP | CPU (cores) | Memory (GB) | System Disk (GB) | Data Disk (GB) | Purpose |
---|---|---|---|---|---|---|
ks-master-1 | 172.16.33.16 | 6 | 16 | 50 | 200 | KubeSphere/k8s-master |
ks-master-2 | 172.16.33.22 | 6 | 16 | 50 | 200 | KubeSphere/k8s-master |
ks-master-3 | 172.16.33.23 | 6 | 16 | 50 | 200 | KubeSphere/k8s-master |
Total | 10 | 18 | 48 | 150 | 600+ |
實戰(zhàn)環(huán)境涉及軟件版本信息
-
服務器芯片:Kunpeng-920
-
操作系統(tǒng):openEuler 22.03 LTS SP2 aarch64
-
KubeSphere:v3.4.0
-
Kubernetes:v1.26.5
-
Containerd:1.6.4
-
KubeKey: v3.0.10
1. Introduction
This article is the sequel to "An Incomplete Guide to Deploying KubeSphere v3.4.0 on ARM openEuler 22.03". The full document exceeded the character limit, so it was split into two parts.
Part one covered deploying the KubeSphere and Kubernetes cluster; this part covers two topics:
- Deploying test resources to verify that basic KubeSphere and Kubernetes functionality works
- Fixing broken service components in ARM builds of KubeSphere and Kubernetes (the core value of this article)
The biggest difference between deploying KubeSphere and Kubernetes on ARM servers and on x86 servers is the architecture of the container images every service runs. The open-source edition of KubeSphere supports ARM out of the box only for KubeSphere-Core, that is, a minimal KubeSphere plus a complete Kubernetes cluster. Once pluggable KubeSphere components are enabled, some of them fail to deploy, and you have to substitute ARM images provided by the official project or third parties by hand, or build ARM images yourself from the official source code.
This article records in detail every error hit on the way to the final deployment, together with the corresponding solution. Limited by my own ability, every architecture-incompatibility problem here was solved by substituting an identical or similar ARM image from a third-party repository or another official repository.
Readers planning a production deployment should be able to build ARM container images identical to the x86 versions from the official source code and Dockerfiles, rather than substituting near-version or third-party images. Precisely because this article does not cover building ARM images from official source code and Dockerfiles, it is titled an "incomplete guide".
2. Errors During Installation and Their Solutions
2.1 Binary format mismatch for Kubernetes components
- Symptom
# containerd fails to start
Oct 12 10:44:00 KP-Euler-ZH-01 systemd[1]: Starting containerd container runtime...
Oct 12 10:44:00 KP-Euler-ZH-01 (ntainerd)[27271]: containerd.service: Failed to execute /usr/bin/containerd: Exec format error
Oct 12 10:44:00 KP-Euler-ZH-01 (ntainerd)[27271]: containerd.service: Failed at step EXEC spawning /usr/bin/containerd: Exec format error
Oct 12 10:44:00 KP-Euler-ZH-01 systemd[1]: containerd.service: Main process exited, code=exited, status=203/EXEC
Oct 12 10:44:00 KP-Euler-ZH-01 systemd[1]: containerd.service: Failed with result 'exit-code'.
Oct 12 10:44:00 KP-Euler-ZH-01 systemd[1]: Failed to start containerd container runtime.
Oct 12 10:44:05 KP-Euler-ZH-01 systemd[1]: containerd.service: Scheduled restart job, restart counter is at 241.
Oct 12 10:44:05 KP-Euler-ZH-01 systemd[1]: Stopped containerd container runtime.
# Inspect the binary packages downloaded by kk
[root@ks-master-1 kubekey]# ll kubekey/kube/v1.26.5/ -R
kubekey/kube/v1.26.5/:
total 4
drwxr-xr-x. 2 root root 4096 Oct 12 10:18 amd64
kubekey/kube/v1.26.5/amd64:
total 211048
-rw-r--r--. 1 root root 46788608 Oct 12 10:04 kubeadm
-rw-r--r--. 1 root root 48046080 Oct 12 10:18 kubectl
-rw-r--r--. 1 root root 121277432 Oct 12 10:04 kubelet
[root@ks-master-1 kubekey]# ll kubekey/containerd/1.6.4/ -R
kubekey/containerd/1.6.4/:
total 4
drwxr-xr-x. 2 root root 4096 Oct 12 10:20 amd64
kubekey/containerd/1.6.4/amd64:
total 43396
-rw-r--r--. 1 root root 44436699 Oct 12 10:21 containerd-1.6.4-linux-amd64.tar.gz
- Solution
The binaries kk downloads by default are all amd64 builds, which cannot run in an ARM environment. Edit the cluster creation configuration file by hand and add an arch attribute to each node entry under spec.hosts.
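A minimal sketch of the relevant fragment of the KubeKey cluster configuration (host names and addresses come from the server table at the top; the password values are placeholders):

```yaml
spec:
  hosts:
  # arch: arm64 tells kk to download arm64 binaries for this node
  - {name: ks-master-1, address: 172.16.33.16, internalAddress: 172.16.33.16, user: root, password: "YourPassword", arch: arm64}
  - {name: ks-master-2, address: 172.16.33.22, internalAddress: 172.16.33.22, user: root, password: "YourPassword", arch: arm64}
  - {name: ks-master-3, address: 172.16.33.23, internalAddress: 172.16.33.23, user: root, password: "YourPassword", arch: arm64}
```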
- Expected installation output
The installation log is long, so only the key part is shown here: make sure the binaries being downloaded are arm64. The rest of the output is omitted to save space.
Continue this installation? [yes/no]: yes
10:49:21 CST success: [LocalHost]
10:49:21 CST [NodeBinariesModule] Download installation binaries
10:49:21 CST message: [localhost]
downloading arm64 kubeadm v1.26.5 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 43.3M 100 43.3M 0 0 1035k 0 0:00:42 0:00:42 --:--:-- 1212k
10:50:04 CST message: [localhost]
downloading arm64 kubelet v1.26.5 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 111M 100 111M 0 0 1018k 0 0:01:51 0:01:51 --:--:-- 1027k
10:51:56 CST message: [localhost]
downloading arm64 kubectl v1.26.5 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 44.3M 100 44.3M 0 0 1022k 0 0:00:44 0:00:44 --:--:-- 1081k
10:52:41 CST message: [localhost]
downloading arm64 helm v3.9.0 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 43.6M 100 43.6M 0 0 1035k 0 0:00:43 0:00:43 --:--:-- 1181k
10:53:24 CST message: [localhost]
downloading arm64 kubecni v1.2.0 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 36.2M 100 36.2M 0 0 1039k 0 0:00:35 0:00:35 --:--:-- 1236k
10:54:00 CST message: [localhost]
downloading arm64 crictl v1.24.0 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 12.7M 100 12.7M 0 0 1032k 0 0:00:12 0:00:12 --:--:-- 1080k
10:54:13 CST message: [localhost]
downloading arm64 etcd v3.4.13 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 15.3M 100 15.3M 0 0 1026k 0 0:00:15 0:00:15 --:--:-- 1074k
10:54:28 CST message: [localhost]
downloading arm64 containerd 1.6.4 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 31.9M 100 31.9M 0 0 1015k 0 0:00:32 0:00:32 --:--:-- 1021k
10:55:01 CST message: [localhost]
downloading arm64 runc v1.1.1 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 8837k 100 8837k 0 0 1099k 0 0:00:08 0:00:08 --:--:-- 1182k
10:55:09 CST message: [localhost]
downloading arm64 calicoctl v3.23.2 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 55.5M 100 55.5M 0 0 1030k 0 0:00:55 0:00:55 --:--:-- 1214k
10:56:04 CST success: [LocalHost]
3. Broken Components and Solutions
Because community KubeSphere's ARM support is imperfect, only KubeSphere-Core is guaranteed to deploy on ARM by default. When plugins are enabled, not every plugin has an ARM image; where no ARM image exists, the system pulls the x86 image to create and start the service, and the architecture mismatch makes the service fail to start. Each failure then has to be resolved from its error messages.
There are several ways to fix such failures:
- Use a same-version ARM image from another official repository of the broken component or from a third party (second-best; look in official repositories first and fall back to third-party images only when necessary)
- Use a near-version ARM image from another official repository or from a third party (last resort; development and test environments only)
- Build the ARM image yourself from the component's official source code and Dockerfile (best option; not covered here for now due to limited ability, possibly in a future update)
This section is a complete record of the problems met during the whole deployment and their solutions, and it is where the core value of this article lies.
3.1 Checking the Pods of the broken components
[root@ks-master-1 ~]# kubectl get pods -A -o wide | grep CrashLoopBackOff | grep -v weave
argocd devops-argocd-applicationset-controller-8486797d4d-72888 0/1 CrashLoopBackOff 24 (5m4s ago) 107m 10.233.103.12 ks-master-1 <none> <none>
istio-system istiod-1-14-6-6576b8664b-28c44 0/1 CrashLoopBackOff 25 (43s ago) 107m 10.233.102.8 ks-master-2 <none> <none>
kubesphere-controls-system default-http-backend-767cdb5fdc-ptqhh 0/1 CrashLoopBackOff 24 (5m5s ago) 108m 10.233.93.9 ks-master-3 <none> <none>
kubesphere-devops-system devops-jenkins-774fdb948b-4rk56 0/1 Init:CrashLoopBackOff 23 (4m57s ago) 107m 10.233.93.18 ks-master-3 <none> <none>
If, like me, you enabled the weave component at initial deployment, the list of failed Pods will also include Pods in the weave namespace. Because weave is deprecated and its failure cannot be fixed, the weave Pods are filtered out above.
Even with weave excluded, the output above does not cover every broken component: I had already fixed several others before capturing this result.
We will work through the failures in the order shown above; the errors and fixes for the components not listed here are covered further below.
3.2 Fixing Argo CD
- Check the failing Pod
[root@ks-master-1 ~]# kubectl get pods -A -o wide | grep -v Running | grep -v Completed
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
argocd devops-argocd-applicationset-controller-8486797d4d-72888 0/1 CrashLoopBackOff 1119 (4m4s ago) 3d23h 10.233.103.12 ks-master-1 <none> <none>
istio-system istiod-1-14-6-6576b8664b-28c44 0/1 CrashLoopBackOff 1119 (3m37s ago) 3d23h 10.233.102.8 ks-master-2 <none> <none>
- Check the Pod's logs (a classic binary architecture mismatch: an x86 binary running on ARM)
[root@ks-master-1 ~]# kubectl logs --all-containers devops-argocd-applicationset-controller-8486797d4d-72888 -n argocd
exec /usr/local/bin/applicationset-controller: exec format error
- Check the image used by the Pod
[root@ks-master-1 ~]# kubectl describe pods devops-argocd-applicationset-controller-8486797d4d-72888 -n argocd | grep Image:
Image: registry.cn-beijing.aliyuncs.com/kubesphereio/argocd-applicationset:v0.4.1
- Check the image architecture
# Check the image architecture
[root@ks-master-1 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/argocd-applicationset:v0.4.1 | grep arch
"architecture": "amd64",
- Pull a suitable ARM image (same-version official KubeSphere ARM image)
# Find a same-version ARM image
crictl pull kubespheredev/argocd-applicationset-arm64:v0.4.1
- Retag the image (to keep image names consistent)
ctr -n k8s.io images tag docker.io/kubespheredev/argocd-applicationset-arm64:v0.4.1 registry.cn-beijing.aliyuncs.com/kubesphereio/argocd-applicationset-arm64:v0.4.1
- Redeploy the component
# Update the image used by the Deployment and restart it
kubectl set image deployment/devops-argocd-applicationset-controller applicationset-controller=registry.cn-beijing.aliyuncs.com/kubesphereio/argocd-applicationset-arm64:v0.4.1 -n argocd
kubectl rollout restart deployment/devops-argocd-applicationset-controller -n argocd
- Verify that the new Pod is created and running
[root@ks-master-1 ~]# kubectl get pods -o wide -n argocd | grep applicationset-controller
devops-argocd-applicationset-controller-864f464855-64zvf 1/1 Running 0 56s 10.233.103.125 ks-master-1 <none> <none>
3.3 Fixing Istio
- Check the failing Pod
[root@ks-master-1 ~]# kubectl get pods -A -o wide | grep -v Running | grep -v Completed
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
istio-system istiod-1-14-6-6576b8664b-28c44 0/1 CrashLoopBackOff 1122 (3m10s ago) 3d23h 10.233.102.8 ks-master-2 <none> <none>
- Check the Pod's logs
[root@ks-master-1 ~]# kubectl logs --all-containers istiod-1-14-6-6576b8664b-28c44 -n istio-system
exec /usr/local/bin/pilot-discovery: exec format error
- Check the image used by the Pod
[root@ks-master-1 ~]# kubectl describe pods istiod-1-14-6-6576b8664b-28c44 -n istio-system | grep Image:
Image: registry.cn-beijing.aliyuncs.com/kubesphereio/pilot:1.14.6
- Check the image architecture
# Check the image architecture
[root@ks-master-2 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/pilot:1.14.6 | grep arch
"architecture": "amd64",
- Pull a suitable ARM image (near-version official Istio ARM image)
# Use a near-version ARM image (Istio provides no 1.14.6 ARM image; native ARM support starts with 1.15, so 1.15.7 is used instead. For production, build 1.14.6 from source yourself)
crictl pull istio/pilot:1.15.7 --platform arm64
# Make sure the image architecture is arm64
[root@ks-master-2 ~]# crictl inspecti istio/pilot:1.15.7 | grep arch
"architecture": "arm64",
- Retag the image
ctr -n k8s.io images tag docker.io/istio/pilot:1.15.7 registry.cn-beijing.aliyuncs.com/kubesphereio/pilot:1.15.7
- Redeploy the component
# Update the image used by the Deployment and restart it
kubectl set image deployment/istiod-1-14-6 discovery=registry.cn-beijing.aliyuncs.com/kubesphereio/pilot:1.15.7 -n istio-system
kubectl rollout restart deployment/istiod-1-14-6 -n istio-system
- Verify that the new Pod is created and running
# Make sure the new Pod is created and starts successfully
[root@ks-master-1 ~]# kubectl get pods -o wide -n istio-system | grep istio
istiod-1-14-6-58ff9f7cc-59djl 0/1 Pending 0 7m59s <none> <none> <none> <none>
istiod-1-14-6-747f5b86b-bg29h 1/1 Running 0 15m 10.233.102.122 ks-master-2 <none> <none>
# One Pod above is stuck in Pending; check why
[root@ks-master-1 ~]# kubectl events pod --for=pod/istiod-1-14-6-58ff9f7cc-59djl -n istio-system
LAST SEEN TYPE REASON OBJECT MESSAGE
3m26s (x2 over 8m42s) Warning FailedScheduling Pod/istiod-1-14-6-58ff9f7cc-59djl 0/3 nodes are available: 3 Insufficient cpu. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..
# The events show insufficient CPU; verify
[root@ks-master-1 ~]# for i in {1..3};do echo ks-master-$i Cpu Usage: && kubectl describe node ks-master-$i | grep cpu | grep -v cpu: ;done
ks-master-1 Cpu Usage:
cpu 3317m (92%) 10500m (291%)
ks-master-2 Cpu Usage:
cpu 3587m (99%) 11910m (330%)
ks-master-3 Cpu Usage:
cpu 3317m (92%) 10800m (300%)
# After adding CPU to the servers and rebooting, check again
[root@ks-master-1 ~]# kubectl get pods -o wide -n istio-system | grep istio
istiod-1-14-6-6d4dbc56df-n5z9g 1/1 Running 0 17s 10.233.102.149 ks-master-2 <none> <none>
3.4 Fixing default-http-backend
- Check the failing Pod
[root@ks-master-1 ~]# kubectl get pods -A -o wide | grep -v Running | grep -v Completed
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kubesphere-controls-system default-http-backend-767cdb5fdc-ptqhh 0/1 CrashLoopBackOff 1108 (4m7s ago) 3d22h 10.233.93.9 ks-master-3 <none> <none>
- Check the Pod's logs
[root@ks-master-1 ~]# kubectl logs --all-containers default-http-backend-767cdb5fdc-ptqhh -n kubesphere-controls-system
exec /server: exec format error
- Check the image used by the Pod
[root@ks-master-1 ~]# kubectl describe pods default-http-backend-767cdb5fdc-ptqhh -n kubesphere-controls-system | grep Image:
Image: registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4
- Check the image architecture
[root@ks-master-3 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4 | grep arch
"architecture": "amd64",
- Pull a suitable ARM image (same-version third-party ARM image)
crictl pull mirrorgooglecontainers/defaultbackend-arm64:1.4
- Retag the image (to keep image names consistent)
ctr -n k8s.io images tag docker.io/mirrorgooglecontainers/defaultbackend-arm64:1.4 registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-arm64:1.4
- Redeploy the component
# Update the image used by the Deployment and restart it
kubectl set image deployment/default-http-backend default-http-backend=registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system
- Verify that the new Pod is created and running
[root@ks-master-1 ~]# kubectl get pods -o wide -n kubesphere-controls-system | grep default-http-backend
default-http-backend-694d6557b5-h674b 1/1 Running 0 14m 10.233.102.120 ks-master-2 <none> <none>
3.5 Fixing Jenkins
- Check the Pod's logs
[root@ks-master-1 ~]# kubectl logs --all-containers devops-jenkins-774fdb948b-4rk56 -n kubesphere-devops-system
exec /bin/sh: exec format error
Error from server (BadRequest): container "devops-jenkins" in pod "devops-jenkins-774fdb948b-4rk56" is waiting to start: PodInitializing
- Check the Pod's events
[root@ks-master-1 ~]# kubectl events devops-jenkins-774fdb948b-4rk56 -n kubesphere-devops-system
LAST SEEN TYPE REASON OBJECT MESSAGE
40m Normal SuccessfulCreate CronJob/devops Created job devops-28284990
40m Normal SuccessfulCreate Job/devops-28284990 Created pod: devops-28284990-59cvp
40m Normal Scheduled Pod/devops-28284990-59cvp Successfully assigned kubesphere-devops-system/devops-28284990-59cvp to ks-master-1
40m Normal Pulling Pod/devops-28284990-59cvp Pulling image "registry.cn-beijing.aliyuncs.com/kubesphereio/devops-tools:ks-v3.4.0"
40m Normal Pulled Pod/devops-28284990-59cvp Successfully pulled image "registry.cn-beijing.aliyuncs.com/kubesphereio/devops-tools:ks-v3.4.0" in 427.197317ms (427.222677ms including waiting)
40m Normal Created Pod/devops-28284990-59cvp Created container pipeline-run-gc
40m Normal Started Pod/devops-28284990-59cvp Started container pipeline-run-gc
40m Normal Completed Job/devops-28284990 Job completed
40m Normal SawCompletedJob CronJob/devops Saw completed job: devops-28284990, status: Complete
10m Normal SuccessfulCreate CronJob/devops Created job devops-28285020
10m Normal Scheduled Pod/devops-28285020-zqd72 Successfully assigned kubesphere-devops-system/devops-28285020-zqd72 to ks-master-1
10m Normal SuccessfulCreate Job/devops-28285020 Created pod: devops-28285020-zqd72
10m Normal Started Pod/devops-28285020-zqd72 Started container pipeline-run-gc
10m Normal Created Pod/devops-28285020-zqd72 Created container pipeline-run-gc
10m Normal Pulled Pod/devops-28285020-zqd72 Successfully pulled image "registry.cn-beijing.aliyuncs.com/kubesphereio/devops-tools:ks-v3.4.0" in 423.190976ms (423.205056ms including waiting)
10m Normal Pulling Pod/devops-28285020-zqd72 Pulling image "registry.cn-beijing.aliyuncs.com/kubesphereio/devops-tools:ks-v3.4.0"
10m Normal Completed Job/devops-28285020 Job completed
10m Normal SuccessfulDelete CronJob/devops Deleted job devops-28284930
10m Normal SawCompletedJob CronJob/devops Saw completed job: devops-28285020, status: Complete
83s (x431 over 95m) Warning BackOff Pod/devops-jenkins-774fdb948b-4rk56 Back-off restarting failed container copy-default-config in pod devops-jenkins-774fdb948b-4rk56_kubesphere-devops-system(d4e1ad50-0930-4ea7-823d-200bd90df471)
- Check the image used by the Pod
[root@ks-master-1 ~]# kubectl describe pods devops-jenkins-774fdb948b-4rk56 -n kubesphere-devops-system | grep Image:
Image: registry.cn-beijing.aliyuncs.com/kubesphereio/ks-jenkins:v3.4.0-2.319.3-1
Image: registry.cn-beijing.aliyuncs.com/kubesphereio/ks-jenkins:v3.4.0-2.319.3-1
- Check the image architecture
[root@ks-master-3 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/ks-jenkins:v3.4.0-2.319.3-1 | grep arch
"architecture": "amd64",
- Pull a suitable ARM image (near-version official KubeSphere ARM image)
# No same-version ARM image was found, so a near-version one is used
crictl pull docker.io/kubesphere/ks-jenkins:v3.4.1-2.319.3 --platform arm64
# Make sure the image architecture is arm64
[root@ks-master-3 ~]# crictl inspecti docker.io/kubesphere/ks-jenkins:v3.4.1-2.319.3 | grep arch
"architecture": "arm64",
- Retag the image (to keep image names consistent)
crictl rmi registry.cn-beijing.aliyuncs.com/kubesphereio/ks-jenkins:v3.4.0-2.319.3-1
ctr -n k8s.io images tag docker.io/kubesphere/ks-jenkins:v3.4.1-2.319.3 registry.cn-beijing.aliyuncs.com/kubesphereio/ks-jenkins:v3.4.0-2.319.3-1
- Redeploy the component
# Find the current pod
[root@ks-master-3 ~]# kubectl get pods -A -o wide | grep jenkins
kubesphere-devops-system devops-jenkins-774fdb948b-fmmls 0/1 Init:CrashLoopBackOff 6 (43s ago) 6m28s 10.233.93.27 ks-master-3 <none> <none>
# Delete the pod; it will be recreated automatically
kubectl delete pod devops-jenkins-774fdb948b-fmmls -n kubesphere-devops-system
3.6 Fixing weave
weave was enabled at initial deployment and its services fail to deploy; uninstall the weave plugin as follows.
Note: if you are able to, you can build a matching ARM image yourself. I gave up: the module is no longer of much use, and the project itself is no longer maintained (see its end-of-maintenance announcement).
- Check the failing Pods
[root@ks-master-1 ~]# kubectl get pods -A -o wide | grep weave
weave weave-scope-agent-78r2r 0/1 CrashLoopBackOff 5 (95s ago) 8m41s 172.16.33.16 ks-master-1 <none> <none>
weave weave-scope-agent-gcm6z 0/1 CrashLoopBackOff 5 (69s ago) 8m40s 172.16.33.23 ks-master-3 <none> <none>
weave weave-scope-agent-kpp46 0/1 CrashLoopBackOff 5 (79s ago) 8m40s 172.16.33.22 ks-master-2 <none> <none>
weave weave-scope-app-c6966bf4-c79n2 0/1 CrashLoopBackOff 5 (67s ago) 8m42s 10.233.93.10 ks-master-3 <none> <none>
weave weave-scope-cluster-agent-6f8f6596ff-xpctc 0/1 CrashLoopBackOff 5 (75s ago) 8m41s 10.233.102.7 ks-master-2 <none> <none>
- Check the Pod's logs
[root@ks-master-1 ~]# kubectl logs -n weave weave-scope-agent-78r2r
exec /home/weave/scope: exec format error
- Check the image used by the Pod
[root@ks-master-1 ~]# kubectl describe pod weave-scope-agent-78r2r -n weave | grep Image:
Image: registry.cn-beijing.aliyuncs.com/kubesphereio/scope:1.13.0
- Check the image architecture
[root@ks-master-1 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/scope:1.13.0 | grep arch
"architecture": "amd64",
- Edit the cluster configuration
kubectl edit cc ks-installer -n kubesphere-system
network:
  ippool:
    type: calico
  networkpolicy:
    enabled: true
  topology:
    type: weave-scope # change to none
- After the change, redeployment starts automatically (you can also delete the ks-installer Pod with the command below to force the deployment task to rerun)
kubectl delete pod ks-installer-6674579f54-4s4tp -n kubesphere-system
- Watch the deployment logs
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f
3.7 Fixing metrics-server
- Check the failing Pod
[root@ks-master-1 kubekey]# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system metrics-server-66b6cfb784-85l94 0/1 CrashLoopBackOff 38 (114s ago) 171m
- Check the Pod's logs
[root@ks-master-1 kubekey]# kubectl logs metrics-server-66b6cfb784-85l94 -n kube-system
exec /metrics-server: exec format error
- Check the image used by the Pod
[root@ks-master-1 kubekey]# kubectl describe pod metrics-server-66b6cfb784-85l94 -n kube-system | grep Image:
Image: registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2
- Check the image architecture
[root@ks-master-3 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2 | grep arch
"architecture": "amd64",
- Pull a suitable ARM image (same-version official KubeSphere ARM image)
# Pull the arm64 image
crictl pull registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2-arm64 --platform arm64
- Retag the image
# Remove the amd64 image
crictl rmi registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2
# Retag
ctr -n k8s.io images tag registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2-arm64 registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2
- Redeploy the component
# Delete the ks-installer Pod to restart the installation task
kubectl delete pod ks-installer-6674579f54-4s4tp -n kubesphere-system
3.8 Fixing MinIO
- Check the failing Pods
[root@ks-master-1 ~]# kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kubesphere-system minio-757c8bc7f-tlnts 0/1 CrashLoopBackOff 5 (111s ago) 5m27s 10.233.103.4 ks-master-1 <none> <none>
kubesphere-system minio-make-bucket-job-fzz95 0/1 Error 2 (24s ago) 31s 10.233.93.5 ks-master-3 <none> <none>
- Check the Pod's logs
[root@ks-master-1 ~]# kubectl logs minio-757c8bc7f-tlnts -n kubesphere-system
exec /bin/sh: exec format error
- Check the images used by the Pods
# The MinIO component uses two images
[root@ks-master-1 ~]# crictl images ls | grep minio
registry.cn-beijing.aliyuncs.com/kubesphereio/minio RELEASE.2019-08-07T01-59-21Z 29c267893b048 23.1MB
[root@ks-master-3 ~]# crictl images ls | grep mc
registry.cn-beijing.aliyuncs.com/kubesphereio/mc RELEASE.2019-08-07T23-14-43Z c02b00df169fc 9.32MB
- Check the image architecture
# Check the image architecture (minio shown as the example)
[root@ks-master-1 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/minio:RELEASE.2019-08-07T01-59-21Z | grep arch
"architecture": "amd64",
- Pull suitable ARM images (near-version official MinIO ARM images)
# Find near-version ARM images
# minio
crictl pull minio/minio:RELEASE.2020-11-25T22-36-25Z-arm64
# mc
crictl pull minio/mc:RELEASE.2020-11-25T23-04-07Z-arm64
- Retag the images
# minio
crictl rmi registry.cn-beijing.aliyuncs.com/kubesphereio/minio:RELEASE.2019-08-07T01-59-21Z
ctr -n k8s.io images tag docker.io/minio/minio:RELEASE.2020-11-25T22-36-25Z-arm64 registry.cn-beijing.aliyuncs.com/kubesphereio/minio:RELEASE.2019-08-07T01-59-21Z
# mc
crictl rmi registry.cn-beijing.aliyuncs.com/kubesphereio/mc:RELEASE.2019-08-07T23-14-43Z
ctr -n k8s.io images tag --force docker.io/minio/mc:RELEASE.2020-11-25T23-04-07Z-arm64 registry.cn-beijing.aliyuncs.com/kubesphereio/mc:RELEASE.2019-08-07T23-14-43Z
- Redeploy the component
# Delete the old Pods; new ones are created automatically (alternatively, update the image in the corresponding minio Deployment)
kubectl delete pod minio-757c8bc7f-tlnts -n kubesphere-system
kubectl delete pod minio-make-bucket-job-fzz95 -n kubesphere-system
3.9 A general workflow for fixing broken components
Most failures met while deploying KubeSphere and Kubernetes on ARM are caused by mismatched image architectures. For any broken component not covered in this article, the following workflow should help:
- Check the failing Pod
- Check the Pod's logs
- Check the image used by the Pod
- Check the image architecture
- Pull a suitable ARM image
- Retag the image
- Redeploy the component
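The image replacement steps above can be condensed into a small helper that emits the commands for a given image pair. This is a sketch: the image names in the usage line are placeholders, and the commands are printed rather than executed so you can review them before running them on a node:

```shell
# retag_cmds ARM_IMAGE BAD_IMAGE: print the commands that replace the
# non-ARM image (BAD_IMAGE) with a working ARM one (ARM_IMAGE) in
# containerd's k8s.io namespace.
retag_cmds() {
  arm="$1"; bad="$2"
  echo "crictl pull ${arm}"
  echo "crictl rmi ${bad}"
  echo "ctr -n k8s.io images tag ${arm} ${bad}"
}

# Placeholder names, for illustration only:
retag_cmds docker.io/example/app-arm64:v1.0.0 \
           registry.cn-beijing.aliyuncs.com/kubesphereio/app:v1.0.0
```

After retagging, either update the image in the workload's Deployment (as in sections 3.2 to 3.4) or delete the old Pod and let the controller recreate it (as in sections 3.5 and 3.8).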
4. Deploying Test Resources
With every broken component fixed, all Pods in the Kubernetes cluster should be in the Running state, which means KubeSphere and Kubernetes at least look healthy on the surface.
Part one already verified the state of the KubeSphere and Kubernetes cluster.
Next we deploy a simple Nginx web server on the Kubernetes cluster to verify that basic Kubernetes and KubeSphere functionality works.
This example uses the command line to deploy an Nginx web server on the Kubernetes cluster, then uses the KubeSphere graphical console to inspect the deployed resources.
4.1 Create the Nginx Deployment
Run the following command to create a Deployment for an Nginx web server. In this example we create a Pod with two replicas based on the nginx:alpine image.
kubectl create deployment nginx --image=nginx:alpine --replicas=2
4.2 創(chuàng)建 Nginx Service
創(chuàng)建一個新的 Kubernetes 服務,服務名稱 nginx,服務類型 Nodeport,對外的服務端口 80。
kubectl create service nodeport nginx --tcp=80:80
4.3 Verify the Nginx Deployment and Pods
- Run the following commands to inspect the created Deployment and Pod resources.
kubectl get deployment -o wide
kubectl get pods -o wide
- The output looks like this:
[root@ks-master-1 ~]# kubectl get deployment -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
nginx 2/2 2 2 20s nginx nginx:alpine app=nginx
[root@ks-master-1 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-6c557cc74d-tbw9c 1/1 Running 0 23s 10.233.102.187 ks-master-2 <none> <none>
nginx-6c557cc74d-xzzss 1/1 Running 0 23s 10.233.103.148 ks-master-1 <none> <none>
4.4 Verify the Nginx image architecture
- Run the following command to check the architecture of the Nginx image
crictl inspecti nginx:alpine | grep architecture
- The output looks like this:
[root@ks-master-1 ~]# crictl inspecti nginx:alpine | grep architecture
"architecture": "arm64"
4.5 Verify the Nginx Service
Run the following command to list the available Services. In the list, the nginx Service has type NodePort and exposes port 30619 on the Kubernetes hosts.
kubectl get svc -o wide
The output looks like this:
[root@ks-master-1 ~]# kubectl get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 4d22h <none>
nginx NodePort 10.233.14.48 <none> 80:30619/TCP 5s app=nginx
4.6 Verify the service
Run the following commands to access the deployed Nginx service and verify that it was deployed successfully.
- Access the Pod directly
curl 10.233.102.187
# The response:
[root@ks-master-1 ~]# curl 10.233.102.187
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a >nginx.org</a>.<br/>
Commercial support is available at
<a >nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
- Access the Service
curl 10.233.14.48
# Same response as above, omitted
- Access the NodePort
curl 172.16.33.16:30619
# Same response as above, omitted
4.7 Viewing resources in the console
Now let's return to the KubeSphere console and look at the resources we just created.
Note: the KubeSphere console offers a friendly, graphical way to create all kinds of Kubernetes resources; this article created the test resources from the command line only because taking screenshots is tedious.
The walkthrough below just demonstrates the console's basic features; in practice you can create and manage Kubernetes resources graphically.
- Log in to the KubeSphere console, click "Platform" and select "Cluster Management".
- In the left pane of the cluster page, click "Application Workloads", then "Workloads". By default, all workloads of type Deployment are listed.
We are using the admin account, so every workload is visible; type nginx in the search box to show only the nginx Deployment.
- Click nginx in the Deployment list to view its details and manage the nginx Deployment.
- Click one of the nginx containers in the Pods section to view the container's status, monitoring data, and more.
- Go back to "Platform" - "Cluster Management", click "Application Workloads" in the left pane, then "Services". By default, all Service workloads are listed.
We are using the admin account, so every workload is visible; type nginx in the search box to show only the nginx Service.
- Click nginx in the Service list to view its details and manage the nginx Service.
With that, we have deployed an Nginx web server to the Kubernetes cluster and viewed and verified the details of the deployed Deployment, Pod, and Service through the KubeSphere console.
This article only performed the most basic resource-creation test of KubeSphere and Kubernetes on ARM; the full set of pluggable components was not tested. Please verify and test further according to your own needs.
Most problems met during such testing are likely caused by mismatched image architectures; the troubleshooting workflow in section 3.9 should resolve the majority of them.
5. Additional Notes
This section shows some basic cluster information and notes from testing.
5.1 Image list
Let's see which images the whole KubeSphere and Kubernetes cluster has used so far.
Run the following command on each node to list the images already downloaded on that Kubernetes node.
crictl images ls
The following is the merged output after running the command on all nodes:
# crictl images ls | grep -v docker.io > /tmp/1.txt
# crictl images ls | grep -v docker.io > /tmp/2.txt
# crictl images ls | grep -v docker.io > /tmp/3.txt
# cat /tmp/1.txt /tmp/2.txt /tmp/3.txt | awk '{if (!seen[$1]++) {print}}' | sort
IMAGE TAG IMAGE ID SIZE
registry.cn-beijing.aliyuncs.com/kubesphereio/alertmanager v0.23.0 44a71f29f42b0 25MB
registry.cn-beijing.aliyuncs.com/kubesphereio/argocd-applicationset-arm64 v0.4.1 758eaf0d9da26 76.2MB
registry.cn-beijing.aliyuncs.com/kubesphereio/argocd-applicationset v0.4.1 f5ac1e612edca 77.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/argocd v2.3.3 9a557f111ba0b 185MB
registry.cn-beijing.aliyuncs.com/kubesphereio/cni v3.23.2 dbad3c1a529ac 76.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/configmap-reload v0.7.1 b4151c08af07d 3.73MB
registry.cn-beijing.aliyuncs.com/kubesphereio/coredns 1.9.3 b19406328e70d 13.4MB
registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64 1.4 846921f0fe0e5 1.82MB
registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-arm64 1.4 156fbd4afe405 1.67MB
registry.cn-beijing.aliyuncs.com/kubesphereio/devops-apiserver ks-v3.4.0 c89baacbfca67 29.8MB
registry.cn-beijing.aliyuncs.com/kubesphereio/devops-controller ks-v3.4.0 5b4ce4e345d4a 26.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/devops-tools ks-v3.4.0 3b2bc61feca12 28.9MB
registry.cn-beijing.aliyuncs.com/kubesphereio/dex v2.30.2 cd0eb92a37da5 24.1MB
registry.cn-beijing.aliyuncs.com/kubesphereio/docker 19.03 59291a2ff3232 60.8MB
registry.cn-beijing.aliyuncs.com/kubesphereio/fluentbit-operator v0.14.0 1fe6aebe29cb9 18.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/fluent-bit v1.9.4 bb0200a3096d6 25.8MB
registry.cn-beijing.aliyuncs.com/kubesphereio/haproxy 2.3 5be62463c49bf 37.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/jaeger-operator 1.29 7e6f4ab1586b7 105MB
registry.cn-beijing.aliyuncs.com/kubesphereio/k8s-dns-node-cache 1.15.12 c5c27a390f5fa 41.9MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kiali-operator v1.50.1 c3dc2b49c5e8b 250MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kiali v1.50 97c5cab789c50 78MB
registry.cn-beijing.aliyuncs.com/kubesphereio/ks-apiserver v3.4.0 de0e22c9bcf4f 61.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/ks-console v3.4.0 42b2364bcafe3 38.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/ks-controller-manager v3.4.0 4ff90f750485b 46.9MB
registry.cn-beijing.aliyuncs.com/kubesphereio/ks-installer v3.4.0 3471e6e1d2af0 153MB
registry.cn-beijing.aliyuncs.com/kubesphereio/ks-jenkins v3.4.0-2.319.3-1 4b52636ce9e50 581MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver v1.26.5 06a268629a822 32.4MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-auditing-operator v0.2.0 9c5612b5f8a8e 8.37MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-auditing-webhook v0.2.0 f23e1078e2b2c 11.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager v1.26.5 a7b6fa1339316 29.6MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controllers v3.23.2 5a33f410afb91 25.1MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kubectl v1.22.0 29c024bbbf9b1 25.2MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-events-exporter v0.6.0 452463fb02cb2 19.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-events-operator v0.6.0 27146d6e5ab73 21.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-events-ruler v0.6.0 12573f2ffe64d 25.1MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy v1.26.5 5e89f86edbb88 19.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-rbac-proxy v0.11.0 bee4dbe8a9f42 17.9MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler v1.26.5 9eb699cb795f8 16.1MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-state-metrics v2.6.0 150fa1ddd5ab3 10.8MB
registry.cn-beijing.aliyuncs.com/kubesphereio/linux-utils 3.3.0 d06b9d3a552bc 27.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/log-sidecar-injector v1.2.0 dbad25c47d9ec 9.9MB
registry.cn-beijing.aliyuncs.com/kubesphereio/mc RELEASE.2019-08-07T23-14-43Z c02b00df169fc 11.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server v0.4.2 b266eabac7d6a 23.6MB
registry.cn-beijing.aliyuncs.com/kubesphereio/minio RELEASE.2019-08-07T01-59-21Z 6760283c70e3e 23MB
registry.cn-beijing.aliyuncs.com/kubesphereio/node-exporter v1.3.1 bb203ba967a80 9.72MB
registry.cn-beijing.aliyuncs.com/kubesphereio/node v3.23.2 2a3416a8ae05d 73MB
registry.cn-beijing.aliyuncs.com/kubesphereio/notification-manager-operator v2.3.0 3d8285028607a 17.9MB
registry.cn-beijing.aliyuncs.com/kubesphereio/notification-manager v2.3.0 b4d8dc4ea25ac 20.1MB
registry.cn-beijing.aliyuncs.com/kubesphereio/notification-tenant-sidecar v3.2.0 e41b413830c35 13.2MB
registry.cn-beijing.aliyuncs.com/kubesphereio/openldap 1.3.0 8803fc65b2bfa 91.6MB
registry.cn-beijing.aliyuncs.com/kubesphereio/openpitrix-jobs v3.3.2 d1afba2bac100 15.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/opensearch 2.6.0 92fe7c48b27f4 813MB
registry.cn-beijing.aliyuncs.com/kubesphereio/opensearch-curator v0.0.5 5791cbce4e83d 19MB
registry.cn-beijing.aliyuncs.com/kubesphereio/pause 3.8 4e42fb3c9d90e 268kB
registry.cn-beijing.aliyuncs.com/kubesphereio/pilot 1.15.7 4f12fc8040fe7 66MB
registry.cn-beijing.aliyuncs.com/kubesphereio/pod2daemon-flexvol v3.23.2 d48543e4a525a 4.56MB
registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus-config-reloader v0.55.1 4152fd84f8c36 4.61MB
registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus-operator v0.55.1 8e6f6c78032e8 13.1MB
registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus v2.39.1 af3e34b4a84e5 83.4MB
registry.cn-beijing.aliyuncs.com/kubesphereio/provisioner-localpv 3.3.0 2f625755a998b 27.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/redis 6.2.6-alpine 50bbab999a871 10.8MB
registry.cn-beijing.aliyuncs.com/kubesphereio/s2ioperator v3.2.1 30b700dff15f8 11.8MB
registry.cn-beijing.aliyuncs.com/kubesphereio/scope 1.13.0 ca6176be9738f 30.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/snapshot-controller v4.0.0 3758cfc26c6db 17.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/thanos v0.31.0 5595f682eee8a 37.7MB
5.2 DevOps Module Verification Notes
After fixing the architecture issues with the Jenkins and related images, the DevOps module can create projects and pipelines, edit Jenkinsfiles, and render the pipeline graph correctly.
However, the Maven container still fails to start when building applications. No solution has been found yet, so this remains an open issue to be revisited later.
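When chasing this class of startup failure, the first thing worth confirming is whether each image's reported architecture matches the node; an amd64 image on an aarch64 node dies with "exec format error". A minimal POSIX-shell helper (the arm64/amd64 strings mirror what `crictl inspecti` reports):

```shell
# Map a `uname -m` machine string to the architecture value container
# images report, so a mismatch (e.g. an amd64 image on an aarch64 node)
# can be spotted in scripts.
image_arch_for() {
  case "$1" in
    aarch64|arm64) echo arm64 ;;
    x86_64|amd64)  echo amd64 ;;
    *)             echo "$1" ;;   # pass through unknown values
  esac
}

want=$(image_arch_for "$(uname -m)")
echo "this node needs images with \"architecture\": \"$want\""
# Compare against: crictl inspecti <image> | grep -m1 '"architecture"'
```
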
6. Common Issues
6.1 Issue 1
- Error symptom
# The installation fails; final error output
Please wait for the installation to complete: >>--->
13:08:38 CST skipped: [ks-master-3]
13:08:38 CST skipped: [ks-master-2]
13:08:38 CST failed: [ks-master-1]
error: Pipeline[CreateClusterPipeline] execute failed: Module[CheckResultModule] exec failed:
failed: [ks-master-1] execute task timeout, Timeout=2h
# Check the Pods
[root@ks-master-1 kubekey]# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system metrics-server-66b6cfb784-85l94 0/1 CrashLoopBackOff 38 (114s ago) 171m
# Check the Pod logs
[root@ks-master-1 kubekey]# kubectl logs metrics-server-66b6cfb784-85l94 -n kube-system
exec /metrics-server: exec format error
- Solution
See "Solving the metrics-server issue" in section 3.7.
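The section 3.7 fix follows the same image-swap pattern used throughout this guide: pull an arm64 build and force-retag it over the amd64 image name the installer expects. A dry-run sketch that only prints the commands for review; the arm64-capable source repository (`kubesphere/metrics-server`) is an assumption, so verify it provides an arm64 manifest before running anything:

```shell
# Dry-run sketch of the image-swap workaround for metrics-server.
# TARGET is the image name the installer expects (per the image list above);
# ARM_SRC is an assumed repository that publishes an arm64 build.
TARGET=registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2
ARM_SRC=docker.io/kubesphere/metrics-server:v0.4.2

PULL_CMD="crictl pull ${ARM_SRC#docker.io/} --platform arm64"
RMI_CMD="crictl rmi $TARGET"
TAG_CMD="ctr -n k8s.io images tag $ARM_SRC $TARGET --force"

# Print the commands instead of executing them, so they can be reviewed.
printf '%s\n' "$PULL_CMD" "$RMI_CMD" "$TAG_CMD"
```

Once the retag is done on every node, deleting the crashing Pod lets the Deployment recreate it with the arm64 image.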
6.2 Issue 2
- Error details
## The DevOps pipeline is malfunctioning; recorded here for reference only, not fully resolved
Started by user opsman
[Pipeline] Start of Pipeline
[Pipeline] node
Still waiting to schedule task
All nodes of label ‘maven’ are offline
[root@ks-master-1 ~]# kubectl get pods -n kubesphere-devops-worker
NAME READY STATUS RESTARTS AGE
maven-604qz 1/2 Terminating 0 5s
maven-7krjb 1/2 Terminating 0 5s
maven-b43fl 1/2 Terminating 0 5s
5s Normal Started Pod/maven-pxrt3 Started container jnlp
4s Normal Started Pod/maven-xnt3j Started container jnlp
4s Normal Pulled Pod/maven-xnt3j Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/inbound-agent:4.10-2" already present on machine
4s Normal Started Pod/maven-xnt3j Started container maven
4s Normal Created Pod/maven-xnt3j Created container maven
4s Normal Created Pod/maven-xnt3j Created container jnlp
4s Normal Pulled Pod/maven-xnt3j Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/builder-maven:v3.2.0-podman" already present on machine
[root@ks-master-2 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/builder-maven:v3.2.0-podman | grep arch
"architecture": "amd64",
[root@ks-master-2 ~]# crictl inspecti registry.cn-beijing.aliyuncs.com/kubesphereio/inbound-agent:4.10-2 | grep arch
"architecture": "arm64",
- Solution
# Note: not fully resolved; the image architecture mismatch was fixed, but the Maven container still failed to start when building applications
crictl pull kubespheredev/builder-maven:v3.3.1-podman --platform arm64
[root@ks-master-2 ~]# crictl inspecti kubespheredev/builder-maven:v3.3.1-podman | grep arch | head -1
"architecture": "arm64",
# Swap the image by force-retagging (crude; better options are building a same-version ARM image yourself, or changing the image version in jenkins_user.yaml in the jenkins-casc-config ConfigMap under the kubesphere-devops-system project)
crictl rmi registry.cn-beijing.aliyuncs.com/kubesphereio/builder-maven:v3.2.0-podman
ctr -n k8s.io images tag docker.io/kubespheredev/builder-maven:v3.3.1-podman registry.cn-beijing.aliyuncs.com/kubesphereio/builder-maven:v3.2.0-podman --force
#ctr -n k8s.io images tag docker.io/kubespheredev/builder-maven:v3.3.1-podman registry.cn-beijing.aliyuncs.com/kubesphereio/builder-maven:v3.3.1-podman
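The cleaner alternative mentioned in the comment above is to change the image version in the jenkins-casc-config ConfigMap instead of retagging images on every node. A hypothetical sketch of the relevant fragment of jenkins_user.yaml; the surrounding structure follows the Jenkins Configuration-as-Code kubernetes plugin format and may differ in your installation, so adjust it to match what `kubectl -n kubesphere-devops-system edit cm jenkins-casc-config` actually shows:

```yaml
# Assumed fragment of jenkins_user.yaml inside the jenkins-casc-config
# ConfigMap (kubesphere-devops-system namespace); structure per the JCasC
# kubernetes plugin, not verified against this exact KubeSphere release.
jenkins:
  clouds:
    - kubernetes:
        templates:
          - name: "maven"
            containers:
              - name: "maven"
                # was: .../kubesphereio/builder-maven:v3.2.0-podman (amd64 only)
                image: "kubespheredev/builder-maven:v3.3.1-podman"
```

After editing, Jenkins may need a CasC reload or a restart of the jenkins Pod before new agent Pods pick up the image.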
7. Summary
This series demonstrated, in detail, how to use KubeKey v3.0.10 to automatically deploy a minimal, highly available KubeSphere v3.4.0 and Kubernetes v1.26.5 cluster on ARM servers running openEuler 22.03 LTS SP2.
After the deployment, we used the KubeSphere web console and the kubectl command line to inspect and verify the status of the KubeSphere and Kubernetes cluster.
Finally, we verified the availability of both Kubernetes and KubeSphere by deploying an Nginx web server on the Kubernetes cluster, and learned basic KubeSphere usage by viewing the status of the Nginx Pod and Service in the KubeSphere console.
In summary, this article covered the following topics:
- Basic configuration of the openEuler 22.03 LTS SP2 aarch64 operating system
- LVM configuration of the data disk, disk mounting, and data directory creation
- Downloading KubeKey and creating the cluster configuration file
- Automated deployment of the KubeSphere and Kubernetes cluster with KubeKey
- Resolving service component issues in the ARM builds of KubeSphere and Kubernetes
- Verifying the status of the KubeSphere and Kubernetes cluster after deployment
- Deploying Nginx to verify basic KubeSphere and Kubernetes functionality
Although the deployment environment in this article is aarch64 openEuler 22.03 LTS SP2 on a Kunpeng-920 chip, it is also a useful reference for ARM builds of CentOS, Kylin V10 SP2, and other operating systems, as well as for chips such as the Phytium FT-2500.
The content of this article can be used directly in development and test environments. It has some reference value for production, but it must never be applied to a production environment as-is.
The admittedly incomplete test conclusion of this article: the basic functionality of KubeSphere and Kubernetes is usable; DevOps is partially usable, with the main remaining issue being that the Maven container fails to start during image builds; other pluggable components were not verified.
Published via OpenWrite, a multi-channel blog publishing platform.