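1. Introduction
Docker Swarm is Docker's cluster management tool. It turns a pool of Docker hosts into a single virtual Docker host. Docker Swarm exposes the standard Docker API, so any tool that already communicates with a Docker daemon can use Swarm to scale to multiple hosts. Supported tools include, but are not limited to, the following: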
- Dokku
- Docker Compose
- Docker Machine
- Jenkins
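As the figure below shows, a swarm cluster consists of manager nodes and worker nodes.
- swarm manager: responsible for managing the whole cluster, including cluster configuration, service management, and all other cluster-related work.
- worker node: the "available node" in the figure; it runs the services that carry out tasks.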
2. Usage
The current system and Docker versions are listed below; all tests that follow are based on this environment:
# System version
[root@tcloud ~]# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)
# Docker version
[root@tcloud ~]# docker --version
Docker version 20.10.13, build a224086
2.1 Initialization
First, initialize Swarm on a Docker host. This can be done with the following command:
# Initialization command
docker swarm init
# Actual run
[root@tcloud ~]# docker swarm init
Swarm initialized: current node (xab6g9fgtm86u830xc6y0ydib) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-46uvq1siqzb59k6gs0hfhmq02ndemtuze3obv0z7rm0wt7ca6l-56ubevf6f4pmlytic9170fywj xxx.xx.x.x:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
After this command runs, Docker makes the current node a Swarm manager and generates a join token. Other nodes use this token to join the swarm.
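The join command printed by `docker swarm init` can also be assembled in a script. The sketch below is an illustration rather than part of the original test run: `build_join_cmd` is a hypothetical helper name, and `192.0.2.10` is a placeholder documentation address. It relies on the `-q` flag of `docker swarm join-token`, which prints only the token.

```shell
# build_join_cmd TOKEN MANAGER_ADDR
# Prints the join command for a node; it does not contact Docker itself.
build_join_cmd() {
  printf 'docker swarm join --token %s %s\n' "$1" "$2"
}

# On a manager, the token can be read without the surrounding help text:
#   token="$(docker swarm join-token -q worker)"
# 192.0.2.10 is a placeholder address, not the manager used in this article.
build_join_cmd "<TOKEN>" "192.0.2.10:2377"
```

Keeping the command construction separate from the Docker call makes it easy to hand the resulting line to whatever tool provisions the worker.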
# View Docker information
[root@tcloud ~]# docker info
# Swarm-related information
Server:
Swarm: active
NodeID: xab6g9fgtm86u830xc6y0ydib
Is Manager: true
ClusterID: d62ha22wptcgugdzlg8zn8892
Managers: 1
Nodes: 1
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: xxx.xx.x.x
Manager Addresses:
xxx.xx.x.x:2377
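When only one or two of these values are needed, the text output can be filtered instead of read by eye. The following is a minimal sketch; `swarm_field` is a hypothetical helper, and the sample input is a shortened copy of the output above. `docker info --format '{{.Swarm.LocalNodeState}}'` is another way to get the swarm state directly.

```shell
# swarm_field NAME
# Reads `docker info` text on stdin and prints the value of the first
# "NAME: value" line, ignoring leading indentation.
swarm_field() {
  awk -v key="$1" -F': ' '
    { k = $1; sub(/^[ \t]+/, "", k) }
    k == key { print $2; exit }
  '
}

# Real use on a manager:  docker info 2>/dev/null | swarm_field Nodes
# Demonstration with sample text taken from the output above:
printf '%s\n' 'Server:' ' Swarm: active' '  Managers: 1' '  Nodes: 1' | swarm_field Swarm
```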
If you forget the token, you can query it on the manager node:
# Query command [To add a manager to this swarm]
docker swarm join-token manager
# Actual run
[root@tcloud ~]# docker swarm join-token manager
To add a manager to this swarm, run the following command:
docker swarm join --token SWMTKN-1-46uvq1siqzb59k6gs0hfhmq02ndemtuze3obv0z7rm0wt7ca6l-b1wor0jtm59dgkbj87y3vo6pa xxx.xx.x.x:2377
# Query command [To add a worker to this swarm]
docker swarm join-token worker
# Actual run
[root@tcloud ~]# docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-46uvq1siqzb59k6gs0hfhmq02ndemtuze3obv0z7rm0wt7ca6l-56ubevf6f4pmlytic9170fywj xxx.xx.x.x:2377
2.2 Adding nodes
System and Docker versions of the node to be added (both differ from the manager's):
# System version
CentOS Linux release 7.7.1908 (Core)
# Docker version
Docker version 17.06.2-ce, build cec0b72
On each node, use the token generated earlier to join the swarm. For example:
# Join command
docker swarm join --token <TOKEN> <MANAGER-IP>:<PORT>
# Example
[root@localhost deepface]# docker swarm join --token SWMTKN-1-46uvq1siqzb59k6gs0hfhmq02ndemtuze3obv0z7rm0wt7ca6l-56ubevf6f4pmlytic9170fywj xxx.xx.x.x:2377
This node joined a swarm as a worker.
After the node is added, `docker info` on the manager shows the following:
# The node count is now 2
Server:
Swarm: active
Managers: 1
Nodes: 2
Use the docker node command to view node information:
# Query command
docker node ls
# Query result
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
j3ey43drdbcndd5o6ondp3r9g localhost.localdomain Ready Active 17.06.2-ce
lvrjegiuhr6tllt8hr3ls50v3 * nodexxx Ready Active Leader 20.10.13
# Remove a node
docker node rm <node-id>
# Exclude (drain) a node so that no tasks are scheduled on it
docker node update --availability drain <node-id>
# Re-enable the node
docker node update --availability active <node-id>
# After draining the node
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
j3ey43drdbcndd5o6ondp3r9g localhost.localdomain Ready Drain 17.06.2-ce
lvrjegiuhr6tllt8hr3ls50v3 * nodexxx Ready Active Leader 20.10.13
ahgt6tmv8d9o6vp0i0c20nxed nodexxx Ready Active 20.10.13
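To spot drained or paused nodes without scanning the table, the listing can be reduced to two columns with `--format` and filtered. This is a sketch under that assumption; `non_active_nodes` is a hypothetical helper, and the sample lines mirror the table above.

```shell
# non_active_nodes
# Reads "HOSTNAME AVAILABILITY" lines on stdin and prints the hostnames
# whose availability is anything other than Active.
non_active_nodes() {
  awk '$2 != "Active" { print $1 }'
}

# Real use on a manager:
#   docker node ls --format '{{.Hostname}} {{.Availability}}' | non_active_nodes
# Demonstration with values from the table above:
printf '%s\n' 'localhost.localdomain Drain' 'nodexxx Active' | non_active_nodes
```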
2.3 Creating a service
Create a Docker service, defining the scale, image, port mapping, and so on. For example:
docker service create --replicas 3 -p 80:80 --name web nginx:latest
This command creates a service named "web" from the Nginx image, runs 3 replicas in the cluster, and maps host port 80 to port 80 of the service. A real example follows:
# Single-host start command
docker run --name deepface --privileged --restart=always --net="host" -v /root/.deepface/weights/:/root/.deepface/weights/ -d deepface_image
# Cluster start, option 1: publish port 5000 through Swarm
docker service create --replicas 4 --name deepface_cluster --publish 5000:5000 --mount type=bind,source=/root/.deepface/weights/,target=/root/.deepface/weights/ -d deepface_image
# Cluster start, option 2: use the host network (run only one of the two, since they share a service name)
docker service create --replicas 2 --name deepface_cluster --network host --mount type=bind,source=/root/.deepface/weights/,target=/root/.deepface/weights/ -d deepface_image
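Note: do not use --network host when the replica count is greater than the node count. Two replicas on the same node would compete for the same host port, and the extra replica fails to start.
Apart from that case, the value of --replicas does not have to match the number of nodes. In Docker Swarm, --replicas specifies the number of replicas of the service, not the number of nodes. A replica is one running instance of the service somewhere in the swarm. Specifying --replicas when creating a service defines the scale of the service, but it does not require every node in the cluster to run a replica. Swarm schedules the replicas automatically onto available nodes to reach the requested count. For example, if you have a swarm of 3 nodes and run the following command: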
docker service create --replicas 5 --name my-service my-image
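This schedules 5 replicas across the nodes of the cluster instead of running exactly one per node, so some nodes run more than one replica. Swarm picks available nodes and starts replicas on them until the scale of 5 is reached.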
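The arithmetic for the 5-replica, 3-node example can be sketched as follows. This assumes a perfectly even spread, which is only an approximation: the real scheduler also weighs node availability, resources, and constraints, so actual placement may differ. `spread_estimate` and the `nodeN` labels are hypothetical.

```shell
# spread_estimate REPLICAS NODES
# Prints how many replicas each node would hold under a perfectly even
# spread. This is an estimate, not a prediction of Swarm's placement.
spread_estimate() {
  replicas="$1"
  nodes="$2"
  base=$((replicas / nodes))
  extra=$((replicas % nodes))
  i=1
  while [ "$i" -le "$nodes" ]; do
    if [ "$i" -le "$extra" ]; then
      echo "node$i: $((base + 1))"
    else
      echo "node$i: $base"
    fi
    i=$((i + 1))
  done
}

spread_estimate 5 3
# prints:
# node1: 2
# node2: 2
# node3: 1
```

With --network host, any count above 1 in this estimate marks a node where replicas would collide on the same port.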
2.4 Viewing service status
Use the following command to view the status of services in the swarm:
# Query command
docker service ls
# Query result
ID NAME MODE REPLICAS IMAGE PORTS
kjbpehilmg98 deepface_cluster replicated 4/4 deepface_image:latest
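This lists all services together with their mode and scale. The REPLICAS column shows running replicas over desired replicas, so 4/4 means all four replicas are up.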
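A script can test whether a service has reached its desired scale by splitting the REPLICAS value at the slash. The sketch below assumes the plain `running/desired` form, as produced by `docker service ls --format '{{.Replicas}}'` for a replicated service without per-node limits. `replicas_converged` is a hypothetical helper.

```shell
# replicas_converged VALUE
# VALUE has the form "running/desired", for example "4/4".
# Returns success (0) when both numbers are equal.
replicas_converged() {
  running="${1%/*}"
  desired="${1#*/}"
  [ "$running" = "$desired" ]
}

# Real use on a manager:
#   docker service ls --format '{{.Name}} {{.Replicas}}'
if replicas_converged "4/4"; then
  echo "deepface_cluster: all replicas running"
fi
```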
2.5 Scaling a service up or down
Use the following command to scale a service up or down:
docker service scale web=5
This scales the "web" service to 5 replicas.
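2.6 Updating a service
To update a service, for example to change the image its containers use, run the following command:
# This command has not been tested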
docker service update --image new-image:tag web
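An image change can be rolled out one replica at a time with the `--update-parallelism` and `--update-delay` flags of `docker service update`. The sketch below, like the command above, has not been run against the cluster in this article. The `run` wrapper, the `DRY_RUN` variable, and the 10s delay are assumptions made for illustration; by default the script only prints the command.

```shell
# DRY_RUN=1 (the default here) prints commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# rolling_update SERVICE IMAGE
# Updates one task at a time and waits 10s between tasks.
rolling_update() {
  run docker service update --update-parallelism 1 --update-delay 10s --image "$2" "$1"
}

rolling_update web new-image:tag
```

Set `DRY_RUN=0` to execute the command on a manager. If the new image misbehaves, `docker service rollback web` returns the service to its previous definition.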
2.7 Removing services and stopping Swarm
Finally, to remove a service, use the following command:
# Remove-service command
docker service rm web
# Test
[root@node178 /root]$ docker service rm deepface_cluster
deepface_cluster
# Query and remove all services
docker service rm $(docker service ls -q)
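To stop Swarm, run the following command on the manager node. A manager needs --force to leave, and when the last manager leaves, the swarm no longer exists:
# Stop-Swarm command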
docker swarm leave --force
3. Errors
3.1 OpenBLAS blas_thread_init: pthread_create failed for thread x of xx: Operation not permitted
deepface_cluster.1.wtmx0mm6vu2f@localhost.localdomain | OpenBLAS blas_thread_init: pthread_create failed for thread x of xx: Operation not permitted
deepface_cluster.1.wtmx0mm6vu2f@localhost.localdomain | OpenBLAS blas_thread_init: RLIMIT_NPROC -1 current, -1 max
The cause is related to the Docker version on the worker node, which is 17.06.2-ce, build cec0b72. When the container is started with the following command:
docker run --name deepface --privileged --restart=always --net="host" -v /root/.deepface/weights/:/root/.deepface/weights/ -d deepface_image
the containers run normally on both Docker versions. With --privileged removed, as in the command below:
docker run --name deepface --restart=always --net="host" -v /root/.deepface/weights/:/root/.deepface/weights/ -d deepface_image
the older Docker version reports the following error:
OpenBLAS blas_thread_init: pthread_create failed for thread x of xx: Operation not permitted
OpenBLAS blas_thread_init: RLIMIT_NPROC -1 current, -1 max
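Since the error only appears on the older engine, it helps to know which nodes in the swarm run an older Docker version before placing the service. The following is a sketch, not a fix: `older_engines` is a hypothetical helper, and the threshold of major version 20 is an assumption taken from the manager's 20.10.13. Nodes it reports could be drained with the `docker node update --availability drain` command from section 2.2, or upgraded.

```shell
# older_engines MIN_MAJOR
# Reads "HOSTNAME ENGINE_VERSION" lines on stdin and prints the nodes whose
# major engine version is lower than MIN_MAJOR.
older_engines() {
  awk -v min="$1" '{
    split($2, v, ".")
    if (v[1] + 0 < min + 0) print $1 " " $2
  }'
}

# Real use on a manager:
#   docker node ls --format '{{.Hostname}} {{.EngineVersion}}' | older_engines 20
# Demonstration with the versions seen in this article:
printf '%s\n' 'localhost.localdomain 17.06.2-ce' 'nodexxx 20.10.13' | older_engines 20
```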
3.2 Services on a node are not invoked
Communication between the nodes of a Docker Swarm cluster depends on three ports: 2377, 7946, and 4789.
- TCP port 2377: cluster management traffic
- TCP and UDP port 7946: communication among nodes
- UDP port 4789: overlay network traffic (an overlay network carries layer-2 frames between hosts, encapsulated over the layer-3 network)
# Open the ports (overlay traffic on 4789 is UDP only, so it needs no TCP rule)
iptables -A INPUT -p tcp --dport 2377 -j ACCEPT
iptables -A INPUT -p tcp --dport 7946 -j ACCEPT
iptables -A INPUT -p udp --dport 7946 -j ACCEPT
iptables -A INPUT -p udp --dport 4789 -j ACCEPT
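The rules above accept traffic from any source. A narrower variant, sketched below, generates one set of rules per peer node so the swarm ports are reachable only from other cluster members. `swarm_rules_for_peer` is a hypothetical helper, `192.0.2.11` is a placeholder address, and 4789 is opened for UDP only because the overlay data path uses UDP. The function only prints the rules; nothing is applied.

```shell
# swarm_rules_for_peer PEER_IP
# Prints iptables rules that accept swarm traffic from a single peer.
swarm_rules_for_peer() {
  peer="$1"
  for spec in tcp:2377 tcp:7946 udp:7946 udp:4789; do
    proto="${spec%%:*}"
    port="${spec##*:}"
    echo "iptables -A INPUT -s $peer -p $proto --dport $port -j ACCEPT"
  done
}

# 192.0.2.11 is a placeholder; substitute the address of each other node.
swarm_rules_for_peer 192.0.2.11
```

Reviewing the printed rules first and then piping them to `sh` as root keeps the step easy to audit.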
Opening the ports did not solve the problem. The --network host parameter used when creating the service may be the cause, so the following was tried:
# 1. Create a dedicated network
docker network create -d overlay deepface_network
# List networks
docker network ls
# 2. Recreate the service
docker service create --replicas 2 --name deepface_cluster --network deepface_network --publish 5000:5000 --mount type=bind,source=/root/.deepface/weights/,target=/root/.deepface/weights/ -d deepface_image
The service failed to start on the manager:
# View the logs with the following command
journalctl -u docker.service
# Error messages
level=error msg="Failed creating ingressingress network: error creating external connectivity network: Failed to Setup IP tables: Unable to enable SKIP DNAT rule: (iptables failed: iptables --wait -t nat -I DOCKER -i docker_gwbridge -j RETURN: iptables: No chain/target/match by that name.\n (exit status 1))"
level=error msg="Failed creating ingress network: error creating external connectivity network: Failed to Setup IP tables: Unable to enable SKIP DNAT rule: (iptables failed: iptables --wait -t nat -I DOCKER -i docker_gwbridge -j RETURN: iptables: No chain/target/match by that name.\n (exit status 1))"
level=error msg="failed removing container name resolution for c09097de449b90705f7022e7a9ece84d535fcd955a0858817e4892d429ac0182 epRec:{ingress-endpoint <nil> 10.0.0.34 [] [] [] false} err:network w8w5u3v5ny4cirzk5hfkvfu5q not found"
level=error msg="Not continuing with pull after error: errors:\ndenied: requested access to the resource is denied\nunauthorized: authentication required\n"
level=info msg="Ignoring extra error returned from registry: unauthorized: authentication required"
level=error msg="pulling image failed" error="pull access denied for deepface_image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied" module=node/agent/taskmanager node.id=lvrjegiuhr6tllt8hr3ls50v3 service.id=xh66bfakl21z8rxb1oof0pdp2 task.id=oj16bpvpxgdzxmpo6jf8we117
level=warning msg="IPv4 forwarding is disabled. Networking will not work."
level=warning msg="IPv4 forwarding is disabled. Networking will not work."
level=error msg="1aab2faa1bbe2b0857062a4c211d4abc308338f1a367936665b266c3d44d5d69 cleanup: failed to delete container from containerd: no such container"
level=error msg="fatal task error" error="starting container failed: error creating external connectivity network: Failed to Setup IP tables: Unable to enable SKIP DNAT rule: (iptables failed: iptables --wait -t nat -I DOCKER -i docker_gwbridge -j RETURN: iptables: No chain/target/match by that name.\n (exit status 1))" module=node/agent/taskmanager node.id=l
level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint icoywtf0kcknyw0og5samm59j 90904c66a7adb5751ace11266f6482fe47b1e8f813cd89085f1843b44153371c], retrying...."
level=warning msg="rmServiceBinding handleEpTableEvent deepface_cluster abc833b8b12d6c3b23ec293eb24f8750c2ca14d9f4a1d8da4f52b8f6a3145d07 aborted s.loadBalancers[nid] !ok"
Solution:
# Restart docker
systemctl restart docker
# Recreate the service
docker service create --replicas 2 --name deepface_cluster --network deepface_network --publish 5000:5000 --mount type=bind,source=/root/.deepface/weights/,target=/root/.deepface/weights/ -d deepface_image
Success!
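The journal output in this section mixes several different messages. The sketch below maps three of them to commonly reported causes. Only the Docker restart was confirmed in this article; the other two hints are assumptions based on the usual meaning of those messages, and `swarm_log_hints` is a hypothetical helper.

```shell
# swarm_log_hints
# Reads Docker daemon log lines on stdin and prints one hint per
# recognized message, without duplicates.
swarm_log_hints() {
  while IFS= read -r line; do
    case "$line" in
      *"No chain/target/match by that name"*)
        echo "iptables: DOCKER chain missing; restart docker to recreate it" ;;
      *"IPv4 forwarding is disabled"*)
        echo "sysctl: enable net.ipv4.ip_forward" ;;
      *"pull access denied"*)
        echo "image: not available to this node; load it locally or use a registry" ;;
    esac
  done | sort -u
}

# Real use:  journalctl -u docker.service --no-pager | swarm_log_hints
printf '%s\n' 'level=warning msg="IPv4 forwarding is disabled. Networking will not work."' | swarm_log_hints
```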