
Deploying Ceph distributed storage offline on Kylin v10 with cephadm


1. Environment

Ceph: Octopus
OS: Kylin-Server-V10_U1-Release-Build02-20210824-GFB-x86_64, CentOS Linux release 7.9.2009


2. Ceph and cephadm

2.1 Ceph overview

Ceph provides object storage, block device services, and a file system to cloud platforms. Every Ceph storage cluster deployment starts by setting up each Ceph node and then the network.

A Ceph storage cluster requires at least one Ceph Monitor and one Ceph Manager, and at least as many Ceph OSDs as there are replicas of an object stored on the cluster (for example, if three replicas of a given object are stored on the cluster, the cluster must contain at least three OSDs).

Monitors: a Ceph Monitor (ceph-mon) maintains maps of the cluster state, including the monitor map, manager map, OSD map, MDS map, and CRUSH map. These maps are the critical cluster state that Ceph daemons need in order to coordinate with each other. Monitors are also responsible for managing authentication between daemons and clients. At least three monitors are normally required for redundancy and high availability.

Managers: the Ceph Manager daemon (ceph-mgr) tracks runtime metrics and the current state of the Ceph cluster, including storage utilization, current performance metrics, and system load. Manager daemons also host Python-based modules that manage and expose cluster information, including the web-based Ceph Dashboard and the REST API. At least two managers are normally required for high availability.

Ceph OSDs: an Object Storage Daemon (Ceph OSD, ceph-osd) stores data, handles data replication, recovery, and rebalancing, and provides some monitoring information to the monitors and managers by checking the heartbeats of other OSD daemons. At least three OSDs are normally required for redundancy and high availability.

MDSs: a Ceph Metadata Server (MDS, ceph-mds) stores metadata on behalf of the Ceph File System (Ceph Block Devices and Ceph Object Storage do not use MDS). Metadata servers allow POSIX file system users to run basic commands such as ls and find without placing an enormous burden on the Ceph storage cluster.

See the official Ceph documentation for details.

2.2 Ceph releases

(image omitted; this article uses the Octopus release, 15.2.x)

2.3 cephadm

cephadm is a utility for managing a Ceph cluster.

  • cephadm can add Ceph containers to the cluster.
  • cephadm can remove Ceph containers from the cluster.
  • cephadm can update Ceph containers.

cephadm does not rely on external configuration tools such as Ansible, Rook, or Salt, although those tools can be used to automate operations that cephadm itself does not perform.

cephadm manages the full lifecycle of a Ceph cluster. The lifecycle begins with the bootstrap process, in which cephadm creates a small Ceph cluster, consisting of one monitor and one manager, on a single node. cephadm then uses the orchestration interface to expand the cluster, adding hosts and provisioning Ceph daemons and services. The whole lifecycle can be managed through the Ceph command-line interface (CLI) or the dashboard (GUI).
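
As a quick orientation, that lifecycle boils down to a handful of commands, all of which are used in detail later in this article (the IP and host names below are placeholders):

# minimal lifecycle sketch, not a complete deployment
cephadm bootstrap --mon-ip <mon-ip>            # create a one-node cluster: 1 mon + 1 mgr
ceph orch host add <hostname> <ip>             # expand the cluster with additional hosts
ceph orch apply osd --all-available-devices    # provision OSDs on free disks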

See the official cephadm documentation for details.

3. Node planning

Hostname  Address        Roles
ceph1     172.25.0.141   mon, osd, mds, mgr, iscsi, cephadm
ceph2     172.25.0.142   mon, osd, mds, mgr, iscsi
ceph3     172.25.0.143   mon, osd, mds, mgr, iscsi

4. Base environment configuration

4.1 IP address configuration

Configure the address on ceph1:

nmcli connection modify ens33 ipv4.method manual ipv4.addresses 172.25.0.141/24 ipv4.gateway 172.25.0.2 connection.autoconnect yes

Configure the address on ceph2:

nmcli connection modify ens33 ipv4.method manual ipv4.addresses 172.25.0.142/24 ipv4.gateway 172.25.0.2 connection.autoconnect yes

Configure the address on ceph3:

nmcli connection modify ens33 ipv4.method manual ipv4.addresses 172.25.0.143/24 ipv4.gateway 172.25.0.2 connection.autoconnect yes

4.2 Hostname configuration

Set the hostname on ceph1:

hostnamectl set-hostname ceph1

Set the hostname on ceph2:

hostnamectl set-hostname ceph2

Set the hostname on ceph3:

hostnamectl set-hostname ceph3

Configure /etc/hosts on ceph1, ceph2, and ceph3:
vim /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

172.25.0.141	ceph1
172.25.0.142	ceph2
172.25.0.143	ceph3

Add a DNS resolver (any reachable server will do):
vim /etc/resolv.conf

nameserver 223.5.5.5

4.3 Firewall

Required on ceph1, ceph2, and ceph3.

Disable the firewall and SELinux:

systemctl disable --now firewalld
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

4.4 Passwordless SSH

ssh-keygen -f /root/.ssh/id_rsa -P ''
ssh-copy-id -o StrictHostKeyChecking=no 172.25.0.141
ssh-copy-id -o StrictHostKeyChecking=no 172.25.0.142
ssh-copy-id -o StrictHostKeyChecking=no 172.25.0.143

4.5 Time synchronization

Install the chrony package on ceph1, ceph2, and ceph3:

yum -y install chrony
systemctl enable chronyd

ceph1 acts as the NTP server:
vim /etc/chrony.conf

pool pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 172.25.0.0/24
local stratum 10
keyfile /etc/chrony.keys
leapsectz right/UTC
logdir /var/log/chrony

Restart the service:

systemctl restart chronyd

ceph2, ceph3, and any nodes added later act as clients:
vim /etc/chrony.conf

pool 172.25.0.141 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
keyfile /etc/chrony.keys
leapsectz right/UTC
logdir /var/log/chrony

Restart the service:

systemctl restart chronyd

Verify from a client:

chronyc sources -v
210 Number of sources = 1

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| /   '?' = unreachable, 'x' = time may be in error, '~' = time too variable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* ceph1                        11   6    17    10    -68us[  -78us] +/- 2281us


4.6 Install Python 3

4.6.1 Kylin V10

Kylin V10 ships with Python 3.7.4 by default. If it is not installed, configure a repository and install it with YUM:

yum -y install python3

4.6.2 CentOS 7

Install Python 3:

yum -y install epel-release
yum -y install python3

4.7 Install and configure Docker

4.7.1 Kylin V10

Kylin V10 ships with docker-engine by default. If it is not installed, configure a repository and install it with YUM:

yum -y install docker-engine

4.7.2 CentOS 7

Configure the Docker repository and install Docker CE:

yum -y install yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install docker-ce

Enable Docker at boot:

systemctl enable docker

5. Install cephadm

5.1 Kylin V10

Download the cephadm package:

wget http://mirrors.163.com/ceph/rpm-octopus/el7/noarch/cephadm-15.2.17-0.el7.noarch.rpm

Install cephadm:

chmod 600 /var/log/tallylog
rpm -ivh cephadm-15.2.17-0.el7.noarch.rpm

5.2 CentOS 7

Download the cephadm script and make it executable:

curl https://raw.githubusercontent.com/ceph/ceph/v15.2.1/src/cephadm/cephadm -o cephadm
chmod +x cephadm

or download it from a mirror:

wget http://mirrors.163.com/ceph/rpm-octopus/el7/noarch/cephadm

Configure the Ceph repository based on the release name:

./cephadm add-repo --release octopus

Run the cephadm installer:

./cephadm install

Install the ceph-common package:

cephadm install ceph-common

Octopus package mirrors:

https://repo.huaweicloud.com/ceph/rpm-octopus/
http://mirrors.163.com/ceph/rpm-octopus/
http://mirrors.aliyun.com/ceph/rpm-octopus/

5.3 Modify cephadm

To use local images offline, edit the cephadm script and change pull to images in the cmd list of the _pull_image function, so that cephadm checks the local image cache instead of trying to pull:
vim /usr/sbin/cephadm
(screenshot of the edited _pull_image function omitted)
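
The exact code differs between cephadm versions, so the following is only an illustration of the change, assuming the function builds the container command as a Python list containing 'pull'; check the actual line first and adjust accordingly:

# illustration only -- verify the real line in your cephadm version first
grep -n "'pull'" /usr/sbin/cephadm
# if the function contains a line like: cmd = [container_path, 'pull', image]
# change 'pull' to 'images', in an editor or (assuming that exact form) with:
sed -i "s/\[container_path, 'pull', image\]/[container_path, 'images', image]/" /usr/sbin/cephadm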

6. Install Ceph

6.1 Prepare the images

6.1.1 Import the images

The offline images are available on Baidu Netdisk:

Link: https://pan.baidu.com/s/1UEkQo0XrwuCDI5u9H8sGkQ?pwd=zsd4
Extraction code: zsd4

Load the images offline:

docker load < ceph-v15.img
docker load < prometheus-v2.18.1.img
docker load < node-exporter-v0.18.1.img
docker load < ceph-grafana-6.7.4.img
docker load < alertmanager-v0.20.0.img

Alternatively, with internet access, pull the Octopus images:

docker pull quay.io/ceph/ceph:v15
docker pull quay.io/prometheus/prometheus:v2.18.1
docker pull quay.io/prometheus/node-exporter:v0.18.1
docker pull quay.io/ceph/ceph-grafana:6.7.4
docker pull quay.io/prometheus/alertmanager:v0.20.0

For Quincy, the corresponding images are:

docker pull quay.io/ceph/ceph:v17
docker pull quay.io/ceph/ceph-grafana:8.3.5
docker pull quay.io/prometheus/prometheus:v2.33.4
docker pull quay.io/prometheus/node-exporter:v1.3.1
docker pull quay.io/prometheus/alertmanager:v0.23.0
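
For reference, the .img files loaded above can be produced on an internet-connected machine with docker save and then copied to the offline hosts (file names here match the ones used earlier; adjust as needed):

docker save quay.io/ceph/ceph:v15 -o ceph-v15.img
docker save quay.io/prometheus/prometheus:v2.18.1 -o prometheus-v2.18.1.img
docker save quay.io/prometheus/node-exporter:v0.18.1 -o node-exporter-v0.18.1.img
docker save quay.io/ceph/ceph-grafana:6.7.4 -o ceph-grafana-6.7.4.img
docker save quay.io/prometheus/alertmanager:v0.20.0 -o alertmanager-v0.20.0.img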

6.1.2 Build the local registry (registryserver)

docker load < registry-2.img
mkdir -p /data/registry/
docker run -d -p 4000:5000 -v /data/registry/:/var/lib/registry/ --restart=always --name registry registry:2

Add a hostname entry for the registry:
vim /etc/hosts

172.25.0.141    registryserver

Note: this hostname entry should be the first line in /etc/hosts so that it can be resolved from inside cephadm shell.

6.1.3 Tag the images and push them to the registry

docker tag quay.io/ceph/ceph:v15 registryserver:4000/ceph/ceph:v15 
docker tag quay.io/prometheus/prometheus:v2.18.1 registryserver:4000/prometheus/prometheus:v2.18.1
docker tag quay.io/prometheus/node-exporter:v0.18.1 registryserver:4000/prometheus/node-exporter:v0.18.1
docker tag quay.io/ceph/ceph-grafana:6.7.4 registryserver:4000/ceph/ceph-grafana:6.7.4
docker tag quay.io/prometheus/alertmanager:v0.20.0 registryserver:4000/prometheus/alertmanager:v0.20.0

docker push registryserver:4000/ceph/ceph:v15 
docker push registryserver:4000/prometheus/prometheus:v2.18.1
docker push registryserver:4000/prometheus/node-exporter:v0.18.1
docker push registryserver:4000/ceph/ceph-grafana:6.7.4
docker push registryserver:4000/prometheus/alertmanager:v0.20.0
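
To confirm the pushes succeeded, the registry's catalog can be queried over the standard registry v2 API (port 4000 as mapped above):

curl http://registryserver:4000/v2/_catalog
# expected: a JSON document such as {"repositories":["ceph/ceph","ceph/ceph-grafana", ...]}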

6.2 Bootstrap deployment

mkdir -p /etc/ceph
cephadm bootstrap --mon-ip 172.25.0.141

This command does the following:

  • Creates monitor and manager daemons for the new cluster on the local host.
  • Generates a new SSH key for the Ceph cluster and adds it to the root user's /root/.ssh/authorized_keys file.
  • Writes a minimal configuration file needed to communicate with the new cluster to /etc/ceph/ceph.conf.
  • Writes a copy of the client.admin administrative (privileged!) secret key to /etc/ceph/ceph.client.admin.keyring.
  • Writes a copy of the public key to /etc/ceph/ceph.pub.
[root@ceph1 ~]# cephadm bootstrap --mon-ip 172.25.0.141
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: d20e3700-2d2f-11ee-9166-000c29aa07d2
Verifying IP 172.25.0.141 port 3300 ...
Verifying IP 172.25.0.141 port 6789 ...
Mon IP 172.25.0.141 is in CIDR network 172.25.0.0/24
Pulling container image quay.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr not available, waiting (4/10)...
mgr not available, waiting (5/10)...
mgr not available, waiting (6/10)...
mgr not available, waiting (7/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 5...
Mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host ceph1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 13...
Mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

             URL: https://ceph1:8443/
            User: admin
        Password: cx763mtlk7

You can access the Ceph CLI with:

        sudo /usr/sbin/cephadm shell --fsid d20e3700-2d2f-11ee-9166-000c29aa07d2 -c /etc/ceph/ceph.conf -k /etc/c

Please consider enabling telemetry to help improve Ceph:

        ceph telemetry on

For more information see:

        https://docs.ceph.com/docs/master/mgr/telemetry/

Bootstrap complete.

Log in to the web UI with the URL, user, and password shown in the output above.
(screenshot of the dashboard login page omitted)
The first login requires a password change; after changing it, the dashboard looks like this:
(screenshot of the dashboard after login omitted)

Run ceph commands through cephadm shell, or create an alias:

alias ceph='cephadm shell -- ceph'

Then ceph -s can be run directly on the host.

6.3 Host management

6.3.1 List available hosts

ceph orch host ls

[ceph: root@ceph1 /]# ceph orch host ls
HOST   ADDR   LABELS  STATUS
ceph1  ceph1

6.3.2 Add hosts

Add the new hosts to the cluster:

ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph3
scp -r /etc/ceph root@ceph2:/etc/
scp -r /etc/ceph root@ceph3:/etc/
ceph orch host add ceph1 172.25.0.141
ceph orch host add ceph2 172.25.0.142
ceph orch host add ceph3 172.25.0.143

ceph orch host ls

[ceph: root@ceph1 /]# ceph orch host ls
HOST   ADDR          LABELS  STATUS
ceph1  172.25.0.141
ceph2  172.25.0.142
ceph3  172.25.0.143

6.3.3 Remove a host

To remove a host, such as ceph3, from the cluster, first make sure that all services running on it have been stopped and removed:

ceph orch host rm ceph3
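
Before removing a host it is worth checking what is still scheduled on it. ceph orch ps can filter by daemon type (used the same way later in this article); the positional hostname filter is assumed to be available in this release, so fall back to plain ceph orch ps if it is not accepted:

ceph orch ps ceph3                    # daemons placed on ceph3
ceph orch ps --daemon-type osd        # all OSD daemons across the cluster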

6.4 Deploy MONs and MGRs

6.4.1 Restrict MONs to a specific subnet

ceph config set mon public_network  172.25.0.0/24

6.4.2 Change the default number of MONs

Check the status of each service with ceph orch ls, for example mon:

[ceph: root@ceph1 /]# ceph orch ls
NAME           RUNNING  REFRESHED  AGE  PLACEMENT    IMAGE NAME                                IMAGE ID
alertmanager       1/1  9m ago     6h   count:1      quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f
crash              2/3  9m ago     6h   *            quay.io/ceph/ceph:v15                     93146564743f
grafana            1/1  9m ago     6h   count:1      quay.io/ceph/ceph-grafana:6.7.4           557c83e11646
mgr                1/2  9m ago     6h   count:2      quay.io/ceph/ceph:v15                     93146564743f
mon                2/5  9m ago     6h   count:5      quay.io/ceph/ceph:v15                     93146564743f
node-exporter      1/3  10m ago    6h   *            quay.io/prometheus/node-exporter:v0.18.1  mix
osd.None           3/0  9m ago     -    <unmanaged>  quay.io/ceph/ceph:v15                     93146564743f
prometheus         1/1  9m ago     6h   count:1      quay.io/prometheus/prometheus:v2.18.1     de242295e225

By default a Ceph cluster allows up to 5 MONs and 2 MGRs. To change the MON count to 3, run:

ceph orch apply mon 3
[ceph: root@ceph1 /]# ceph orch ls
NAME           RUNNING  REFRESHED  AGE  PLACEMENT                  IMAGE NAME                                IMAGE ID
alertmanager       1/1  35s ago    6h   count:1                    quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f
crash              3/3  39s ago    6h   *                          quay.io/ceph/ceph:v15                     93146564743f
grafana            1/1  35s ago    6h   count:1                    quay.io/ceph/ceph-grafana:6.7.4           557c83e11646
mgr                2/2  39s ago    6h   count:2                    quay.io/ceph/ceph:v15                     93146564743f
mon                3/3  39s ago    11m  ceph1;ceph2;ceph3;count:3  quay.io/ceph/ceph:v15                     93146564743f
node-exporter      1/3  39s ago    6h   *                          quay.io/prometheus/node-exporter:v0.18.1  mix
osd.None           3/0  35s ago    -    <unmanaged>                quay.io/ceph/ceph:v15                     93146564743f
prometheus         1/1  35s ago    6h   count:1                    quay.io/prometheus/prometheus:v2.18.1     de242295e225

6.4.3 Deploy MONs and MGRs on specific hosts

Deploy MONs on specific hosts:

ceph orch apply mon --placement="3 ceph1 ceph2 ceph3"

Deploy MGRs on specific hosts:

ceph orch apply mgr --placement="3 ceph1 ceph2 ceph3"

6.5 Deploy OSDs

6.5.1 List devices

Run ceph orch device ls to list the storage devices on the cluster hosts:

[ceph: root@ceph1 /]# ceph orch device ls
Hostname  Path      Type  Serial  Size   Health   Ident  Fault  Available
ceph1     /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph1     /dev/sdc  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph1     /dev/sdd  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph2     /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph2     /dev/sdc  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph2     /dev/sdd  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph3     /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph3     /dev/sdc  hdd           21.4G  Unknown  N/A    N/A    Yes
ceph3     /dev/sdd  hdd           21.4G  Unknown  N/A    N/A    Yes

6.5.2 Create OSDs

A storage device is considered available if all of the following conditions are met:

  • The device must have no partitions.
  • The device must not have any LVM state.
  • The device must not be mounted.
  • The device must not contain a file system.
  • The device must not contain a Ceph BlueStore OSD.
  • The device must be larger than 5 GB.
    Ceph refuses to provision an OSD on a device that is not available.
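
If a disk fails these checks because it still carries leftover partitions or LVM metadata, it can usually be wiped through the orchestrator before retrying (hostname and device below are examples):

ceph orch device zap ceph1 /dev/sdb --force   # destroys all data and metadata on the device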

Tell Ceph to consume any available and unused storage device:

ceph orch apply osd --all-available-devices

Alternatively, create OSDs from specific devices on specific hosts.
Add the devices on ceph1 as OSDs:

ceph orch daemon add osd ceph1:/dev/sdb
ceph orch daemon add osd ceph1:/dev/sdc
ceph orch daemon add osd ceph1:/dev/sdd

Add the devices on ceph2 as OSDs:

ceph orch daemon add osd ceph2:/dev/sdb
ceph orch daemon add osd ceph2:/dev/sdc
ceph orch daemon add osd ceph2:/dev/sdd

Add the devices on ceph3 as OSDs:

ceph orch daemon add osd ceph3:/dev/sdb
ceph orch daemon add osd ceph3:/dev/sdc
ceph orch daemon add osd ceph3:/dev/sdd

6.5.3 Remove an OSD

ceph orch osd rm 3

6.5.4 Start, stop, and restart an OSD service

ceph orch daemon start osd.3
ceph orch daemon stop osd.3
ceph orch daemon restart osd.3

6.6 Deploy RGW

6.6.1 Create a realm

First create a realm:

radosgw-admin realm create --rgw-realm=radosgw --default

The result:

[ceph: root@ceph1 /]# radosgw-admin realm create --rgw-realm=radosgw --default
{
    "id": "c4b75303-5ef6-4d82-a60c-efa4ceea2bc2",
    "name": "radosgw",
    "current_period": "d4267c01-b762-473b-bcf5-f278b1ea608a",
    "epoch": 1
}

6.6.2 Create a zonegroup

Create a zonegroup:

radosgw-admin zonegroup create --rgw-zonegroup=default --master --default

The result:

[ceph: root@ceph1 /]# radosgw-admin zonegroup create --rgw-zonegroup=default --master --default
{
    "id": "6a3dfbbf-3c09-4cc2-9644-1c17524ee4d1",
    "name": "default",
    "api_name": "default",
    "is_master": "true",
    "endpoints": [],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "",
    "zones": [],
    "placement_targets": [],
    "default_placement": "",
    "realm_id": "c4b75303-5ef6-4d82-a60c-efa4ceea2bc2",
    "sync_policy": {
        "groups": []
    }
}

6.6.3 Create a zone

Create a zone:

radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=cn --master --default

The result:

[ceph: root@ceph1 /]# radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=cn --master --default
{
    "id": "4531574f-a84b-464f-962b-7a29aad20080",
    "name": "cn",
    "domain_root": "cn.rgw.meta:root",
    "control_pool": "cn.rgw.control",
    "gc_pool": "cn.rgw.log:gc",
    "lc_pool": "cn.rgw.log:lc",
    "log_pool": "cn.rgw.log",
    "intent_log_pool": "cn.rgw.log:intent",
    "usage_log_pool": "cn.rgw.log:usage",
    "roles_pool": "cn.rgw.meta:roles",
    "reshard_pool": "cn.rgw.log:reshard",
    "user_keys_pool": "cn.rgw.meta:users.keys",
    "user_email_pool": "cn.rgw.meta:users.email",
    "user_swift_pool": "cn.rgw.meta:users.swift",
    "user_uid_pool": "cn.rgw.meta:users.uid",
    "otp_pool": "cn.rgw.otp",
    "system_key": {
        "access_key": "",
        "secret_key": ""
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "cn.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "cn.rgw.buckets.data"
                    }
                },
                "data_extra_pool": "cn.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ],
    "realm_id": "c4b75303-5ef6-4d82-a60c-efa4ceea2bc2"
}

6.6.4 Deploy radosgw

Deploy radosgw daemons for this realm and zone:

ceph orch apply rgw radosgw cn --placement="3 ceph1 ceph2 ceph3"

6.6.5 Verify that the RGW containers are running on each node

ceph orch ps --daemon-type rgw

The result:

[ceph: root@ceph1 /]# ceph orch ps --daemon-type rgw
NAME                         HOST   STATUS         REFRESHED  AGE  VERSION  IMAGE NAME             IMAGE ID      CONTAINER ID
rgw.radosgw.cn.ceph1.jzxfhf  ceph1  running (48s)  38s ago    48s  15.2.17  quay.io/ceph/ceph:v15  93146564743f  f2b23eac68b0
rgw.radosgw.cn.ceph2.hdbbtx  ceph2  running (55s)  42s ago    55s  15.2.17  quay.io/ceph/ceph:v15  93146564743f  1a96776d7b0f
rgw.radosgw.cn.ceph3.arlfch  ceph3  running (52s)  41s ago    52s  15.2.17  quay.io/ceph/ceph:v15  93146564743f  ab23d7f7a46c
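
To exercise the gateways, an S3 user can be created with radosgw-admin (the uid and display name below are arbitrary examples); the access_key and secret_key printed in its output can then be used by any S3 client pointed at the gateway nodes:

radosgw-admin user create --uid=testuser --display-name="Test User"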

6.7 Deploy MDS

The Ceph File System, CephFS, is POSIX-compatible and built on top of RADOS, Ceph's object store; see the CephFS documentation for details.

6.7.1 Create the data pool

Create a pool for CephFS data:

ceph osd pool create cephfs_data

6.7.2 Create the metadata pool

Create a pool for CephFS metadata:

ceph osd pool create cephfs_metadata

6.7.3 Create the file system

Create a file system named cephfs:

ceph fs new cephfs cephfs_metadata cephfs_data

6.7.4 View the file system

List the file systems:

ceph fs ls

Set the maximum number of MDS daemons for cephfs to 3:

ceph fs set cephfs max_mds 3

If there is only one MDS, set max_mds to 1; and if there are not enough OSDs for the default replica count, set the pool size to 1:

ceph fs set cephfs max_mds 1
ceph osd pool set cephfs_metadata size 1
ceph osd pool set cephfs_data size 1

6.7.5 Deploy three MDS daemons

Deploy the MDS service on specific hosts:

ceph orch apply mds cephfs --placement="3 ceph1 ceph2 ceph3"

6.7.6 Verify that the MDS daemons are deployed

ceph orch ps --daemon-type mds

The result:

[ceph: root@ceph1 /]# ceph orch ps --daemon-type mds
NAME                     HOST   STATUS         REFRESHED  AGE  VERSION  IMAGE NAME             IMAGE ID      CONTAINER ID
mds.cephfs.ceph1.gfvgda  ceph1  running (10s)  4s ago     10s  15.2.17  quay.io/ceph/ceph:v15  93146564743f  7f216bfe781f
mds.cephfs.ceph2.ytkxqc  ceph2  running (15s)  6s ago     15s  15.2.17  quay.io/ceph/ceph:v15  93146564743f  dec713c46919
mds.cephfs.ceph3.tszetu  ceph3  running (13s)  6s ago     13s  15.2.17  quay.io/ceph/ceph:v15  93146564743f  70ee41d1a81b

6.7.7 Check the file system status

ceph mds stat

The result:

cephfs:3 {0=cephfs.ceph2.ytkxqc=up:active,1=cephfs.ceph3.tszetu=up:active,2=cephfs.ceph1.gfvgda=up:active}

Check the overall Ceph status:

[ceph: root@ceph1 /]# ceph -s
  cluster:
    id:     d20e3700-2d2f-11ee-9166-000c29aa07d2
    health: HEALTH_WARN
            insufficient standby MDS daemons available

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 24h)
    mgr: ceph1.kxhbab(active, since 45m), standbys: ceph2.rovomy, ceph3.aseinm
    mds: cephfs:3 {0=cephfs.ceph2.ytkxqc=up:active,1=cephfs.ceph3.tszetu=up:active,2=cephfs.ceph1.gfvgda=up:active}
    osd: 9 osds: 9 up (since 22h), 9 in (since 22h)
    rgw: 3 daemons active (radowgw.cn.ceph1.jzxfhf, radowgw.cn.ceph2.hdbbtx, radowgw.cn.ceph3.arlfch)

  task status:

  data:
    pools:   9 pools, 233 pgs
    objects: 269 objects, 14 KiB
    usage:   9.2 GiB used, 171 GiB / 180 GiB avail
    pgs:     233 active+clean

  io:
    client:   1.2 KiB/s rd, 1 op/s rd, 0 op/s wr

6.7.8 Mount the file system

Mount the file system as the admin user:

mount.ceph ceph1:6789,ceph2:6789,ceph3:6789:/ /mnt/cephfs -o name=admin

Create a user named cephfs for client access to CephFS:

ceph auth get-or-create client.cephfs mon 'allow r' mds 'allow r, allow rw path=/' osd 'allow rw pool=cephfs_data' -o ceph.client.cephfs.keyring

View the generated ceph.client.cephfs.keyring file, or print the key with:

ceph auth get-key client.cephfs

Mount CephFS to a local directory on each node.
Run on every node:

mkdir /mnt/cephfs/
mount -t ceph ceph1:6789,ceph2:6789,ceph3:6789:/ /mnt/cephfs/ -o name=cephfs,secret=<key of the cephfs user>

Edit /etc/fstab on each node so CephFS is mounted automatically at boot, adding:

ceph1:6789,ceph2:6789,ceph3:6789:/    /mnt/cephfs    ceph    name=cephfs,secretfile=<path to a file containing the cephfs user's key>,noatime,_netdev    0       2
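
Note that secretfile expects the path of a file containing the key rather than the key itself; one way to create it (the path below is an example):

ceph auth get-key client.cephfs > /etc/ceph/cephfs.secret
chmod 600 /etc/ceph/cephfs.secret
# then reference it in /etc/fstab as secretfile=/etc/ceph/cephfs.secret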

6.8 Deploy the iSCSI gateway

6.8.1 Create the pool

ceph osd pool create iscsi_pool
ceph osd pool application enable iscsi_pool rbd

6.8.2 Deploy the iSCSI gateway

ceph orch apply iscsi iscsi_pool admin admin --placement="1 ceph1"

6.8.3 List hosts and daemons

ceph orch ps --daemon_type=iscsi

The result:

[ceph: root@ceph1 /]# ceph orch ps --daemon_type=iscsi
NAME                      HOST   STATUS         REFRESHED  AGE  VERSION  IMAGE NAME             IMAGE ID      CONTAINER ID
iscsi.iscsi.ceph1.dszfku  ceph1  running (11s)  2s ago     12s  3.5      quay.io/ceph/ceph:v15  93146564743f  b4395ebd49b6

6.8.4 Remove the iSCSI gateway

List the services:

[ceph: root@ceph1 /]# ceph orch ls
NAME            RUNNING  REFRESHED  AGE  PLACEMENT                  IMAGE NAME                                IMAGE ID
alertmanager        1/1  4m ago     5h   count:1                    quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f
crash               3/3  12m ago    5h   *                          quay.io/ceph/ceph:v15                     93146564743f
grafana             1/1  4m ago     5h   count:1                    quay.io/ceph/ceph-grafana:6.7.4           557c83e11646
iscsi.iscsi         1/1  4m ago     4m   ceph1;count:1              quay.io/ceph/ceph:v15                     93146564743f
mds.cephfs          3/3  12m ago    4h   ceph1;ceph2;ceph3;count:3  quay.io/ceph/ceph:v15                     93146564743f
mgr                 3/3  12m ago    5h   ceph1;ceph2;ceph3;count:3  quay.io/ceph/ceph:v15                     93146564743f
mon                 3/3  12m ago    23h  ceph1;ceph2;ceph3;count:3  quay.io/ceph/ceph:v15                     93146564743f
node-exporter       3/3  12m ago    5h   *                          quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf
osd.None            9/0  12m ago    -    <unmanaged>                quay.io/ceph/ceph:v15                     93146564743f
prometheus          1/1  4m ago     5h   count:1                    quay.io/prometheus/prometheus:v2.18.1     de242295e225
rgw.radowgw.cn      3/3  12m ago    5h   ceph1;ceph2;ceph3;count:3  quay.io/ceph/ceph:v15                     93146564743f

Remove the iSCSI gateway:

ceph orch rm iscsi.iscsi

7. Troubleshooting

7.1 The file system does not start

Check the file system status with ceph mds stat and ceph health detail; the output was:

HEALTH_ERR 1 filesystem is offline; 1 filesystem is online with fewer MDS than max_mds
[ERR] MDS_ALL_DOWN: 1 filesystem is offline
    fs cephfs is offline because no MDS is active for it.
[WRN] MDS_UP_LESS_THAN_MAX: 1 filesystem is online with fewer MDS than max_mds
    fs cephfs has 0 MDS online, but wants 1

Use ceph orch ls to check whether MDS has been deployed and started successfully; if it has not been deployed, deploy it with:

ceph orch apply mds cephfs --placement="1 ceph1"

7.2 Mounting the file system fails with mount error 2 = No such file or directory

ceph mds stat shows the file system stuck in the creating state:

cephfs:1 {0=cephfs.ceph1.npyywh=up:creating}

ceph -s shows:

  cluster:
    id:     dd6e5410-221f-11ee-b47c-000c29fd771a
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            Reduced data availability: 64 pgs inactive
            Degraded data redundancy: 65 pgs undersized

  services:
    mon: 1 daemons, quorum ceph1 (age 15h)
    mgr: ceph1.vtottx(active, since 15h)
    mds: cephfs:1 {0=cephfs.ceph1.npyywh=up:creating}
    osd: 3 osds: 3 up (since 15h), 3 in (since 15h); 1 remapped pgs

  data:
    pools:   3 pools, 65 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 57 GiB / 60 GiB avail
    pgs:     98.462% pgs not active
             64 undersized+peered
             1  active+undersized+remapped

The following problems are reported:

1 MDSs report slow metadata IOs
Reduced data availability: 64 pgs inactive
Degraded data redundancy: 65 pgs undersized

Because this is a single-node cluster, there are not enough OSDs to satisfy the default three-replica requirement. Set max_mds to 1 and set the size of cephfs_metadata and cephfs_data (and of any other pools) to 1:

ceph fs set cephfs max_mds 1
ceph osd pool set cephfs_metadata size 1
ceph osd pool set cephfs_data size 1

ceph mds stat now reports:

cephfs:1 {0=cephfs.ceph1.npyywh=up:active}

The state is active and the file system can be mounted normally.

7.3 /var/log/tallylog is either world writable or not a normal file

Installing the cephadm-15.2.17-0.el7.noarch.rpm package fails with:

pam_tally2: /var/log/tallylog is either world writable or not a normal file
pam_tally2: Authentication error

Fix it with:

chmod 600 /var/log/tallylog
pam_tally2 --user root --reset

7.4 RuntimeError: uid/gid not found

Running cephadm shell fails with:

Inferring fsid c8fb20bc-247c-11ee-a39c-000c29aa07d2
Inferring config /var/lib/ceph/c8fb20bc-247c-11ee-a39c-000c29aa07d2/mon.node1/config
Using recent ceph image quay.io/ceph/ceph@<none>
Non-zero exit code 125 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint stat -e CONTAINER_IMAGE=quay.io/ceph/ceph@<none> -e NODE_NAME=node1 quay.io/ceph/ceph@<none> -c %u %g /var/lib/ceph
stat: stderr /usr/bin/docker: invalid reference format.
stat: stderr See '/usr/bin/docker run --help'.
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 6250, in <module>
    r = args.func()
  File "/usr/sbin/cephadm", line 1381, in _infer_fsid
    return func()
  File "/usr/sbin/cephadm", line 1412, in _infer_config
    return func()
  File "/usr/sbin/cephadm", line 1440, in _infer_image
    return func()
  File "/usr/sbin/cephadm", line 3573, in command_shell
    make_log_dir(args.fsid)
  File "/usr/sbin/cephadm", line 1538, in make_log_dir
    uid, gid = extract_uid_gid()
  File "/usr/sbin/cephadm", line 2155, in extract_uid_gid
    raise RuntimeError('uid/gid not found')
RuntimeError: uid/gid not found

Analysis shows that cephadm shell starts a Ceph container. In this offline environment the image was imported with docker load, which loses the RepoDigests information, so the container cannot be started. The workaround is to run a local registry (docker registry), push the image into it, and pull it back to repopulate the RepoDigests, on all hosts. The registry hostname entry also needs to be the first line in /etc/hosts so that it resolves correctly.
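
A sketch of that workaround using the local registry built in section 6.1.2 (run on every host; the image name follows the tagging scheme used earlier):

docker tag quay.io/ceph/ceph:v15 registryserver:4000/ceph/ceph:v15
docker push registryserver:4000/ceph/ceph:v15
docker pull registryserver:4000/ceph/ceph:v15     # pulling back populates RepoDigests
docker images --digests | grep ceph               # verify a digest is now listed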

7.5 iptables: No chain/target/match by that name.

Running docker restart registry fails with:

Error response from daemon: Cannot restart container registry: driver failed programming external connectivity on endpoint registry (dd9a9c15a451daa6abd2b85e840d7856c5c5f98c1fb1ae35897d3fbb28e2997c):  (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 4000 -j DNAT --to-destination 172.17.0.2:5000 ! -i docker0: iptables: No chain/target/match by that name.
 (exit status 1))

Kylin uses firewalld in place of iptables. firewalld was running when the container was first started, so the corresponding iptables chains existed; after firewalld was disabled, Docker could no longer find them. The fix is to restart Docker so that it recreates its rules:

systemctl restart docker

7.6 /usr/bin/ceph: timeout after 60 seconds

Running cephadm bootstrap fails as follows:

cephadm bootstrap --mon-ip 172.25.0.141
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 65ae4594-2d2c-11ee-b17b-000c29aa07d2
Verifying IP 172.25.0.141 port 3300 ...
Verifying IP 172.25.0.141 port 6789 ...
Mon IP 172.25.0.141 is in CIDR network 172.25.0.0/24
Pulling container image quay.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
/usr/bin/ceph: timeout after 60 seconds
Non-zero exit code -9 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=quay.io/ceph/ceph:v15 -e NODE_NAME=ceph1 -v /var/lib/ceph/65ae4594-2d2c-11ee-b17b-000c29aa07d2/mon.ceph1:/var/lib/ceph/mon/ceph-ceph1:z -v /tmp/ceph-tmpuylyqrau:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp85t1j6sg:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v15 status
mon not available, waiting (1/10)...
/usr/bin/ceph: timeout after 60 seconds
Non-zero exit code -9 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=quay.io/ceph/ceph:v15 -e NODE_NAME=ceph1 -v /var/lib/ceph/65ae4594-2d2c-11ee-b17b-000c29aa07d2/mon.ceph1:/var/lib/ceph/mon/ceph-ceph1:z -v /tmp/ceph-tmpuylyqrau:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp85t1j6sg:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v15 status
mon not available, waiting (2/10)...
/usr/bin/ceph: timeout after 60 seconds
Non-zero exit code -9 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=quay.io/ceph/ceph:v15 -e NODE_NAME=ceph1 -v /var/lib/ceph/65ae4594-2d2c-11ee-b17b-000c29aa07d2/mon.ceph1:/var/lib/ceph/mon/ceph-ceph1:z -v /tmp/ceph-tmpuylyqrau:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp85t1j6sg:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v15 status
mon not available, waiting (3/10)...
/usr/bin/ceph: timeout after 60 seconds
Non-zero exit code -9 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=quay.io/ceph/ceph:v15 -e NODE_NAME=ceph1 -v /var/lib/ceph/65ae4594-2d2c-11ee-b17b-000c29aa07d2/mon.ceph1:/var/lib/ceph/mon/ceph-ceph1:z -v /tmp/ceph-tmpuylyqrau:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp85t1j6sg:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v15 status
mon not available, waiting (4/10)...
/usr/bin/ceph: timeout after 60 seconds
Non-zero exit code -9 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=quay.io/ceph/ceph:v15 -e NODE_NAME=ceph1 -v /var/lib/ceph/65ae4594-2d2c-11ee-b17b-000c29aa07d2/mon.ceph1:/var/lib/ceph/mon/ceph-ceph1:z -v /tmp/ceph-tmpuylyqrau:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp85t1j6sg:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v15 status
mon not available, waiting (5/10)...

Then run the container manually:

docker run -it --ipc=host --net=host -e CONTAINER_IMAGE=quay.io/ceph/ceph:v15 -e NODE_NAME=ceph1 -v /var/lib/ceph/65ae4594-2d2c-11ee-b17b-000c29aa07d2/mon.ceph1:/var/lib/ceph/mon/ceph-ceph1:z -v /tmp/ceph-tmpuylyqrau:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp85t1j6sg:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v15 bash

Inside the container, ceph -s reports:

[root@ceph1 /]# ceph -s
2023-07-28T10:01:34.083+0000 7f0dc4f41700  0 monclient(hunting): authenticate timed out after 300
2023-07-28T10:06:34.087+0000 7f0dc4f41700  0 monclient(hunting): authenticate timed out after 300
2023-07-28T10:11:34.087+0000 7f0dc4f41700  0 monclient(hunting): authenticate timed out after 300

According to the references, this is related to DNS resolution. Add a DNS server to /etc/resolv.conf:
vim /etc/resolv.conf

nameserver 223.5.5.5

This resolved the problem.

7.7 ERROR: Cannot infer an fsid, one must be specified

Running ceph -s fails with:

ERROR: Cannot infer an fsid, one must be specified: ['65ae4594-2d2c-11ee-b17b-000c29aa07d2', '70eb8f7c-2d2f-11ee-8265-000c29aa07d2', 'd20e3700-2d2f-11ee-9166-000c29aa07d2']

This is caused by earlier failed deployments. Remove the clusters with the failed fsids:

cephadm  rm-cluster --fsid 65ae4594-2d2c-11ee-b17b-000c29aa07d2 --force
cephadm  rm-cluster --fsid 70eb8f7c-2d2f-11ee-8265-000c29aa07d2 --force

7.8 2 failed cephadm daemon(s): daemon node-exporter.ceph2 on ceph2 is in error state

ceph -s reports 2 failed cephadm daemon(s):

[ceph: root@ceph1 /]# ceph -s
  cluster:
    id:     d20e3700-2d2f-11ee-9166-000c29aa07d2
    health: HEALTH_WARN
            2 failed cephadm daemon(s)
            Degraded data redundancy: 1 pg undersized

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 9m)
    mgr: ceph1.kxhbab(active, since 6h), standbys: ceph3.aseinm
    osd: 3 osds: 3 up (since 42m), 3 in (since 42m); 1 remapped pgs

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 57 GiB / 60 GiB avail
    pgs:     1 active+undersized+remapped

Check with ceph health detail:

HEALTH_WARN 2 failed cephadm daemon(s); Degraded data redundancy: 1 pg undersized
[WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s)
    daemon node-exporter.ceph2 on ceph2 is in error state
    daemon node-exporter.ceph3 on ceph3 is in error state
[WRN] PG_DEGRADED: Degraded data redundancy: 1 pg undersized
    pg 1.0 is stuck undersized for 44m, current state active+undersized+remapped, last acting [1,0]

The node-exporter daemons on ceph2 and ceph3 are at fault. Check again with ceph orch ps:

[ceph: root@ceph1 /]# ceph orch ps
NAME                 HOST   STATUS         REFRESHED  AGE  VERSION    IMAGE NAME                                IMAGE ID      CONTAINER ID
alertmanager.ceph1   ceph1  running (17m)  6m ago     6h   0.20.0     quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f  f640d60309a8
crash.ceph1          ceph1  running (6h)   6m ago     6h   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  167923df16d6
crash.ceph2          ceph2  running (45m)  6m ago     45m  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  36930ffad980
crash.ceph3          ceph3  running (7h)   6m ago     7h   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  3cf33326be8f
grafana.ceph1        ceph1  running (66m)  6m ago     6h   6.7.4      quay.io/ceph/ceph-grafana:6.7.4           557c83e11646  a9f1cd6dd382
mgr.ceph1.kxhbab     ceph1  running (6h)   6m ago     7h   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  c738e894f955
mgr.ceph3.aseinm     ceph3  running (7h)   6m ago     7h   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  2657cef61946
mon.ceph1            ceph1  running (6h)   6m ago     7h   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  ac8d2bf766d9
mon.ceph2            ceph2  running (45m)  6m ago     45m  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  30fc64339e92
mon.ceph3            ceph3  running (7h)   6m ago     7h   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  b98335e1b0e1
node-exporter.ceph1  ceph1  running (66m)  6m ago     6h   0.18.1     quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  7bb7cc89e4bf
node-exporter.ceph2  ceph2  error          6m ago     7h   <unknown>  quay.io/prometheus/node-exporter:v0.18.1  <unknown>     <unknown>
node-exporter.ceph3  ceph3  error          6m ago     7h   <unknown>  quay.io/prometheus/node-exporter:v0.18.1  <unknown>     <unknown>
osd.0                ceph1  running (45m)  6m ago     45m  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  56418e7d64a2
osd.1                ceph1  running (45m)  6m ago     45m  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  04be93b74209
osd.2                ceph1  running (45m)  6m ago     45m  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  15ca4a2a1acc
prometheus.ceph1     ceph1  running (17m)  6m ago     6h   2.18.1     quay.io/prometheus/prometheus:v2.18.1     de242295e225  8dfaf3f5b9c2

node-exporter on ceph2 and ceph3 is in error state; restart the daemons:

ceph orch daemon restart node-exporter.ceph2
ceph orch daemon restart node-exporter.ceph3

After the restart everything is back to normal:

[ceph: root@ceph1 /]# ceph -s
  cluster:
    id:     d20e3700-2d2f-11ee-9166-000c29aa07d2
    health: HEALTH_WARN
            Degraded data redundancy: 1 pg undersized

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 22m)
    mgr: ceph1.kxhbab(active, since 6h), standbys: ceph3.aseinm
    osd: 3 osds: 3 up (since 56m), 3 in (since 56m); 1 remapped pgs

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 57 GiB / 60 GiB avail
    pgs:     1 active+undersized+remapped

7.9 1 failed cephadm daemon(s): daemon osd.3 on ceph2 is in unknown state

ceph -s reports 1 failed cephadm daemon(s):

[ceph: root@ceph1 /]# ceph -s
  cluster:
    id:     d20e3700-2d2f-11ee-9166-000c29aa07d2
    health: HEALTH_WARN
            1 failed cephadm daemon(s)

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 49m)
    mgr: ceph1.kxhbab(active, since 6h), standbys: ceph3.aseinm
    osd: 9 osds: 8 up (since 116s), 8 in (since 116s)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   8.0 GiB used, 152 GiB / 160 GiB avail
    pgs:     1 active+clean


Check with ceph health detail:

[ceph: root@ceph1 /]# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon osd.3 on ceph2 is in unknown state

osd.3 on ceph2 is at fault. Check again with ceph orch ps:

[ceph: root@ceph1 /]# ceph orch ps
NAME                 HOST   STATUS          REFRESHED  AGE   VERSION    IMAGE NAME                                IMAGE ID      CONTAINER ID
alertmanager.ceph1   ceph1  running (53m)   2m ago     7h    0.20.0     quay.io/prometheus/alertmanager:v0.20.0   0881eb8f169f  f640d60309a8
crash.ceph1          ceph1  running (6h)    2m ago     7h    15.2.17    quay.io/ceph/ceph:v15                     93146564743f  167923df16d6
crash.ceph2          ceph2  running (82m)   77s ago    82m   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  36930ffad980
crash.ceph3          ceph3  running (7h)    31s ago    7h    15.2.17    quay.io/ceph/ceph:v15                     93146564743f  3cf33326be8f
grafana.ceph1        ceph1  running (103m)  2m ago     7h    6.7.4      quay.io/ceph/ceph-grafana:6.7.4           557c83e11646  a9f1cd6dd382
mgr.ceph1.kxhbab     ceph1  running (6h)    2m ago     7h    15.2.17    quay.io/ceph/ceph:v15                     93146564743f  c738e894f955
mgr.ceph3.aseinm     ceph3  running (7h)    31s ago    7h    15.2.17    quay.io/ceph/ceph:v15                     93146564743f  2657cef61946
mon.ceph1            ceph1  running (6h)    2m ago     7h    15.2.17    quay.io/ceph/ceph:v15                     93146564743f  ac8d2bf766d9
mon.ceph2            ceph2  running (82m)   77s ago    82m   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  30fc64339e92
mon.ceph3            ceph3  running (7h)    31s ago    7h    15.2.17    quay.io/ceph/ceph:v15                     93146564743f  b98335e1b0e1
node-exporter.ceph1  ceph1  running (103m)  2m ago     7h    0.18.1     quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  7bb7cc89e4bf
node-exporter.ceph2  ceph2  running (34m)   77s ago    8h    0.18.1     quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  9cde6dd53b22
node-exporter.ceph3  ceph3  running (33m)   31s ago    8h    0.18.1     quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  106d248979bc
osd.0                ceph1  running (82m)   2m ago     82m   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  56418e7d64a2
osd.1                ceph1  running (81m)   2m ago     81m   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  04be93b74209
osd.2                ceph1  running (81m)   2m ago     81m   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  15ca4a2a1acc
osd.3                ceph2  unknown         77s ago    2m    <unknown>  quay.io/ceph/ceph:v15                     <unknown>     <unknown>
osd.4                ceph2  running (98s)   77s ago    100s  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  02800b38df89
osd.5                ceph2  running (82s)   77s ago    84s   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  05ce4a9c9588
osd.6                ceph3  running (60s)   31s ago    61s   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  0d22f79c41ab
osd.7                ceph3  running (45s)   31s ago    47s   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  a80a852550e3
osd.8                ceph3  running (32s)   31s ago    33s   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  51ebec72bb3f
prometheus.ceph1     ceph1  running (53m)   2m ago     7h    2.18.1     quay.io/prometheus/prometheus:v2.18.1     de242295e225  8dfaf3f5b9c2

osd.3 on ceph2 is in unknown state.

Check the OSD status with ceph osd tree:

[ceph: root@ceph1 /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         0.15588  root default
-3         0.05846      host ceph1
 0    hdd  0.01949          osd.0       up   1.00000  1.00000
 1    hdd  0.01949          osd.1       up   1.00000  1.00000
 2    hdd  0.01949          osd.2       up   1.00000  1.00000
-5         0.03897      host ceph2
 4    hdd  0.01949          osd.4       up   1.00000  1.00000
 5    hdd  0.01949          osd.5       up   1.00000  1.00000
-7         0.05846      host ceph3
 6    hdd  0.01949          osd.6       up   1.00000  1.00000
 7    hdd  0.01949          osd.7       up   1.00000  1.00000
 8    hdd  0.01949          osd.8       up   1.00000  1.00000
 3               0  osd.3             down         0  1.00000

osd.3 is down; restart the daemon:

ceph orch daemon restart osd.3

After a while, ceph orch ps still shows osd.3 in error state.

Delete and redeploy it instead. Log in to ceph2 and reinitialize the disk so that it can be used for an OSD again:

dmsetup rm ceph--89a94f3b--e5ef--4ec9--b828--d86ea84d6540-osd--block--e4a8e9c8--20e6--4847--8176--510533616844
vgremove ceph-89a94f3b-e5ef-4ec9-b828-d86ea84d6540
pvremove /dev/sdb
mkfs.xfs -f /dev/sdb

On ceph1, remove the OSD and add it back:

ceph osd rm 3
ceph orch daemon rm osd.3 --force
ceph orch daemon add osd ceph2:/dev/sdb

After redeployment the cluster is healthy; the disk presumably had some problem initially:

[ceph: root@ceph1 /]# ceph -s
  cluster:
    id:     d20e3700-2d2f-11ee-9166-000c29aa07d2
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 16h)
    mgr: ceph1.kxhbab(active, since 22h), standbys: ceph3.aseinm
    osd: 9 osds: 9 up (since 15h), 9 in (since 15h)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   9.1 GiB used, 171 GiB / 180 GiB avail
    pgs:     1 active+clean


8. References

https://docs.ceph.com/en/latest/cephadm/
https://docs.ceph.com/en/latest/cephfs/
https://blog.csdn.net/DoloresOOO/article/details/106855093
https://blog.csdn.net/XU_sun_/article/details/119909860
https://blog.csdn.net/networken/article/details/106870859
https://zhuanlan.zhihu.com/p/598832268
https://www.jianshu.com/p/b2aab379d7ec
https://blog.csdn.net/JineD/article/details/113886368
http://dbaselife.com/doc/752/
http://www.chenlianfu.com/?p=3388
http://mirrors.163.com/ceph/rpm-octopus/el7/noarch/
https://blog.csdn.net/qq_27979109/article/details/120345676#3mondocker_runcephconf_718
https://cloud-atlas.readthedocs.io/zh_CN/latest/ceph/deploy/install_mobile_cloud_ceph/debug_ceph_authenticate_time_out.html
https://access.redhat.com/documentation/zh-cn/red_hat_ceph_storage/5/html/operations_guide/introduction-to-the-ceph-orchestrator
