#### Environment
```sh
kubernetes 1.18.20
rook-ceph 1.6.10
ceph version 15.2.13
```
`Note: each Kubernetes node needs an unused raw disk for the Ceph cluster to consume, and the Kubernetes version must match the corresponding rook-ceph release.`
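Before installing, it is worth sanity-checking that a candidate disk really carries no filesystem, since Rook will skip devices that are already in use. A minimal sketch (the `is_unused_disk` helper and the sample device lines are hypothetical; in practice feed it real `lsblk -f -o NAME,FSTYPE` output):

```shell
# Hypothetical helper: a device is safe for a Ceph OSD only if the FSTYPE
# column (field 2 of `lsblk -f -o NAME,FSTYPE` output) is empty.
is_unused_disk() {
  fstype=$(echo "$1" | awk '{print $2}')
  [ -z "$fstype" ]
}

# Example against captured output lines (replace with real lsblk output):
is_unused_disk "sdb"     && echo "sdb looks unused"
is_unused_disk "sda xfs" || echo "sda already has a filesystem"
```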
#### Install Rook
##### Download Rook
```sh
$ git clone --single-branch --branch v1.6.10 https://github.com/rook/rook.git  # Kubernetes 1.18.20 does not support rook-ceph 1.7.x
```
##### Deploy the Rook operator
```sh
$ cd rook/cluster/examples/kubernetes/ceph
$ kubectl create -f crds.yaml -f common.yaml -f operator.yaml
# verify the rook-ceph-operator is in the `Running` state before proceeding
$ kubectl -n rook-ceph get pod
```
#### Install the Ceph cluster
```sh
$ kubectl create -f cluster.yaml
```
Check the cluster status:
```sh
$ kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-6tfqr 3/3 Running 0 6m45s
csi-cephfsplugin-nldks 3/3 Running 0 6m45s
csi-cephfsplugin-provisioner-6c59b5b7d9-lnddw 6/6 Running 0 6m44s
csi-cephfsplugin-provisioner-6c59b5b7d9-lvdtc 6/6 Running 0 6m44s
csi-cephfsplugin-zkt2x 3/3 Running 0 6m45s
csi-rbdplugin-2kwhg 3/3 Running 0 6m46s
csi-rbdplugin-7j9hw 3/3 Running 0 6m46s
csi-rbdplugin-provisioner-6d455d5677-6qcn5 6/6 Running 0 6m46s
csi-rbdplugin-provisioner-6d455d5677-8xp6r 6/6 Running 0 6m46s
csi-rbdplugin-qxv4m 3/3 Running 0 6m46s
rook-ceph-crashcollector-k8s-master1-7bf874dc98-v9bj9 1/1 Running 0 2m8s
rook-ceph-crashcollector-k8s-master2-6698989df9-7hsfr 1/1 Running 0 2m16s
rook-ceph-crashcollector-k8s-node1-8578585676-9tf5w 1/1 Running 0 2m16s
rook-ceph-mgr-a-5c4759947f-47kbk 1/1 Running 0 2m19s
rook-ceph-mon-a-647877db88-629bt 1/1 Running 0 6m56s
rook-ceph-mon-b-c44f7978f-p2fq6 1/1 Running 0 5m11s
rook-ceph-mon-c-588b48b74c-bbkn8 1/1 Running 0 3m42s
rook-ceph-operator-64845bd768-55dkc 1/1 Running 0 13m
rook-ceph-osd-0-64f9fc6c65-m7877 1/1 Running 0 2m8s
rook-ceph-osd-1-584bf986c7-xj75p 1/1 Running 0 2m8s
rook-ceph-osd-2-d59cdbd7f-xcwgf 1/1 Running 0 2m8s
rook-ceph-osd-prepare-k8s-master1-vstkq 0/1 Completed 0 2m16s
rook-ceph-osd-prepare-k8s-master2-zdjk2 0/1 Completed 0 2m16s
rook-ceph-osd-prepare-k8s-node1-wmzj9 0/1 Completed 0 2m16s
```
#### Uninstall the Ceph cluster
Follow the official teardown guide: https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md
```sh
$ kubectl delete -f cluster.yaml
```
Clean up the Rook data directory on each node:
```sh
$ rm /var/lib/rook/* -rf
```
#### Install the Rook toolbox
```sh
$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph rollout status deploy/rook-ceph-tools
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
```
#### Install the dashboard
The dashboard is enabled by default in `cluster.yaml`; confirm its service exists:
```sh
$ kubectl -n rook-ceph get service
```
Get the login password (the default user is `admin`):
```sh
$ kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
```
#### Configure external access
```sh
$ kubectl create -f dashboard-external-https.yaml
```
```sh
$ kubectl -n rook-ceph get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr ClusterIP 10.101.238.49 <none> 9283/TCP 47m
rook-ceph-mgr-dashboard ClusterIP 10.107.124.177 <none> 8443/TCP 47m
rook-ceph-mgr-dashboard-external-https NodePort 10.102.110.16 <none> 8443:30870/TCP 14s
```
Browse to https://172.16.100.100:30870/ (any node IP plus the NodePort).
#### RBD block storage via Rook
Create the pool and StorageClass:
```sh
$ kubectl create -f rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
```
Deploy the sample MySQL and WordPress manifests, which claim RBD-backed volumes:
```sh
$ kubectl apply -f rook/cluster/examples/kubernetes/mysql.yaml
$ kubectl apply -f rook/cluster/examples/kubernetes/wordpress.yaml
```
```sh
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cephfs-test-pvc-1 Bound pvc-571ae252-b080-4a67-8f3d-40bda1304fe3 500Mi RWX cephfs 2d23h
mysql-pv-claim Bound pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c 20Gi RWO rook-ceph-block 15m
test-claim Bound pvc-c51eef14-1454-4ddf-99e1-db483dec42c6 1Mi RWX managed-nfs-storage 3d5h
wp-pv-claim Bound pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1 20Gi RWO rook-ceph-block 13m
```
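For a scripted health check on the claims above, you can assert that every PVC reports `Bound`. A sketch under the assumption that it is fed `kubectl get pvc --no-headers` output (the `all_bound` helper name is made up):

```shell
# Exit 0 only if every input line's STATUS column (field 2) is "Bound".
all_bound() {
  awk '$2 != "Bound" {bad=1} END {exit bad}'
}

# Usage against a live cluster: kubectl get pvc --no-headers | all_bound
printf '%s\n' \
  'mysql-pv-claim Bound pvc-6f6d70ba 20Gi RWO rook-ceph-block 15m' \
  'wp-pv-claim Bound pvc-d5eb4142 20Gi RWO rook-ceph-block 13m' \
  | all_bound && echo "all PVCs bound"
```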
Check the CSI provisioner logs:
```sh
$ kubectl logs csi-rbdplugin-provisioner-85c58fcfb4-nfnvd -n rook-ceph -c csi-provisioner
I0925 07:13:27.761137 1 csi-provisioner.go:138] Version: v2.2.2
I0925 07:13:27.761189 1 csi-provisioner.go:161] Building kube configs for running in cluster...
W0925 07:13:37.769467 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:13:47.769515 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:13:57.769149 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:07.769026 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:17.768966 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:27.768706 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:37.769526 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:47.769001 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:57.769711 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
I0925 07:15:01.978682 1 common.go:111] Probing CSI driver for readiness
I0925 07:15:01.980980 1 csi-provisioner.go:284] CSI driver does not support PUBLISH_UNPUBLISH_VOLUME, not watching VolumeAttachments
I0925 07:15:01.982474 1 leaderelection.go:243] attempting to acquire leader lease rook-ceph/rook-ceph-rbd-csi-ceph-com...
I0925 07:15:02.001351 1 leaderelection.go:253] successfully acquired lease rook-ceph/rook-ceph-rbd-csi-ceph-com
I0925 07:15:02.102184 1 controller.go:835] Starting provisioner controller rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-85c58fcfb4-nfnvd_c5346baf-02b2-4983-9405-45bbc31c216a!
I0925 07:15:02.102243 1 volume_store.go:97] Starting save volume queue
I0925 07:15:02.102244 1 clone_controller.go:66] Starting CloningProtection controller
I0925 07:15:02.102299 1 clone_controller.go:84] Started CloningProtection controller
I0925 07:15:02.202672 1 controller.go:884] Started provisioner controller rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-85c58fcfb4-nfnvd_c5346baf-02b2-4983-9405-45bbc31c216a!
I0925 07:46:47.261147 1 controller.go:1332] provision "default/mysql-pv-claim" class "rook-ceph-block": started
I0925 07:46:47.261389 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"mysql-pv-claim", UID:"6f6d70ba-6961-42b5-aa9d-db13c1aeec7c", APIVersion:"v1", ResourceVersion:"2798560", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/mysql-pv-claim"
I0925 07:46:49.310382 1 controller.go:1439] provision "default/mysql-pv-claim" class "rook-ceph-block": volume "pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c" provisioned
I0925 07:46:49.310419 1 controller.go:1456] provision "default/mysql-pv-claim" class "rook-ceph-block": succeeded
I0925 07:46:49.319422 1 controller.go:1332] provision "default/mysql-pv-claim" class "rook-ceph-block": started
I0925 07:46:49.319446 1 controller.go:1341] provision "default/mysql-pv-claim" class "rook-ceph-block": persistentvolume "pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c" already exists, skipping
I0925 07:46:49.319615 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"mysql-pv-claim", UID:"6f6d70ba-6961-42b5-aa9d-db13c1aeec7c", APIVersion:"v1", ResourceVersion:"2798560", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c
I0925 07:48:52.230183 1 controller.go:1332] provision "default/wp-pv-claim" class "rook-ceph-block": started
I0925 07:48:52.230656 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"wp-pv-claim", UID:"d5eb4142-532f-4704-9d15-94ffc7f35dd1", APIVersion:"v1", ResourceVersion:"2799324", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/wp-pv-claim"
I0925 07:48:52.289747 1 controller.go:1439] provision "default/wp-pv-claim" class "rook-ceph-block": volume "pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1" provisioned
I0925 07:48:52.289790 1 controller.go:1456] provision "default/wp-pv-claim" class "rook-ceph-block": succeeded
I0925 07:48:52.295778 1 controller.go:1332] provision "default/wp-pv-claim" class "rook-ceph-block": started
I0925 07:48:52.295798 1 controller.go:1341] provision "default/wp-pv-claim" class "rook-ceph-block": persistentvolume "pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1" already exists, skipping
I0925 07:48:52.295844 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"wp-pv-claim", UID:"d5eb4142-532f-4704-9d15-94ffc7f35dd1", APIVersion:"v1", ResourceVersion:"2799324", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1
```
#### CephFS via Rook
Create the shared filesystem:
```sh
$ kubectl create -f filesystem.yaml
```
Wait for the MDS pods to come up:
```sh
$ kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-myfs-a-6dd59747f5-2qjtk 1/1 Running 0 38s
rook-ceph-mds-myfs-b-56488764f9-n6fzf 1/1 Running 0 37s
```
```sh
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[root@rook-ceph-tools-5b85cb4766-8wj5c /]# ceph status
  cluster:
    id:     1c545199-7a68-46df-9b21-2a801c1ad1af
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 50m)
    mgr: a(active, since 70m)
    mds: myfs:1 {0=myfs-a=up:active} 1 up:standby-replay  # MDS metadata service
    osd: 3 osds: 3 up (since 71m), 3 in (since 71m)

  data:
    pools:   4 pools, 97 pgs
    objects: 132 objects, 340 MiB
    usage:   4.0 GiB used, 296 GiB / 300 GiB avail
    pgs:     97 active+clean

  io:
    client: 1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
```
Create the StorageClass:
```sh
$ cat >storageclass-cephfs.yaml <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  # clusterID is the namespace where the operator is deployed.
  clusterID: rook-ceph
  # CephFS filesystem name into which the volume shall be created
  fsName: myfs
  # Ceph pool into which the volume shall be created
  # Required for provisionVolume: "true"
  pool: myfs-data0
  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
EOF
```
```sh
$ kubectl create -f storageclass-cephfs.yaml
```
Test CephFS:
```sh
$ cat >test-cephfs-pvc.yaml <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
EOF
```
```sh
$ kubectl apply -f test-cephfs-pvc.yaml
```
```sh
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cephfs-pvc Bound pvc-4e200767-9ed9-4b47-8505-3081b8de6893 1Gi RWX rook-cephfs 8s
```
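To confirm the volume is actually writable, you can mount the claim from a throwaway pod. A minimal sketch (the pod name and image are arbitrary choices, not from the Rook examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cephfs-test-pod   # hypothetical name
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /data/hello && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: cephfs-pvc   # the PVC created above
```

After `kubectl apply -f`, a `kubectl exec cephfs-test-pod -- cat /data/hello` should print the test file back from CephFS.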
#### Troubleshooting
`1. [WRN] MON_CLOCK_SKEW: clock skew detected on mon.b`

Raise the allowed clock drift in the `rook-config-override` ConfigMap, then restart the mon pods:
```sh
$ kubectl -n rook-ceph edit ConfigMap rook-config-override -o yaml
config: |
  [global]
  mon clock drift allowed = 0.5
$ kubectl -n rook-ceph delete pod $(kubectl -n rook-ceph get pods -o custom-columns=NAME:.metadata.name --no-headers | grep mon)
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[root@rook-ceph-tools-75cf595688-hrtmv /]# ceph -s
  cluster:
    id:     023baa6d-8ec1-4ada-bd17-219136f656b4
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 2m)
    mgr: a(active, since 24m)
    osd: 3 osds: 3 up (since 25m), 3 in (since 25m)

  data:
    pools:   1 pools, 128 pgs
    objects: 0 objects, 0 B
    usage:   19 MiB used, 300 GiB / 300 GiB avail
    pgs:     128 active+clean
```
`2. type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-ceph-block": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-34167041-eb16-4f03-b01e-6b915089488d already exists`

Delete the CSI provisioner and plugin pods listed by the commands below; the operator recreates them:
```sh
$ kubectl get po -n rook-ceph | grep csi-cephfsplugin-provisioner-
$ kubectl get po -n rook-ceph | grep csi-rbdplugin-provisioner-
$ kubectl get po -n rook-ceph | grep csi-rbdplugin-
$ kubectl get po -n rook-ceph | grep csi-cephfsplugin-
```
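Rather than grepping and deleting by hand, the pod names can be extracted and piped straight to `kubectl delete`. A sketch (`pods_matching` is a made-up helper that operates on `kubectl get po --no-headers` output; the sample lines mimic that format):

```shell
# Print field 1 (the pod name) of every line whose name matches the pattern.
pods_matching() {
  awk -v pat="$1" '$1 ~ pat {print $1}'
}

# Assumed flow against the real cluster:
#   kubectl -n rook-ceph get po --no-headers \
#     | pods_matching '^csi-rbdplugin-provisioner-' \
#     | xargs -r kubectl -n rook-ceph delete pod
printf '%s\n' \
  'csi-rbdplugin-provisioner-6d455d5677-6qcn5 6/6 Running 0 6m' \
  'rook-ceph-mgr-a-5c4759947f-47kbk 1/1 Running 0 2m' \
  | pods_matching '^csi-rbdplugin-provisioner-'
# → csi-rbdplugin-provisioner-6d455d5677-6qcn5
```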
`3. PVCs stay in Pending, and the csi-rbdplugin-provisioner log shows: provision volume with StorageClass "rook-ceph-block": rpc error: code = DeadlineExceeded desc = context deadline exceeded`
This is a Kubernetes/rook-ceph version mismatch. On Kubernetes 1.18.20, installing rook-ceph 1.7.x produced this error; switching to v1.6.10 resolved it.