兜兜    2021-09-29 18:10:11    2021-10-27 15:15:56   

kubernetes rancher

兜兜    2021-09-25 11:59:33    2022-01-25 09:21:19   

kubernetes ceph rook
#### Environment

```sh
kubernetes 1.18.20
rook-ceph 1.6.10
ceph version 15.2.13
```

`Note: the Kubernetes nodes need unused disks for the Ceph cluster to consume, and the Kubernetes version must match a compatible rook-ceph version.`

#### Installing Rook

##### Download Rook

```sh
# the current Kubernetes 1.18.20 does not work with Rook 1.7.x
$ git clone --single-branch --branch v1.6.10 https://github.com/rook/rook.git
```

##### Install Rook

```sh
$ cd rook/cluster/examples/kubernetes/ceph
$ kubectl create -f crds.yaml -f common.yaml -f operator.yaml
# verify the rook-ceph-operator is in the `Running` state before proceeding
$ kubectl -n rook-ceph get pod
```

#### Installing the Ceph cluster

```sh
$ kubectl create -f cluster.yaml
```

Check the cluster status:

```sh
$ kubectl -n rook-ceph get pod
NAME                                                    READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-6tfqr                                  3/3     Running     0          6m45s
csi-cephfsplugin-nldks                                  3/3     Running     0          6m45s
csi-cephfsplugin-provisioner-6c59b5b7d9-lnddw           6/6     Running     0          6m44s
csi-cephfsplugin-provisioner-6c59b5b7d9-lvdtc           6/6     Running     0          6m44s
csi-cephfsplugin-zkt2x                                  3/3     Running     0          6m45s
csi-rbdplugin-2kwhg                                     3/3     Running     0          6m46s
csi-rbdplugin-7j9hw                                     3/3     Running     0          6m46s
csi-rbdplugin-provisioner-6d455d5677-6qcn5              6/6     Running     0          6m46s
csi-rbdplugin-provisioner-6d455d5677-8xp6r              6/6     Running     0          6m46s
csi-rbdplugin-qxv4m                                     3/3     Running     0          6m46s
rook-ceph-crashcollector-k8s-master1-7bf874dc98-v9bj9   1/1     Running     0          2m8s
rook-ceph-crashcollector-k8s-master2-6698989df9-7hsfr   1/1     Running     0          2m16s
rook-ceph-crashcollector-k8s-node1-8578585676-9tf5w     1/1     Running     0          2m16s
rook-ceph-mgr-a-5c4759947f-47kbk                        1/1     Running     0          2m19s
rook-ceph-mon-a-647877db88-629bt                        1/1     Running     0          6m56s
rook-ceph-mon-b-c44f7978f-p2fq6                         1/1     Running     0          5m11s
rook-ceph-mon-c-588b48b74c-bbkn8                        1/1     Running     0          3m42s
rook-ceph-operator-64845bd768-55dkc                     1/1     Running     0          13m
rook-ceph-osd-0-64f9fc6c65-m7877                        1/1     Running     0          2m8s
rook-ceph-osd-1-584bf986c7-xj75p                        1/1     Running     0          2m8s
rook-ceph-osd-2-d59cdbd7f-xcwgf                         1/1     Running     0          2m8s
rook-ceph-osd-prepare-k8s-master1-vstkq                 0/1     Completed   0          2m16s
rook-ceph-osd-prepare-k8s-master2-zdjk2                 0/1     Completed   0          2m16s
rook-ceph-osd-prepare-k8s-node1-wmzj9                   0/1     Completed   0          2m16s
```

#### Uninstalling the Ceph cluster
https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md

```sh
$ kubectl delete -f cluster.yaml
```

Clean up the Rook data directory (on every node):

```sh
$ rm -rf /var/lib/rook/*
```

#### Installing the Rook toolbox

```sh
$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph rollout status deploy/rook-ceph-tools
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
```

#### Installing the dashboard

```sh
$ kubectl -n rook-ceph get service
```

Get the login password:

```sh
$ kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
```

#### Configuring external access

```sh
$ kubectl create -f dashboard-external-https.yaml
```

```sh
$ kubectl -n rook-ceph get service
NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
rook-ceph-mgr                            ClusterIP   10.101.238.49    <none>        9283/TCP         47m
rook-ceph-mgr-dashboard                  ClusterIP   10.107.124.177   <none>        8443/TCP         47m
rook-ceph-mgr-dashboard-external-https   NodePort    10.102.110.16    <none>        8443:30870/TCP   14s
```

Access: https://172.16.100.100:30870/

#### Providing RBD storage with Rook

Create the pool and the StorageClass:

```sh
$ kubectl create -f rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
```

Deploy the sample MySQL and WordPress applications, which request volumes from this StorageClass:

```sh
$ kubectl apply -f rook/cluster/examples/kubernetes/mysql.yaml
$ kubectl apply -f rook/cluster/examples/kubernetes/wordpress.yaml
```

```sh
$ kubectl get pvc
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
cephfs-test-pvc-1   Bound    pvc-571ae252-b080-4a67-8f3d-40bda1304fe3   500Mi      RWX            cephfs                2d23h
mysql-pv-claim      Bound    pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c   20Gi       RWO            rook-ceph-block       15m
test-claim          Bound    pvc-c51eef14-1454-4ddf-99e1-db483dec42c6   1Mi        RWX            managed-nfs-storage   3d5h
wp-pv-claim         Bound    pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1   20Gi       RWO            rook-ceph-block       13m
```

Check the plugin logs:

```sh
$ kubectl logs csi-rbdplugin-provisioner-85c58fcfb4-nfnvd -n rook-ceph -c csi-provisioner
I0925 07:13:27.761137 1 csi-provisioner.go:138] Version: v2.2.2
I0925 07:13:27.761189 1 csi-provisioner.go:161] Building kube configs for running in cluster...
W0925 07:13:37.769467 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:13:47.769515 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:13:57.769149 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:07.769026 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:17.768966 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:27.768706 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:37.769526 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:47.769001 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0925 07:14:57.769711 1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
I0925 07:15:01.978682 1 common.go:111] Probing CSI driver for readiness
I0925 07:15:01.980980 1 csi-provisioner.go:284] CSI driver does not support PUBLISH_UNPUBLISH_VOLUME, not watching VolumeAttachments
I0925 07:15:01.982474 1 leaderelection.go:243] attempting to acquire leader lease rook-ceph/rook-ceph-rbd-csi-ceph-com...
I0925 07:15:02.001351 1 leaderelection.go:253] successfully acquired lease rook-ceph/rook-ceph-rbd-csi-ceph-com
I0925 07:15:02.102184 1 controller.go:835] Starting provisioner controller rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-85c58fcfb4-nfnvd_c5346baf-02b2-4983-9405-45bbc31c216a!
I0925 07:15:02.102243 1 volume_store.go:97] Starting save volume queue
I0925 07:15:02.102244 1 clone_controller.go:66] Starting CloningProtection controller
I0925 07:15:02.102299 1 clone_controller.go:84] Started CloningProtection controller
I0925 07:15:02.202672 1 controller.go:884] Started provisioner controller rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-85c58fcfb4-nfnvd_c5346baf-02b2-4983-9405-45bbc31c216a!
I0925 07:46:47.261147 1 controller.go:1332] provision "default/mysql-pv-claim" class "rook-ceph-block": started
I0925 07:46:47.261389 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"mysql-pv-claim", UID:"6f6d70ba-6961-42b5-aa9d-db13c1aeec7c", APIVersion:"v1", ResourceVersion:"2798560", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/mysql-pv-claim"
I0925 07:46:49.310382 1 controller.go:1439] provision "default/mysql-pv-claim" class "rook-ceph-block": volume "pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c" provisioned
I0925 07:46:49.310419 1 controller.go:1456] provision "default/mysql-pv-claim" class "rook-ceph-block": succeeded
I0925 07:46:49.319422 1 controller.go:1332] provision "default/mysql-pv-claim" class "rook-ceph-block": started
I0925 07:46:49.319446 1 controller.go:1341] provision "default/mysql-pv-claim" class "rook-ceph-block": persistentvolume "pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c" already exists, skipping
I0925 07:46:49.319615 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"mysql-pv-claim", UID:"6f6d70ba-6961-42b5-aa9d-db13c1aeec7c", APIVersion:"v1", ResourceVersion:"2798560", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-6f6d70ba-6961-42b5-aa9d-db13c1aeec7c
I0925 07:48:52.230183 1 controller.go:1332] provision "default/wp-pv-claim" class "rook-ceph-block": started
I0925 07:48:52.230656 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"wp-pv-claim", UID:"d5eb4142-532f-4704-9d15-94ffc7f35dd1", APIVersion:"v1", ResourceVersion:"2799324", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/wp-pv-claim"
I0925 07:48:52.289747 1 controller.go:1439] provision "default/wp-pv-claim" class "rook-ceph-block": volume "pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1" provisioned
I0925 07:48:52.289790 1 controller.go:1456] provision "default/wp-pv-claim" class "rook-ceph-block": succeeded
I0925 07:48:52.295778 1 controller.go:1332] provision "default/wp-pv-claim" class "rook-ceph-block": started
I0925 07:48:52.295798 1 controller.go:1341] provision "default/wp-pv-claim" class "rook-ceph-block": persistentvolume "pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1" already exists, skipping
I0925 07:48:52.295844 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"wp-pv-claim", UID:"d5eb4142-532f-4704-9d15-94ffc7f35dd1", APIVersion:"v1", ResourceVersion:"2799324", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-d5eb4142-532f-4704-9d15-94ffc7f35dd1
```

#### Providing CephFS with Rook

```sh
$ kubectl create -f filesystem.yaml
```

```sh
$ kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME                                    READY   STATUS    RESTARTS   AGE
rook-ceph-mds-myfs-a-6dd59747f5-2qjtk   1/1     Running   0          38s
rook-ceph-mds-myfs-b-56488764f9-n6fzf   1/1     Running   0          37s
```

```sh
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[root@rook-ceph-tools-5b85cb4766-8wj5c /]# ceph status
  cluster:
    id:     1c545199-7a68-46df-9b21-2a801c1ad1af
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 50m)
    mgr: a(active, since 70m)
    mds: myfs:1 {0=myfs-a=up:active} 1 up:standby-replay    # the MDS metadata service
    osd: 3 osds: 3 up (since 71m), 3 in (since 71m)

  data:
    pools:   4 pools, 97 pgs
    objects: 132 objects, 340 MiB
    usage:   4.0 GiB used, 296 GiB / 300 GiB avail
    pgs:     97 active+clean

  io:
    client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
```

Create the StorageClass:

```sh
$ cat >storageclass-cephfs.yaml <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  # clusterID is the namespace where operator is deployed.
  clusterID: rook-ceph

  # CephFS filesystem name into which the volume shall be created
  fsName: myfs

  # Ceph pool into which the volume shall be created
  # Required for provisionVolume: "true"
  pool: myfs-data0

  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph

reclaimPolicy: Delete
EOF
```

```sh
$ kubectl create -f storageclass-cephfs.yaml
```

Test CephFS:

```sh
$ cat >test-cephfs-pvc.yaml <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
EOF
```

```sh
$ kubectl apply -f test-cephfs-pvc.yaml
```

```sh
$ kubectl get pvc
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cephfs-pvc   Bound    pvc-4e200767-9ed9-4b47-8505-3081b8de6893   1Gi        RWX            rook-cephfs    8s
```

#### Problems

`1. [WRN] MON_CLOCK_SKEW: clock skew detected on mon.b`

Raise the allowed clock drift in the `rook-config-override` ConfigMap, then delete the mon pods so that they restart with the new setting:

```sh
$ kubectl -n rook-ceph edit ConfigMap rook-config-override -o yaml
# set the following under "data":
  config: |
    [global]
    mon clock drift allowed = 0.5
$ kubectl -n rook-ceph delete pod $(kubectl -n rook-ceph get pods -o custom-columns=NAME:.metadata.name --no-headers| grep mon)
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[root@rook-ceph-tools-75cf595688-hrtmv /]# ceph -s
  cluster:
    id:     023baa6d-8ec1-4ada-bd17-219136f656b4
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 2m)
    mgr: a(active, since 24m)
    osd: 3 osds: 3 up (since 25m), 3 in (since 25m)

  data:
    pools:   1 pools, 128 pgs
    objects: 0 objects, 0 B
    usage:   19 MiB used, 300 GiB / 300 GiB avail
    pgs:     128 active+clean
```

`2. type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-ceph-block": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-34167041-eb16-4f03-b01e-6b915089488d already exists`

Delete the pods listed by the following commands so that they are recreated:

```sh
$ kubectl get po -n rook-ceph | grep csi-cephfsplugin-provisioner-
$ kubectl get po -n rook-ceph | grep csi-rbdplugin-provisioner-
$ kubectl get po -n rook-ceph | grep csi-rbdplugin-
$ kubectl get po -n rook-ceph | grep csi-cephfsplugin-
```

`3. A newly created PVC stays in the Pending state, and the csi-rbdplugin-provisioner log shows: provision volume with StorageClass "rook-ceph-block": rpc error: code = DeadlineExceeded desc = context deadline exceeded`

This was a version mismatch between Kubernetes and rook-ceph. On Kubernetes 1.18.20, installing rook-ceph 1.7.x produced this error; switching to v1.6.10 fixed it.
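For problem 2 above, the listed commands only display the CSI pods. A minimal sketch of actually restarting them could look like the following. The function names `csi_pod_names` and `restart_csi_pods` are hypothetical helpers written for this note, not part of Rook, and the sketch assumes the cluster runs in the `rook-ceph` namespace as in this article.

```sh
# Hypothetical helper: read `kubectl get pod` output on stdin and print
# only the names of the Ceph CSI plugin and provisioner pods.
csi_pod_names() {
  awk '$1 ~ /^csi-(rbd|cephfs)plugin-/ { print $1 }'
}

# Hypothetical helper: delete those pods so that their DaemonSets and
# Deployments recreate them. Assumes the rook-ceph namespace.
restart_csi_pods() {
  kubectl -n rook-ceph get pod --no-headers \
    | csi_pod_names \
    | xargs kubectl -n rook-ceph delete pod
}
```

Calling `restart_csi_pods` on a machine with cluster access performs the deletion; piping `kubectl -n rook-ceph get pod` into `csi_pod_names` alone gives a preview of what would be removed.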

兜兜    2021-09-22 10:20:24    2022-01-25 09:20:25   

kubernetes k8s nginx ingress
#### Environment

```sh
Kubernetes version: 1.18.20
ingress-nginx: 3.4.0
```

#### Installing ingress-nginx

##### Download the Helm chart

```sh
$ wget https://github.com/kubernetes/ingress-nginx/releases/download/ingress-nginx-3.4.0/ingress-nginx-3.4.0.tgz
$ tar xvf ingress-nginx-3.4.0.tgz
$ cd ingress-nginx
```

#### Configuring the values

```sh
$ cat > values.yaml <<EOF
controller:
  image:
    repository: registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller  # switched to the Alibaba Cloud mirror
    tag: "v0.40.1"
    digest: sha256:abffcf2d25e3e7c7b67a315a7c664ec79a1588c9c945d3c7a75637c2f55caec6
    pullPolicy: IfNotPresent
    runAsUser: 101
    allowPrivilegeEscalation: true
  containerPort:
    http: 80
    https: 443
  config: {}
  configAnnotations: {}
  proxySetHeaders: {}
  addHeaders: {}
  dnsConfig: {}
  dnsPolicy: ClusterFirst
  reportNodeInternalIp: false
  hostNetwork: true  # use the host network
  hostPort:
    enabled: true  # expose host ports
    ports:
      http: 80
      https: 443
  electionID: ingress-controller-leader
  ingressClass: nginx
  podLabels: {}
  podSecurityContext: {}
  sysctls: {}
  publishService:
    enabled: true
    pathOverride: ""
  scope:
    enabled: false
  tcp:
    annotations: {}
  udp:
    annotations: {}
  extraArgs: {}
  extraEnvs: []
  kind: DaemonSet  # run as a DaemonSet
  annotations: {}
  labels: {}
  updateStrategy: {}
  minReadySeconds: 0
  tolerations: []
  affinity: {}
  topologySpreadConstraints: []
  terminationGracePeriodSeconds: 300
  nodeSelector:
    ingress: nginx  # deploy only on nodes labelled ingress=nginx
  livenessProbe:
    failureThreshold: 5
    initialDelaySeconds: 10
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
    port: 10254
  readinessProbe:
    failureThreshold: 3
    initialDelaySeconds: 10
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
    port: 10254
  healthCheckPath: "/healthz"
  podAnnotations: {}
  #replicaCount: 1  # disabled
  minAvailable: 1
  resources:
    requests:
      cpu: 100m
      memory: 90Mi
  autoscaling:
    enabled: false
    minReplicas: 1
    maxReplicas: 11
    targetCPUUtilizationPercentage: 50
    targetMemoryUtilizationPercentage: 50
  autoscalingTemplate: []
  enableMimalloc: true
  customTemplate:
    configMapName: ""
    configMapKey: ""
  service:
    enabled: true
    annotations: {}
    labels: {}
    externalIPs: []
    loadBalancerSourceRanges: []
    enableHttp: true
    enableHttps: true
    ports:
      http: 80
      https: 443
    targetPorts:
      http: http
      https: https
    type: LoadBalancer
    nodePorts:
      http: ""
      https: ""
      tcp: {}
      udp: {}
    internal:
      enabled: false
      annotations: {}
  extraContainers: []
  extraVolumeMounts: []
  extraVolumes: []
  extraInitContainers: []
  admissionWebhooks:
    enabled: false  # disabled
    failurePolicy: Fail
    port: 8443
    service:
      annotations: {}
      externalIPs: []
      loadBalancerSourceRanges: []
      servicePort: 443
      type: ClusterIP
    patch:
      enabled: true
      image:
        repository: docker.io/jettech/kube-webhook-certgen
        tag: v1.3.0
        pullPolicy: IfNotPresent
      priorityClassName: ""
      podAnnotations: {}
      nodeSelector: {}
      tolerations: []
      runAsUser: 2000
  metrics:
    port: 10254
    enabled: false
    service:
      annotations: {}
      externalIPs: []
      loadBalancerSourceRanges: []
      servicePort: 9913
      type: ClusterIP
    serviceMonitor:
      enabled: false
      additionalLabels: {}
      namespace: ""
      namespaceSelector: {}
      scrapeInterval: 30s
      targetLabels: []
      metricRelabelings: []
    prometheusRule:
      enabled: false
      additionalLabels: {}
      rules: []
  lifecycle:
    preStop:
      exec:
        command:
          - /wait-shutdown
  priorityClassName: ""
revisionHistoryLimit: 10
maxmindLicenseKey: ""
defaultBackend:
  enabled: false
  image:
    repository: k8s.gcr.io/defaultbackend-amd64
    tag: "1.5"
    pullPolicy: IfNotPresent
    runAsUser: 65534
  extraArgs: {}
  serviceAccount:
    create: true
    name:
  extraEnvs: []
  port: 8080
  livenessProbe:
    failureThreshold: 3
    initialDelaySeconds: 30
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 5
  readinessProbe:
    failureThreshold: 6
    initialDelaySeconds: 0
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 5
  tolerations: []
  affinity: {}
  podSecurityContext: {}
  podLabels: {}
  nodeSelector: {}
  podAnnotations: {}
  replicaCount: 1
  minAvailable: 1
  resources: {}
  service:
    annotations: {}
    externalIPs: []
    loadBalancerSourceRanges: []
    servicePort: 80
    type: ClusterIP
  priorityClassName: ""
rbac:
  create: true
  scope: false
podSecurityPolicy:
  enabled: false
serviceAccount:
  create: true
  name:
imagePullSecrets: []
tcp: {}
udp: {}
EOF
```

#### Creating the namespace

```sh
$ kubectl create namespace ingress-nginx
```

#### Labelling the nodes

```sh
$ kubectl label nodes k8s-master1 ingress=nginx
$ kubectl label nodes k8s-master2 ingress=nginx
$ kubectl label nodes k8s-node1 ingress=nginx
```

#### Installing nginx-ingress

```sh
$ helm -n ingress-nginx upgrade -i ingress-nginx .
```

#### Uninstalling ingress-nginx

```sh
$ helm -n ingress-nginx uninstall ingress-nginx
```

#### Testing nginx-ingress

#### Deploying a test nginx service

```sh
$ cat > nginx-deployment.yml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: ClusterIP
EOF
```

Apply it with `kubectl apply -f nginx-deployment.yml`, then configure the Ingress object.

Create the TLS certificate secret:

```sh
$ kubectl create secret tls shudoon-com-tls --cert=5024509__example.com.pem --key=5024509__example.com.key
```

#### Creating the ingress rule

```sh
$ cat >tnginx-ingress.yaml <<EOF
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
  name: tnginx-ingress
spec:
  rules:
  - host: tnginx.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: nginx-service
          servicePort: 80
  # This section is only required if TLS is to be enabled for the Ingress
  tls:
  - hosts:
    - tnginx.example.com
    secretName: shudoon-com-tls
EOF
```

Apply it with `kubectl apply -f tnginx-ingress.yaml`, then test by opening https://tnginx.example.com/

#### Problems

`Problem: creating a custom ingress fails with: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io"`

List the webhook configurations:

```sh
$ kubectl get validatingwebhookconfigurations
```

Delete the webhook configuration:

```sh
$ kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission
```
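The ingress rule created earlier uses `networking.k8s.io/v1beta1`, which fits Kubernetes 1.18. That API version was removed in Kubernetes 1.22, and from 1.19 onward the same rule can be written against `networking.k8s.io/v1`. The following is a sketch under that assumption, reusing the host, Service and secret names from this article; the file name `tnginx-ingress-v1.yaml` is hypothetical, and this variant was not part of the original setup.

```sh
# Sketch only: the same rule as tnginx-ingress.yaml, written for
# networking.k8s.io/v1 (Kubernetes 1.19+). File name is an assumption.
cat > tnginx-ingress-v1.yaml <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
  name: tnginx-ingress
spec:
  rules:
  - host: tnginx.example.com
    http:
      paths:
      - path: /
        pathType: Prefix          # mandatory field in v1
        backend:
          service:                # v1 nests the Service name and port here
            name: nginx-service
            port:
              number: 80
  tls:
  - hosts:
    - tnginx.example.com
    secretName: shudoon-com-tls
EOF
```

On a suitable cluster it would be applied with `kubectl apply -f tnginx-ingress-v1.yaml`. The `kubernetes.io/ingress.class` annotation is kept here so the sketch does not depend on an IngressClass resource existing.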

兜兜    2021-09-16 00:45:44    2022-01-25 09:20:56   

kubernetes nfs
Kubernetes provides a mechanism for creating PVs automatically: Dynamic Provisioning. The core of this mechanism is the StorageClass API object.

A StorageClass defines two things:

1. The properties of the PV, such as the storage type and the volume size.
2. The storage plugin needed to create this kind of PV.

With these two pieces of information, Kubernetes can match a PVC submitted by a user to a StorageClass, then call the storage plugin declared by that StorageClass to create the required PV.

In practice it is simple to use: write the YAML files for your needs and apply them with kubectl create.

Setting up StorageClass + NFS roughly takes the following steps:

1. Create a working NFS server.
2. Create a ServiceAccount. It controls the permissions the NFS provisioner runs with inside the cluster.
3. Create a StorageClass. It receives PVCs and calls the NFS provisioner to do the work, so that the PV gets bound to the PVC.
4. Create the NFS provisioner. It has two functions: creating mount points (volumes) under the NFS export, and creating PVs and associating them with those mount points.

#### Creating the StorageClass

1. Create the NFS share

NFS server and export used in this environment:

```sh
IP: 172.16.100.100
PATH: /data/k8s
```

2. Create the ServiceAccount

```sh
$ cat > rbac.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io
EOF
```

```sh
$ kubectl apply -f rbac.yaml
```

3. Create the NFS provisioner

```sh
$ cat > nfs-provisioner.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default  # keep consistent with the namespace in the RBAC file
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: nfs-storage-provisioner
            - name: NFS_SERVER
              value: 172.16.100.100
            - name: NFS_PATH
              value: /data/k8s
      volumes:
        - name: nfs-client-root
          nfs:
            server: 172.16.100.100
            path: /data/k8s
EOF
```

```sh
$ kubectl apply -f nfs-provisioner.yaml
```

4. Create the StorageClass

```sh
$ cat >nfs-storageclass.yaml <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
provisioner: nfs-storage-provisioner  # must equal PROVISIONER_NAME in the Deployment
EOF
```

```sh
$ kubectl apply -f nfs-storageclass.yaml
```

5. Test by creating a PVC

```sh
$ cat > test-pvc.yaml <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
  annotations:
    volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"  # the StorageClass created above
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
EOF
```

```sh
$ kubectl apply -f test-pvc.yaml
```

Check the PV:

```sh
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS          REASON   AGE
pvc-66967ce3-5adf-46cf-9fa3-8379f499d254   1Mi        RWX            Delete           Bound    default/test-claim   managed-nfs-storage            27m
```

Check the PVC:

```sh
$ kubectl get pvc
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
test-claim   Bound    pvc-66967ce3-5adf-46cf-9fa3-8379f499d254   1Mi        RWX            managed-nfs-storage   27m
```

Create nginx-statefulset.yaml:

```sh
$ cat >nginx-statefulset.yaml <<EOF
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-headless
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx-headless"  # must match the headless Service defined above
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: ikubernetes/myapp:v1
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
      annotations:
        volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 20Mi
EOF
```

```sh
$ kubectl apply -f nginx-statefulset.yaml
```

Check the pods:

```sh
$ kubectl get pods -l app=nginx
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          21m
web-1   1/1     Running   0          21m
```

Check the PVs:

```sh
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS          REASON   AGE
pvc-2c23ca4d-7c98-4021-8099-c208b418aa5b   20Mi       RWO            Delete           Bound    default/www-web-0    managed-nfs-storage            22m
pvc-4d8da9d3-cb51-49e6-bd0e-c5bad7878551   20Mi       RWO            Delete           Bound    default/www-web-1    managed-nfs-storage            21m
pvc-66967ce3-5adf-46cf-9fa3-8379f499d254   1Mi        RWX            Delete           Bound    default/test-claim   managed-nfs-storage            32m
```

Check the PVCs:

```sh
$ kubectl get pvc
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
test-claim   Bound    pvc-66967ce3-5adf-46cf-9fa3-8379f499d254   1Mi        RWX            managed-nfs-storage   32m
www-web-0    Bound    pvc-2c23ca4d-7c98-4021-8099-c208b418aa5b   20Mi       RWO            managed-nfs-storage   22m
www-web-1    Bound    pvc-4d8da9d3-cb51-49e6-bd0e-c5bad7878551   20Mi       RWO            managed-nfs-storage   21m
```

Reference: https://www.cnblogs.com/panwenbin-logs/p/12196286.html
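The claims in this article pick their class through the `volume.beta.kubernetes.io/storage-class` annotation, which Kubernetes has deprecated in favour of the `spec.storageClassName` field. As a sketch of the equivalent claim written with the field: the file name `test-pvc-classname.yaml` and the claim name `test-claim-2` are hypothetical, chosen only so they do not collide with the objects created above.

```sh
# Sketch only: same request as test-pvc.yaml, but selecting the class via
# spec.storageClassName instead of the deprecated annotation.
cat > test-pvc-classname.yaml <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim-2                       # hypothetical name
spec:
  storageClassName: managed-nfs-storage    # the StorageClass from step 4
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
EOF
```

It would be applied with `kubectl apply -f test-pvc-classname.yaml` and should bind the same way as `test-claim`, assuming the provisioner from step 3 is running.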

兜兜    2021-09-15 15:55:33    2022-01-25 09:21:07   

kubernetes
部署生产环境Kubernetes集群方式 ```sh kubeadm Kubeadm是一个K8s部署工具,提供kubeadm init和kubeadm join,用于快速部署Kubernetes集群。 二进制包 从github下载发行版的二进制包,手动部署每个组件,组成Kubernetes集群。 ``` 本次测试采用kubeadm搭建集群 架构图 ```sh hostport/nodeport vip:172.16.100.99 +-----------------+ +-----------------+ | keepalived | |K8S-Master1 | | | | | 119.x.x.x | +---------+ | +------->| | | | | | | | | | | | | | | | +--------+ +--------+--> lvs27 | | | +-----------------+ | | | | | | | | 172.16.100.100 +---------+ | | | | +---------+ | | | | | | | | | | +-----------------+ | Client +--------+firewall+---+ | 172.16.100.27 | | |K8S-Master2 | | | | | | | | | | | +---------+ | | | | +---------+ +---------+------->| | | | | | | | | | | | | | | | | | | | | | +--------+ +--------+--> lvs28 | | | +-----------------+ | | | | | 172.16.100.101 | +---------+ | | +-----------------+ | | | |K8S-Node1 | | 172.16.100.2 | | | | | | | | | | | +------->| | +-----------------+ | | +-----------------+ 172.16.100.102 ``` #### 部署高可用高可用负载均衡器 ```sh Kubernetes作为容器集群系统,通过健康检查+重启策略实现了Pod故障自我修复能力,通过调度算法实现将Pod分布式部署,并保持预期副本数,根据Node失效状态自动在其他Node拉起Pod,实现了应用层的高可用性。 针对Kubernetes集群,高可用性还应包含以下两个层面的考虑:Etcd数据库的高可用性和Kubernetes Master组件的高可用性。 而kubeadm搭建的K8s集群,Etcd只起了一个,存在单点,所以我们这里会独立搭建一个Etcd集群。 Master节点扮演着总控中心的角色,通过不断与工作节点上的Kubelet和kube-proxy进行通信来维护整个集群的健康工作状态。如果Master节点故障,将无法使用kubectl工具或者API做任何集群管理。 Master节点主要有三个服务kube-apiserver、kube-controller-manager和kube-scheduler,其中kube-controller-manager和kube-scheduler组件自身通过选择机制已经实现了高可用,所以Master高可用主要针对kube-apiserver组件,而该组件是以HTTP API提供服务,因此对他高可用与Web服务器类似,增加负载均衡器对其负载均衡即可,并且可水平扩容。 ``` [查看安装文档](https://ynotes.cn/blog/article_detail/280) #### 部署Etcd集群 准备cfssl证书生成工具 cfssl是一个开源的证书管理工具,使用json文件生成证书,相比openssl更方便使用。 找任意一台服务器操作,这里用Master节点。 ```sh $ wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 $ wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 $ wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 $ chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64 $ mv cfssl_linux-amd64 /usr/local/bin/cfssl $ 
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson $ mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo ``` 生成Etcd证书 ```sh $ mkdir -p ~/etcd_tls $ cd ~/etcd_tls ``` ```sh $ cat > ca-config.json << EOF { "signing": { "default": { "expiry": "87600h" }, "profiles": { "www": { "expiry": "87600h", "usages": [ "signing", "key encipherment", "server auth", "client auth" ] } } } } EOF ``` ```sh $ cat > ca-csr.json << EOF { "CN": "etcd CA", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "Guangdong", "ST": "Guangzhou" } ] } EOF ``` 生成CA证书 ```sh $ cfssl gencert -initca ca-csr.json | cfssljson -bare ca - ``` 使用自签CA签发Etcd HTTPS证书 创建证书申请文件 ```sh $ cat > server-csr.json << EOF { "CN": "etcd", "hosts": [ "172.16.100.100", "172.16.100.101", "172.16.100.102", "172.16.100.103", "172.16.100.104" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "Guangdong", "ST": "Guangzhou" } ] } EOF ``` 生成证书 ```sh $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | $ cfssljson -bare server ``` 下载二进制文件 ```sh $ wget https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz ``` ```sh $ mkdir /opt/etcd/{bin,cfg,ssl} -p $ tar zxvf etcd-v3.4.9-linux-amd64.tar.gz $ mv etcd-v3.4.9-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/ ``` 创建etcd配置文件 ```sh cat > /opt/etcd/cfg/etcd.conf << EOF #[Member] ETCD_NAME="etcd-1" ETCD_DATA_DIR="/var/lib/etcd/default.etcd" ETCD_LISTEN_PEER_URLS="https://172.16.100.100:2380" ETCD_LISTEN_CLIENT_URLS="https://172.16.100.100:2379" #[Clustering] ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.100.100:2380" ETCD_ADVERTISE_CLIENT_URLS="https://172.16.100.100:2379" ETCD_INITIAL_CLUSTER="etcd-1=https://172.16.100.100:2380,etcd-2=https://172.16.100.101:2380,etcd-3=https://172.16.100.102:2380" ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster" ETCD_INITIAL_CLUSTER_STATE="new" EOF ``` systemd管理etcd ```sh cat > /usr/lib/systemd/system/etcd.service << EOF [Unit] Description=Etcd 
Server After=network.target After=network-online.target Wants=network-online.target [Service] Type=notify EnvironmentFile=/opt/etcd/cfg/etcd.conf ExecStart=/opt/etcd/bin/etcd --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --peer-cert-file=/opt/etcd/ssl/server.pem --peer-key-file=/opt/etcd/ssl/server-key.pem --trusted-ca-file=/opt/etcd/ssl/ca.pem --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem --logger=zap Restart=on-failure LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF ``` 拷贝刚才生成的证书 ```sh $ cp ~/etcd_tls/ca*pem ~/etcd_tls/server*pem /opt/etcd/ssl/ ``` 将上面节点1所有生成的文件拷贝到节点2和节点3 ```sh $ scp -r /opt/etcd/ root@172.16.100.101:/opt/ $ scp -r /opt/etcd/ root@172.16.100.102:/opt/ $ scp -r /usr/lib/systemd/system/etcd.service root@172.16.100.102:/usr/lib/systemd/system/ $ scp -r /usr/lib/systemd/system/etcd.service root@172.16.100.101:/usr/lib/systemd/system/ ``` 三个节点同时启动 ```sh $ systemctl daemon-reload $ systemctl start etcd $ systemctl enable etcd ``` ```sh $ ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://172.16.100.100:2379,https://172.16.100.101:2379,https://172.16.100.102:2379" endpoint health --write-out=table ``` ```sh +-----------------------------+--------+-------------+-------+ | ENDPOINT | HEALTH | TOOK | ERROR | +-----------------------------+--------+-------------+-------+ | https://172.16.100.100:2379 | true | 12.812954ms | | | https://172.16.100.102:2379 | true | 13.596982ms | | | https://172.16.100.101:2379 | true | 14.607151ms | | +-----------------------------+--------+-------------+-------+ ``` #### 安装Docker [查看Docker安装部分](https://ynotes.cn/blog/article_detail/271) #### 安装kubeadm、kubelet、kubectl ```sh $ cat > /etc/yum.repos.d/kubernetes.repo <<EOF [kubernetes] name=Kubernetes baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/ enabled=1 gpgcheck=0 repo_gpgcheck=0 
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg EOF ``` 安装kubeadm、kubelet、kubectl ```sh $ yum install -y kubelet-1.18.20 kubeadm-1.18.20 kubectl-1.18.20 $ systemctl enable kubelet ``` 集群配置文件 ```sh $ cat > kubeadm-config.yaml <<EOF apiVersion: kubeadm.k8s.io/v1beta2 bootstrapTokens: - groups: - system:bootstrappers:kubeadm:default-node-token token: abcdef.0123456789abcdef ttl: 24h0m0s usages: - signing - authentication kind: InitConfiguration localAPIEndpoint: advertiseAddress: 172.16.100.100 bindPort: 6443 nodeRegistration: criSocket: /var/run/dockershim.sock name: k8s-master1 taints: - effect: NoSchedule key: node-role.kubernetes.io/master --- apiServer: certSANs: - k8s-master1 - k8s-master2 - k8s-master3 - 172.16.100.100 - 172.16.100.101 - 172.16.100.102 - 172.16.100.111 - 127.0.0.1 extraArgs: authorization-mode: Node,RBAC timeoutForControlPlane: 4m0s apiVersion: kubeadm.k8s.io/v1beta2 certificatesDir: /etc/kubernetes/pki clusterName: kubernetes controlPlaneEndpoint: 172.16.100.111:16443 controllerManager: {} dns: type: CoreDNS etcd: external: endpoints: - https://172.16.100.100:2379 - https://172.16.100.101:2379 - https://172.16.100.102:2379 caFile: /opt/etcd/ssl/ca.pem certFile: /opt/etcd/ssl/server.pem keyFile: /opt/etcd/ssl/server-key.pem imageRepository: registry.aliyuncs.com/google_containers kind: ClusterConfiguration kubernetesVersion: v1.18.20 networking: dnsDomain: cluster.local podSubnet: 10.244.0.0/16 serviceSubnet: 10.96.0.0/12 scheduler: {} EOF ``` 集群初始化(`k8s-master1`) ```sh $ kubeadm init --config kubeadm-config.yaml ``` ```sh Your Kubernetes control-plane has initialized successfully! To start using your cluster, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config You should now deploy a pod network to the cluster. 
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 172.16.100.111:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:466694e2952961d35ca66960df917b65a5bd5da6219780274603deeb52b2cdde \
    --control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.100.111:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:466694e2952961d35ca66960df917b65a5bd5da6219780274603deeb52b2cdde
```

Configure kubectl access to the cluster

```sh
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

Join the cluster (`k8s-master2`)

Copy the certificates from k8s-master1

```sh
$ scp -r 172.16.100.100:/etc/kubernetes/pki/ /etc/kubernetes/
```

```sh
$ kubeadm join 172.16.100.111:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:466694e2952961d35ca66960df917b65a5bd5da6219780274603deeb52b2cdde \
    --control-plane
```

Join the cluster (`k8s-node1`)

```sh
$ kubeadm join 172.16.100.111:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:466694e2952961d35ca66960df917b65a5bd5da6219780274603deeb52b2cdde
```

Allow pods to be scheduled on the master nodes

```sh
$ kubectl taint node k8s-master1 node-role.kubernetes.io/master-
$ kubectl taint node k8s-master2 node-role.kubernetes.io/master-
```

#### Deploy the network

`Option 1: calico (BGP/IPIP)`: a mainstream network plugin. BGP mode gives the best performance, but pods cannot communicate with non-BGP nodes. IPIP has no such network restriction, at slightly lower performance than BGP.
[Reference: configuring the calico network on k8s](https://ynotes.cn/blog/article_detail/274)

`Option 2: flannel (vxlan/host-gw)`: host-gw gives the best performance, but pods cannot communicate with nodes outside the cluster. vxlan works across networks, at slightly lower performance than host-gw.
[Reference: K8S-kubeadm deploying a k8s cluster (part 1), flannel section](https://ynotes.cn/blog/article_detail/271)

`Option 3: cilium (eBPF) + kube-router (BGP routing)`: the best performance of the three options, but pods cannot communicate with non-BGP nodes. It is currently in Beta and not recommended for production.
[Reference: configuring the cilium network on k8s](https://ynotes.cn/blog/article_detail/286)

#### Reset the K8S cluster

##### Reset a node with kubeadm

```sh
$ kubeadm reset
# kubeadm reset cleans up the files that kubeadm init or kubeadm join created on the node's local file system.
# On a control-plane node, reset also removes the node's local stacked etcd member from the etcd cluster
# and removes the node's entry from the kubeadm ClusterStatus object.
# ClusterStatus is a kubeadm-managed Kubernetes API object that holds the list of kube-apiserver endpoints.
# kubeadm reset phase runs individual phases of the workflow above.
# To skip a list of phases, use the --skip-phases flag, which works like the phase runners of kubeadm join and kubeadm init.
```

##### Clean up the network configuration (flannel)

```sh
$ systemctl stop kubelet
$ systemctl stop docker
$ rm -rf /etc/cni/net.d/*
$ ifconfig cni0 down
$ ifconfig flannel.1 down
$ ifconfig docker0 down
$ ip link delete cni0
$ ip link delete flannel.1
$ systemctl start docker
```

##### Clean up iptables

```sh
$ iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
```

##### Clean up ipvs

```sh
$ ipvsadm --clear
```

##### Clean up the external etcd

_`If an external etcd is used, kubeadm reset does not delete any data in etcd. This means that if you run kubeadm init again against the same etcd endpoints, you will see the state of the previous cluster.`_

To clean up the data in etcd, use a client such as etcdctl, for example:

```sh
$ etcdctl del "" --prefix
```
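`etcdctl del "" --prefix` removes every key, and against the TLS-secured external etcd used in this post the command also needs the certificate flags. A more cautious sketch follows. It assumes the endpoints and certificate paths from the etcd setup earlier in this post, and that the cluster uses the kube-apiserver default key prefix `/registry` (adjust it if `--etcd-prefix` was changed). The helper names `etcd_cmd` and `count_k8s_keys` and the `ETCDCTL` override are hypothetical, not part of etcd.

```sh
# Assumed values: endpoints and certificate paths match the external etcd deployed earlier.
ETCD_ENDPOINTS="https://172.16.100.100:2379,https://172.16.100.101:2379,https://172.16.100.102:2379"

# Wrapper that adds the v3 API and TLS flags to every call.
# ETCDCTL (hypothetical override) may point at a different etcdctl binary.
etcd_cmd() {
  ETCDCTL_API=3 "${ETCDCTL:-/opt/etcd/bin/etcdctl}" \
    --cacert=/opt/etcd/ssl/ca.pem \
    --cert=/opt/etcd/ssl/server.pem \
    --key=/opt/etcd/ssl/server-key.pem \
    --endpoints="$ETCD_ENDPOINTS" "$@"
}

# Count the keys under /registry to see how much a delete would remove.
count_k8s_keys() {
  etcd_cmd get /registry --prefix --keys-only | grep -c '^/registry'
}

# Usage (the second command is destructive, run it only after checking the count):
#   count_k8s_keys
#   etcd_cmd del /registry --prefix
```

Deleting only the `/registry` prefix leaves any unrelated data in the same etcd untouched, which is the main reason to prefer it over an empty prefix.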

兜兜    2021-09-08 14:00:24    2022-01-25 09:20:51   

kubernetes calico flannel
Calico networking

```sh
Calico consists of the following main components:
Felix: deployed as a DaemonSet and running on every Node; maintains the routing rules and ACL rules on the host.
BGP Client (BIRD): distributes the routes that Felix writes into the kernel to the rest of the Calico network.
Etcd: distributed key-value store that holds Calico's policy and network configuration state.
calicoctl: command-line tool for managing advanced policy and networking.
```

#### 1. Uninstall flannel

1.1 Delete the flannel pods from k8s

```sh
kubectl delete -f kube-flanneld.yaml
```

1.2 Delete the flannel interfaces

```sh
$ ip link delete cni0
$ ip link delete flannel.1 # delete the flannel interface; in udp mode the interface is named flannel.0
```

1.3 Check the routes

```sh
ip route
```

```sh
default via 172.16.13.254 dev ens192 proto static metric 100
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 172.16.13.81 dev ens192
10.244.2.0/24 via 172.16.13.82 dev ens192
172.16.13.0/24 dev ens192 proto kernel scope link src 172.16.13.80 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
```

1.4 Delete the routes

```sh
ip route delete 10.244.1.0/24 via 172.16.13.81 dev ens192
ip route delete 10.244.2.0/24 via 172.16.13.82 dev ens192
```

`Note: do not flush the firewall rules`

#### 2. Install calico

##### 2.1 Download the calico manifest

```sh
wget https://docs.projectcalico.org/manifests/calico-etcd.yaml
```

##### 2.2 Set the etcd certificates

Get the base64-encoded certificates and key

```sh
$ cat /etc/kubernetes/pki/etcd/ca.crt |base64 -w 0
$ cat /etc/kubernetes/pki/etcd/server.crt |base64 -w 0
$ cat /etc/kubernetes/pki/etcd/server.key |base64 -w 0
```

Edit calico-etcd.yaml

```sh
$ vim calico-etcd.yaml
```

```yaml
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: calico-etcd-secrets
  namespace: kube-system
data:
  etcd-ca: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM0VENDQWNtZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFTTVJBd0RnWURWUVFERXdkbGRHTmsKTFdOaE1CNFhEVEl4TURrd05EQXl..."
  # etcd-ca above: output of cat /etc/kubernetes/pki/etcd/ca.crt |base64 -w 0
  etcd-cert: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURQVENDQWlXZ0F3SUJBZ0lJUmsrNkR4Szdrb0V3RFFZSktvWklodmNOQVFFTEJRQXdFakVRTUE0R0ExVUUKQXhNSFpYUmpaQzFqWVRBZUZ3M" # output of cat /etc/kubernetes/pki/etcd/server.crt |base64 -w 0
  etcd-key: "LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcFFJQkFBS0NBUUVBdFNTTHBDMUxyWjdTcTBCTmh5UjlTYi83OThXTHJxNHNoZUUzc2RKQVA2UzJpR0VxCnBtUVh" # output of cat /etc/kubernetes/pki/etcd/server.key |base64 -w 0
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  etcd_endpoints: "https://172.16.13.80:2379" # etcd address
  etcd_ca: "/calico-secrets/etcd-ca"
  etcd_cert: "/calico-secrets/etcd-cert"
  etcd_key: "/calico-secrets/etcd-key"
...
            - name: IP_AUTODETECTION_METHOD
              value: "interface=ens.*" # pattern used to find the node's network interface
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Never" # Always means IPIP mode; set to Never to use plain BGP
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16" # set to the pod CIDR
...
```

Deploy calico

```sh
$ kubectl apply -f calico-etcd.yaml
```

Check the status of the calico pods

```sh
$ kubectl get pods -n kube-system
```

```sh
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-5499fb6db5-w4b4z   1/1     Running   0          15h
calico-node-569mh                          1/1     Running   0          15h
calico-node-g6m6j                          1/1     Running   0          15h
calico-node-g7p7w                          1/1     Running   0          15h
coredns-7f89b7bc75-gdzxn                   0/1     pending   13         4d3h
coredns-7f89b7bc75-s5shx                   0/1     pending   13         4d3h
etcd-shudoon101                            1/1     Running   1          4d3h
kube-apiserver-shudoon101                  1/1     Running   1          4d3h
kube-controller-manager-shudoon101         1/1     Running   1          4d3h
kube-proxy-dpvzs                           1/1     Running   0          4d2h
kube-proxy-svckb                           1/1     Running   1          4d3h
kube-proxy-xlqvh                           1/1     Running   0          4d2h
kube-scheduler-shudoon101                  1/1     Running   2          4d3h
```

Recreate coredns (its pods have no network connectivity)

```sh
# Note: this exports every Deployment in kube-system (coredns and calico-kube-controllers), so both are recreated
$ kubectl get deployment -n kube-system -o yaml >coredns.yaml
$ kubectl delete -f coredns.yaml
$ kubectl apply -f coredns.yaml
```

Check the coredns network information

```sh
$ kubectl get pods -n kube-system -o wide
```

```sh
NAME                                       READY   STATUS    RESTARTS   AGE    IP               NODE         NOMINATED NODE   READINESS GATES
calico-kube-controllers-5499fb6db5-79g4k   1/1     Running   0          3m6s   172.16.13.80     shudoon101   <none>           <none>
calico-node-hr46s                          1/1     Running   0          90m    172.16.13.81     shudoon102   <none>           <none>
calico-node-n5h78                          1/1     Running   0          90m    172.16.13.82     shudoon103   <none>           <none>
calico-node-vmrbq                          1/1     Running   0          90m    172.16.13.80     shudoon101   <none>           <none>
coredns-7f89b7bc75-c874x                   1/1     Running   0          3m6s   10.244.236.192   shudoon101   <none>           <none>
coredns-7f89b7bc75-ssv86                   1/1     Running   0          3m6s   10.244.236.193   shudoon101   <none>           <none>
etcd-shudoon101                            1/1     Running   0          125m   172.16.13.80     shudoon101   <none>           <none>
kube-apiserver-shudoon101                  1/1     Running   0          125m   172.16.13.80     shudoon101   <none>           <none>
kube-controller-manager-shudoon101         1/1     Running   0          125m   172.16.13.80     shudoon101   <none>           <none>
kube-proxy-fbbkw                           1/1     Running   0          124m   172.16.13.81     shudoon102   <none>           <none>
kube-proxy-mrghg                           1/1     Running   0          125m   172.16.13.80     shudoon101   <none>           <none>
kube-proxy-t7555                           1/1     Running   0          124m   172.16.13.82     shudoon103   <none>           <none>
kube-scheduler-shudoon101                  1/1     Running   0          125m   172.16.13.80     shudoon101   <none>           <none>
```

```sh
$ ip route
```

```sh
default via 172.16.13.254 dev ens192 proto static metric 100
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 172.16.13.81 dev ens192 proto bird
10.244.2.0/24 via 172.16.13.82 dev ens192 proto bird
10.244.193.128/26 via 172.16.13.82 dev ens192 proto bird
10.244.202.0/26 via 172.16.13.81 dev ens192 proto bird
10.244.236.192 dev calif9ec1619c50 scope link
blackhole 10.244.236.192/26 proto bird
10.244.236.193 dev califa55073ed4c scope link
172.16.13.0/24 dev ens192 proto kernel scope link src 172.16.13.80 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
```

#### 3. Install calicoctl

```sh
$ wget -O /usr/local/bin/calicoctl https://github.com/projectcalico/calicoctl/releases/download/v3.11.1/calicoctl
$ chmod +x /usr/local/bin/calicoctl
```

Create the configuration file

```sh
$ mkdir /etc/calico
$ vim /etc/calico/calicoctl.cfg
```

```yaml
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "etcdv3"
  etcdEndpoints: "https://172.16.13.80:2379"
  etcdKeyFile: "/etc/kubernetes/pki/etcd/server.key"
  etcdCertFile: "/etc/kubernetes/pki/etcd/server.crt"
  etcdCACertFile: "/etc/kubernetes/pki/etcd/ca.crt"
```

```sh
$ calicoctl node status
```

```sh
Calico process is running.

IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 172.16.13.81 | node-to-node mesh | up    | 14:33:24 | Established |
| 172.16.13.82 | node-to-node mesh | up    | 14:33:25 | Established |
+--------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.
```

#### 4. Configure Route Reflector mode

```sh
By default Calico runs in node-to-node mesh (full mesh) mode: every node in the Calico cluster opens a BGP session to every other node to exchange routes. As the cluster grows, the number of sessions grows roughly with the square of the node count and the mesh becomes expensive to maintain.
Route Reflector (RR) mode addresses this: one or more Calico nodes act as route reflectors, and the other nodes learn their routes from these RR nodes.

calicoctl node status shows that BGP starts in node-to-node mesh form. In this mode every k8s node acts like a BGP loudspeaker that keeps announcing its routes to all other nodes. A cluster of hundreds of nodes means every node maintains hundreds of sessions, and each added node needs a new session to every existing node, which consumes a lot of network resources. With route reflectors you pick a few larger nodes and let all other nodes peer only with them. It is like a company without a group chat, where you have to contact every colleague one by one; create a group and everyone receives the message. Choose one or more nodes as route reflectors, preferably at least 2 to 3, so that one can act as a backup and maintenance on one does not affect the others.
```

##### 4.1 Disable the node-to-node BGP mesh

Add a default BGP configuration and set nodeToNodeMeshEnabled and asNumber:

```sh
$ cat bgp.yaml
```

```yaml
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false # disable the node-to-node mesh
  asNumber: 64512 # obtain with: calicoctl get nodes --output=wide
```

Apply it with `calicoctl apply -f bgp.yaml`.

##### 4.2 Check the BGP configuration; MESHENABLED should be false

```sh
$ calicoctl get bgpconfig
```

```sh
NAME      LOGSEVERITY   MESHENABLED   ASNUMBER
default   Info          false         64512
```

##### 4.3 Designate a node as route reflector

To let the BGPPeer resource select nodes easily, match them with a label selector, which can reference k8s node labels. Label the node that will act as route reflector; here shudoon102 gets the label.

```sh
$ kubectl label node shudoon102 route-reflector=true
```

##### 4.4 Configure the route reflector node and set the cluster ID

```sh
$ calicoctl get node shudoon102 -o yaml > node.yaml
```

```yml
apiVersion: projectcalico.org/v3
kind: Node
metadata:
  annotations:
    projectcalico.org/kube-labels: '{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"shudoon102","kubernetes.io/os":"linux","route-reflector":"true"}'
  creationTimestamp: "2021-09-09T03:54:27Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: shudoon102
    kubernetes.io/os: linux
    route-reflector: "true"
  name: shudoon102
  resourceVersion: "27093"
  uid: 3c1f56e8-4c35-46d7-9f46-ef20984fec41
spec:
  bgp:
    ipv4Address: 172.16.13.81/24
    routeReflectorClusterID: 244.0.0.1 # cluster ID (line added by hand)
  orchRefs:
  - nodeName: shudoon102
    orchestrator: k8s
```

Apply the configuration

```sh
$ calicoctl apply -f node.yaml
```

##### 4.5 Configure the other nodes to peer with the reflector

```sh
$ cat bgp1.yaml
```

```yml
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-with-route-reflectors
spec:
  nodeSelector: all() # all nodes
  peerSelector: route-reflector == 'true'
```

```sh
$ calicoctl apply -f bgp1.yaml
Successfully applied 1 'BGPPeer' resource(s)
```

```sh
$ calicoctl get bgppeer
NAME                         PEERIP   NODE    ASN
peer-with-route-reflectors            all()   0
```

```sh
$ calicoctl node status
```

```sh
Calico process is running.

IPv4 BGP status
+--------------+---------------+-------+----------+-------------+
| PEER ADDRESS |   PEER TYPE   | STATE |  SINCE   |    INFO     |
+--------------+---------------+-------+----------+-------------+
| 172.16.13.81 | node specific | up    | 08:50:34 | Established |
+--------------+---------------+-------+----------+-------------+
```

##### 4.6 Add more route reflectors

Now add further route reflectors. For clusters of up to 100 nodes, 2-3 route reflectors are recommended.

```sh
$ kubectl label node shudoon103 route-reflector=true
```

```sh
$ calicoctl get node shudoon103 -o yaml > node2.yaml
```

```yml
apiVersion: projectcalico.org/v3
kind: Node
metadata:
  annotations:
    projectcalico.org/kube-labels: '{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"shudoon103","kubernetes.io/os":"linux","route-reflector":"true"}'
  creationTimestamp: "2021-09-09T03:54:28Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: shudoon103
    kubernetes.io/os: linux
    route-reflector: "true"
  name: shudoon103
  resourceVersion: "29289"
  uid: da510109-75bf-4e92-9074-409f1de496b9
spec:
  bgp:
    ipv4Address: 172.16.13.82/24
    routeReflectorClusterID: 244.0.0.1 # add the cluster ID
  orchRefs:
  - nodeName: shudoon103
    orchestrator: k8s
```

Apply the configuration

```sh
$ calicoctl apply -f node2.yaml
```

Check the calico node status

```sh
$ calicoctl node status
```

```sh
Calico process is running.

IPv4 BGP status
+--------------+---------------+-------+----------+-------------+
| PEER ADDRESS |   PEER TYPE   | STATE |  SINCE   |    INFO     |
+--------------+---------------+-------+----------+-------------+
| 172.16.13.81 | node specific | up    | 08:50:34 | Established |
| 172.16.13.82 | node specific | up    | 08:54:47 | Established |
+--------------+---------------+-------+----------+-------------+
```

#### Questions:

`a. How do I uninstall the calico network?`

```sh
$ kubectl delete -f calico-etcd.yaml
```

Reference: https://blog.51cto.com/u_14143894/2463392
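To put numbers on the full-mesh versus route-reflector trade-off discussed in the Route Reflector section above: a full mesh of n nodes needs n(n-1)/2 BGP sessions, while the `all()` to `route-reflector == 'true'` peering used in this post needs one session from each ordinary node to each reflector, plus the sessions among the reflectors themselves. The sketch below is arithmetic only; `mesh_sessions` and `rr_sessions` are hypothetical helper names, not Calico commands.

```sh
# Sessions in a full node-to-node mesh of N nodes: N*(N-1)/2
mesh_sessions() {
  echo $(( $1 * ($1 - 1) / 2 ))
}

# Sessions with R route reflectors among N nodes:
# each of the N-R ordinary nodes peers with every reflector,
# and the R reflectors also peer with each other.
rr_sessions() {
  echo $(( $2 * ($1 - $2) + $2 * ($2 - 1) / 2 ))
}

mesh_sessions 100   # prints 4950
rr_sessions 100 3   # prints 294
```

For the three-node cluster in this post with two reflectors, both formulas give 3 sessions, so the saving only shows up as the cluster grows.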

兜兜    2021-08-26 18:05:39    2021-08-26 18:09:23   

kubernetes k8s
#### 1. Find the pod's Docker container ID

```bash
$ kubectl get pod -n kube-system nginx-ingress-controller-8489c5b8c4-fccs5 -o json
```

```json
...
"containerStatuses": [
    {
        "containerID": "docker://eda6562ab8d504599016d4dba8ceea4b9b255dbf97446f044ce89ad4410ab49a",
        "image": "registry-vpc.cn-shenzhen.aliyuncs.com/acs/aliyun-ingress-controller:v0.44.0.3-8e83e7dc6-aliyun",
        "imageID": "docker-pullable://registry-vpc.cn-shenzhen.aliyuncs.com/acs/aliyun-ingress-controller@sha256:7238b6230b678b312113a891ad5f9f7bbedc7839a913eaaee0def8aa748c3313",
        "lastState": {
            "terminated": {
                "containerID": "docker://e9a6d774429c0385be4f3a3129860677a4380277bfe5647563ae4541335111a7",
                "exitCode": 255,
                "finishedAt": "2021-07-09T06:48:39Z",
                "reason": "Error",
                "startedAt": "2021-07-08T03:29:36Z"
            }
        },
        "name": "nginx-ingress-controller",
        "ready": true,
        "restartCount": 1,
        "started": true,
        "state": {
            "running": {
                "startedAt": "2021-07-09T06:48:56Z"
            }
        }
    }
],
...
```

#### 2. Get the interface index of the container's veth peer

The `iflink` value of `eth0` inside the container is the interface index of its veth peer on the host.

```bash
$ docker exec eda6562ab8d504599016d4dba8ceea4b9b255dbf97446f044ce89ad4410ab49a /bin/bash -c 'cat /sys/class/net/eth0/iflink'
```

```bash
7
```

#### 3. Find the host veth interface with that index

```bash
$ IFLINK_INDEX=7 # the value returned by the previous step
$ for i in /sys/class/net/veth*/ifindex; do grep -l "^${IFLINK_INDEX}\$" "$i"; done
```

```bash
/sys/class/net/vethfe247b7f/ifindex
```

#### 4. Capture the pod's traffic with tcpdump

```bash
$ tcpdump -i vethfe247b7f -nnn
```

```bash
listening on vethfe247b7f, link-type EN10MB (Ethernet), capture size 262144 bytes
18:04:47.362688 IP 100.127.5.192.25291 > 10.151.0.78.80: Flags [S], seq 3419414283, win 2920, options [mss 1460,sackOK,TS val 2630688633 ecr 0,nop,wscale 0], length 0
18:04:47.362704 IP 10.151.0.78.80 > 100.127.5.192.25291: Flags [S.], seq 3153672948, ack 3419414284, win 65535, options [mss 1460,sackOK,TS val 1408463572 ecr 2630688633,nop,wscale 9], length 0
18:04:47.362857 IP 100.127.5.192.25291 > 10.151.0.78.80: Flags [.], ack 1, win 2920, options [nop,nop,TS val 2630688633 ecr 1408463572], length 0
```
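Steps 2 and 3 above can be folded into a small helper. This is a sketch under assumptions: `strip_runtime_prefix` and `find_veth_by_index` are hypothetical names, the optional second argument exists only so the lookup can be pointed at a different sysfs root, and a Docker runtime is assumed as in the steps above.

```bash
# Remove the runtime scheme from a containerID such as "docker://eda65..."
strip_runtime_prefix() {
  printf '%s\n' "${1#*://}"
}

# Print the name of the host veth interface whose ifindex equals INDEX.
# Usage: find_veth_by_index INDEX [SYSFS_NET_ROOT]
find_veth_by_index() {
  local index="$1" root="${2:-/sys/class/net}" f
  for f in "$root"/veth*/ifindex; do
    [ -r "$f" ] || continue          # also skips the unexpanded glob when nothing matches
    if [ "$(cat "$f")" = "$index" ]; then
      basename "$(dirname "$f")"
      return 0
    fi
  done
  return 1                           # no interface with that index
}

# Usage on the node that runs the pod (<pod> is a placeholder):
#   cid=$(strip_runtime_prefix "$(kubectl get pod -n kube-system <pod> -o jsonpath='{.status.containerStatuses[0].containerID}')")
#   idx=$(docker exec "$cid" cat /sys/class/net/eth0/iflink)
#   tcpdump -i "$(find_veth_by_index "$idx")" -nnn
```

The `veth*` pattern matches the interface naming seen in this post; some CNI plugins use a different prefix for host-side interfaces (Calico, for example, names them `cali*`), in which case the glob needs adjusting.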