1. Pre-installation preparation
!!! The following steps must be performed on every master and node machine.
1.1. Disable SELinux and the firewall
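The original does not list the commands for this step; on CentOS 7 a typical sequence (run as root on every machine) would be:

```shell
# Put SELinux into permissive mode now, and disable it permanently
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# Stop firewalld and keep it from starting on boot
systemctl stop firewalld && systemctl disable firewalld
```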
1.2. Add hosts entries (append to /etc/hosts on every machine)
192.168.122.160 master
192.168.122.161 node1
192.168.122.162 node2
1.3. Disable swap
Swap plays a role similar to "virtual memory" on Windows; by default the kubelet refuses to start while swap is enabled.
swapoff -a && sed -i 's/.*swap.*/#&/' /etc/fstab
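To confirm swap is fully off after the command above:

```shell
free -m   # the Swap row should now show 0 total / 0 used / 0 free
```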
1.4. Synchronize time
yum install -y ntpdate
ntpdate time.windows.com
1.5. Adjust kernel parameters
Configure the kernel so that traffic crossing Linux bridges is also passed through the iptables/netfilter framework (required by kube-proxy and most CNI plugins).
modprobe br_netfilter   # the bridge-nf sysctls below only exist once this module is loaded
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system   # sysctl -p only reads /etc/sysctl.conf; --system also loads /etc/sysctl.d/*.conf
2. Installing a Kubernetes cluster with kubeadm
2.1. Add the Kubernetes and Docker yum repositories
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes repo
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
enabled=1
EOF
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -P /etc/yum.repos.d/
2.2. Install docker, kubelet, kubeadm, and kubectl
yum install docker-ce kubelet kubeadm kubectl -y # installs the latest versions by default (docker-ce comes from the repo added above; the plain "docker" package in CentOS extras is an outdated 1.13 build)
Note: to pin a specific version instead:
yum install kubelet-1.23.4-0 kubeadm-1.23.4-0 kubectl-1.23.4-0 -y
2.3. Configure an Aliyun registry mirror (faster in mainland China) and start docker
tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://dlbpv56y.mirror.aliyuncs.com"]
}
EOF
systemctl enable docker && systemctl start docker
2.4. Enable and start kubelet (it will keep restarting until kubeadm init/join runs; this is expected)
systemctl enable kubelet && systemctl start kubelet
3. Creating the Kubernetes cluster
3.1. Initialize the cluster (run on the master node only; --apiserver-advertise-address is the master's IP, and --pod-network-cidr matches flannel's default subnet)
kubeadm init \
  --apiserver-advertise-address=192.168.122.160 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.4 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
3.2. On success, kubeadm prints output like the following; follow its instructions
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.122.160:6443 --token ekz7cv.5t8w82k1pp8zv00r \
--discovery-token-ca-cert-hash sha256:e2022c1ecb0d773ae27ea463232cb1da3325bb329269b668c75836e941efaebe \
--ignore-preflight-errors=Swap,NumCPU
3.3. Run the join command from the output above on each worker node to add it to the cluster
kubeadm join 192.168.122.160:6443 --token ekz7cv.5t8w82k1pp8zv00r \
--discovery-token-ca-cert-hash sha256:e2022c1ecb0d773ae27ea463232cb1da3325bb329269b668c75836e941efaebe
3.4. Install the network plugin (flannel)
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
3.5. Check the cluster status (nodes become Ready once the network plugin is up)
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 16h v1.23.4
node1 Ready <none> 15h v1.23.4
node2 Ready <none> 15h v1.23.4
3.6. Test the cluster
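The output below assumes an nginx Deployment and NodePort Service already exist; their creation is not shown in the original, but a sketch matching the ports in the output (service port 8080 forwarding to nginx's container port 80) would be:

```shell
# Create a test nginx deployment and expose it as a NodePort service
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=8080 --target-port=80 --type=NodePort
```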
[root@master ~]# kubectl get pod,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-54b478dd8f-7prdg 1/1 Running 1 100m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 16h
service/nginx NodePort 10.106.236.18 <none> 8080:31271/TCP 99m
3.6.1. Test with curl
[root@master ~]# curl 10.106.236.18:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
4. Regenerating the join token after it expires (24-hour default TTL)
4.1. Generate a new token
4.1.1. Method 1: create a token and compute the CA certificate hash manually
[root@master ~]# kubeadm token create
0w3a92.ijgba9ia0e3scicg
[root@master ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
0w3a92.ijgba9ia0e3scicg 23h 2019-09-08T22:02:40+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
t0ehj8.k4ef3gq0icr3etl0 22h 2019-09-08T20:58:34+08:00 authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
# compute the value for --discovery-token-ca-cert-hash from the cluster CA certificate
[root@k8s-master ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
ce07a7f5b259961884c55e3ff8784b1eda6f8b5931e6fa2ab0b30b6a4234c09a
Then join the cluster with the new token and hash:
kubeadm join 192.168.122.160:6443 --token 0w3a92.ijgba9ia0e3scicg \
--discovery-token-ca-cert-hash sha256:ce07a7f5b259961884c55e3ff8784b1eda6f8b5931e6fa2ab0b30b6a4234c09a
4.1.2. Method 2: have kubeadm print the full join command
[root@master ~]# kubeadm token list # empty output means all tokens have expired and a new one is needed
# print a join command with a new time-limited token
[root@master ~]# kubeadm token create --print-join-command
kubeadm join 192.168.122.160:6443 --token 7k6yen.gs4p3v8ojf6ahxjy --discovery-token-ca-cert-hash sha256:e2022c1ecb0d773ae27ea462278cb1da3325aa329269b668c75836e941efaebe
# print a join command with a token that never expires (--ttl 0; convenient but less secure)
[root@master ~]# kubeadm token create --ttl 0 --print-join-command
kubeadm join 192.168.122.160:6443 --token 7k6yen.gs4p3v8ojfsf4xjy --discovery-token-ca-cert-hash sha256:e2022c1ecb0d773ae27ea462278cb1da3325aa329249b668c7523424e941efaebe
5. Pitfalls encountered during deployment and their fixes
5.1. kubeadm init fails with kubelet health-check errors
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
5.1.1. Fix: cgroup driver mismatch — since v1.22 kubeadm configures the kubelet for the systemd cgroup driver, while docker defaults to cgroupfs; switch docker to systemd
5.1.1.1. Edit the docker daemon config and add the exec-opts line
[root@master ~]# vim /etc/docker/daemon.json
...
"exec-opts": ["native.cgroupdriver=systemd"],
...
5.1.1.2. Restart docker, reset, and re-run the initialization
[root@master ~]# systemctl restart docker
[root@master ~]# kubeadm reset -f
[root@master ~]# kubeadm init --apiserver-advertise-address=192.168.122.160 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.23.4 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16
# (if the error occurred on a worker node, re-run the kubeadm join command from section 3.3 instead)
5.2. Errors when installing the calico network plugin
Warning FailedCreatePodSandBox 11s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "03779f841452fd3edd9df4f73944d97c3b745091f2c112c7a3896d557a03fc5a" network for pod "calico-kube-controllers-784dcb7597-bmlsm": networkPlugin cni failed to set up pod "calico-kube-controllers-784dcb7597-bmlsm_kube-system" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Normal SandboxChanged 8s (x12 over 21s) kubelet Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 7s (x4 over 10s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "4c4100e00aa029744dada2757c33ac5e98a1231cb1f8d0c61fae64561d18471c" network for pod "calico-kube-controllers-784dcb7597-bmlsm": networkPlugin cni failed to set up pod "calico-kube-controllers-784dcb7597-bmlsm_kube-system" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
5.2.1. Fix
5.2.1.1. Delete the /var/lib/calico directory and the calico files under /etc/cni/net.d/ on every node
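The exact cleanup commands are not given in the original; assuming the paths from the error message above, something like the following (run as root on every node) would do it:

```shell
# Remove leftover calico state referenced by the error message
rm -rf /var/lib/calico
rm -f /etc/cni/net.d/*calico*

# Restart kubelet so the CNI configuration is re-read
systemctl restart kubelet
```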
5.2.1.2. Install the flannel network plugin instead
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
6. Running a test pod
6.1. Start a temporary pod for testing (either command works; --rm deletes the pod on exit)
kubectl run dns-test -it --rm --image=busybox:1.28.4 -- sh
kubectl run busybox --image=busybox:1.28 --restart=Never --rm -it -- sh
nslookup kubernetes.default.svc.cluster.local # resolve the cluster-internal DNS name from inside the pod