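After deploying metrics-server 0.3.6 in a Kubernetes 1.16 cluster, no data ever showed up: core metrics such as CPU and memory utilization were missing.
1. Install metrics-server
a. Get the deployment files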
https://github.com/kubernetes-incubator/metrics-server/
Either wget or git clone works.
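As a rough sketch of the download step: the helper names archive_url and fetch_metrics_server are hypothetical, and the tarball URL is an assumption based on GitHub's usual /archive/<tag>.tar.gz layout, so adjust it if the download fails.

```shell
# Sketch: build the download URL for a tagged metrics-server release.
# archive_url and fetch_metrics_server are hypothetical helpers; the URL
# pattern is assumed from GitHub's standard tag-archive layout.
REPO_URL="https://github.com/kubernetes-incubator/metrics-server"

archive_url() {
  # $1 is the release tag, e.g. v0.3.6
  printf '%s/archive/%s.tar.gz\n' "$REPO_URL" "$1"
}

fetch_metrics_server() {
  # Download the tarball for the given tag and unpack it in the
  # current directory.
  wget -O "metrics-server-$1.tar.gz" "$(archive_url "$1")" &&
    tar -xzf "metrics-server-$1.tar.gz"
}

# Usage (needs network access):
#   fetch_metrics_server v0.3.6
# Alternative with git (the checkout lands in metrics-server/ instead):
#   git clone "$REPO_URL" && cd metrics-server && git checkout v0.3.6
```

With the tarball route, the unpacked directory should match the metrics-server-0.3.6 path used in the next step; with git clone, change the cd target accordingly.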
b. Apply the deployment files
cd metrics-server-0.3.6/deploy
# My cluster runs Kubernetes 1.16, which is newer than 1.8, so apply the 1.8+ directory
kubectl apply -f 1.8+/
2. Problems encountered
kubectl top nodes always returned: error: metrics not available yet
Check the logs with kubectl logs metricxxx -n kube-system:
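To avoid looking up the pod name each time, the logs can be selected by label and filtered down to the scrape failures. This is only a sketch: filter_fetch_errors is a hypothetical helper, and the k8s-app=metrics-server label is an assumption about the deploy manifests (check with kubectl get pods -n kube-system --show-labels).

```shell
# Read metrics-server log text on stdin and print only the lines that
# report a failed scrape. filter_fetch_errors is a hypothetical helper.
filter_fetch_errors() {
  # grep exits non-zero when nothing matches; "|| true" keeps the
  # function from failing in that case.
  grep -F 'unable to fetch' || true
}

# Usage (assumes the pod carries the label k8s-app=metrics-server):
#   kubectl -n kube-system logs -l k8s-app=metrics-server | filter_fetch_errors
```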
a. Pitfall 1
unable to fetch metrics from Kubelet node21 (node21): Get https://node21:10250....
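This happens because the cluster was built with kubeadm, so the nodes use self-signed SSL certificates rather than TLS Bootstrap. The fix is to add one setting to the kubelet config file on every node. First find the config file path: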
systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2019-11-04 15:34:23 CST; 6s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 10395 (kubelet)
Tasks: 15
Memory: 17.6M
CPU: 416ms
CGroup: /system.slice/kubelet.service
└─10395 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml (remaining flags omitted)
The --config flag gives the config path:
/var/lib/kubelet/config.yaml
Add this line to it:
serverTLSBootstrap: true
root@node21:/var/lib/kubelet# systemctl daemon-reload
root@node21:/var/lib/kubelet# systemctl restart kubelet.service
That completes the config change.
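Since the same edit has to be repeated on every node, it may help to make it idempotent. The sketch below uses a hypothetical helper, enable_server_tls_bootstrap, and assumes the key is a top-level entry in the kubelet config file; it leaves any existing serverTLSBootstrap setting untouched.

```shell
# Append "serverTLSBootstrap: true" to a kubelet config file unless the
# key is already present. enable_server_tls_bootstrap is a hypothetical
# helper; the default path is the one found via systemctl status kubelet.
enable_server_tls_bootstrap() {
  local cfg="${1:-/var/lib/kubelet/config.yaml}"
  if grep -Eq '^serverTLSBootstrap:' "$cfg"; then
    # Key exists (whatever its value): do not modify the file.
    echo "serverTLSBootstrap already present in $cfg"
  else
    cp "$cfg" "$cfg.bak"   # keep a backup next to the original
    # Leading newline guards against a file with no trailing newline.
    printf '\nserverTLSBootstrap: true\n' >> "$cfg"
    echo "added serverTLSBootstrap: true to $cfg"
  fi
}

# Usage (as root on each node), followed by a kubelet restart:
#   enable_server_tls_bootstrap /var/lib/kubelet/config.yaml
#   systemctl daemon-reload && systemctl restart kubelet.service
```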
After the kubelet restarts, a new CSR appears:
kubectl get csr
xxx
Then run
kubectl certificate approve xxx
to approve the certificate signing request. Repeat this for each worker node so that every node's certificate is reissued.
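With several nodes, approving requests one by one gets tedious. The following sketch lists the pending CSR names and approves each in turn. The helper names are hypothetical, and it assumes the default kubectl get csr table, where the header is the first line and CONDITION is the last column. Review the pending requests before approving them in bulk.

```shell
# Read `kubectl get csr` table output on stdin and print the names of
# requests whose last column (CONDITION) is exactly "Pending".
# pending_csr_names and approve_pending_csrs are hypothetical helpers.
pending_csr_names() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

approve_pending_csrs() {
  local names name
  names="$(kubectl get csr | pending_csr_names)"
  if [ -z "$names" ]; then
    echo "no pending CSRs"
    return 0
  fi
  # Word splitting on $names is intended: one CSR name per iteration.
  for name in $names; do
    kubectl certificate approve "$name"
  done
}

# Usage:
#   kubectl get csr | pending_csr_names   # inspect first
#   approve_pending_csrs
```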
b. Pitfall 2
Even after fixing the problem above, data still did not appear. The logs showed the following:
E1104 08:05:44.8 reststorage.go:135
unable to fetch node metrics for node "node21": no metrics known for node
According to the project's issues, the following arguments need to be added to the metrics-server container:
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server-amd64:v0.3.6
  imagePullPolicy: Always
  args:                        # added
  - --kubelet-insecure-tls     # added
  - --kubelet-preferred-address-types=InternalDNS,InternalIP,ExternalDNS,ExternalIP,Hostname   # added
Re-apply the YAML. A few minutes after it takes effect, the data becomes available:
kubectl top node
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node20   355m         17%    1300Mi          68%
node21   207m         5%     1562Mi          19%
node22   512m         12%    2521Mi          31%
node23   449m         11%    1631Mi          23%
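Because the metrics take a few minutes to show up, a small polling loop can replace repeated manual checks. This is a sketch only; retry_until is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# retry_until is a hypothetical helper:
#   retry_until <attempts> <delay_seconds> <command...>
# Returns 0 on the first success, 1 if every attempt fails.
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Output of the probed command is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: poll every 10 seconds, up to 30 times, then print the table.
#   retry_until 30 10 kubectl top nodes && kubectl top nodes
```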
Check pod resource usage:
kubectl top pod -n kube-system
NAME                                       CPU(cores)   MEMORY(bytes)
calico-kube-controllers-6d85fdfbd8-8vsm7   3m           19Mi
calico-node-47h4w                          36m          43Mi
calico-node-g4278                          34m          43Mi
calico-node-mmh7k                          37m          67Mi
calico-node-tnfkb                          35m          42Mi
coredns-5644d7b6d9-sbxx7                   4m           18Mi
coredns-5644d7b6d9-tfnj8                   5m           21Mi
etcd-node20                                24m          82Mi
kube-apiserver-node20                      93m          585Mi
kube-controller-manager-node20             18m          69Mi
kube-proxy-49cv9                           1m           21Mi
kube-proxy-9zb5z                           1m           21Mi
kube-proxy-t266f                           1m           23Mi
kube-proxy-x5pvt                           2m           21Mi
kube-scheduler-node20                      2m           27Mi
metrics-server-8779b8f8b-qnqj6             3m           18Mi
tiller-deploy-77855d9dcf-c775s             1m           17Mi
Some of the commands used:
kubectl top nodes
kubectl top pod
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods"
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"