
k0s FAQ

kubeconfig

# admin kubeconfig on a controller node
cat /var/lib/k0s/pki/admin.conf
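
k0s can also emit the admin kubeconfig directly; a minimal sketch using the built-in subcommand (output path is just an example):

sudo k0s kubeconfig admin > ~/.kube/config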

Manual download & install

ver=$(curl -sSL https://docs.k0sproject.io/stable.txt)
# download the binary via a GitHub proxy mirror
curl -LO "https://ghproxy.com/https://github.com/k0sproject/k0s/releases/download/${ver}/k0s-${ver}-amd64"
chmod 755 k0s-${ver}-amd64
# back up the currently installed binary before replacing it
cp $(which k0s) k0s.last
sudo mkdir -p /usr/local/bin/
sudo cp k0s-${ver}-amd64 /usr/local/bin/k0s

# resumable download (-C -) of the airgap image bundle
curl -LOC- "https://ghproxy.com/https://github.com/k0sproject/k0s/releases/download/${ver}/k0s-airgap-bundle-${ver}-amd64"
sudo mkdir -p /var/lib/k0s/images
sudo cp k0s-airgap-bundle-${ver}-amd64 /var/lib/k0s/images/bundle_file

# verify host prerequisites
sudo k0s sysinfo

Control plane vs. Worker

  • Control plane
    • Acts as a daemon/supervisor for the other components - does not schedule workloads
    • No container engine or kubelet required
    • Not visible in kubectl get node
    • Maintains kine/etcd, api-server, scheduler, controller-manager, konnectivity-server, k0s-api
    • Control Plane High Availability
  • Worker
    • Runs kubelet
    • Depends on a CRI - defaults to containerd+runc
    • Schedules workloads (see the install sketch below)
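
A minimal install sketch of the two roles (config path and token location are examples, not from the original notes):

# controller only - supervises the control-plane components, never shows up in kubectl get node
sudo k0s install controller -c /etc/k0s/k0s.yaml
sudo k0s start
# worker - joins with a token, runs kubelet + containerd and actually takes workloads
sudo k0s install worker --token-file /path/to/worker-token
sudo k0s start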

konnectivity-server.sock: socket: too many open files

Check ulimit

# limits of the running konnectivity-server process
cat /proc/$(pidof konnectivity-server)/limits
# current number of open files
ls /proc/$(pidof konnectivity-server)/fd | wc -l
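
If the limit is too low, one possible fix - assuming k0s runs as the k0scontroller systemd service - is to raise LimitNOFILE on the unit:

sudo systemctl edit k0scontroller
# add under [Service]:
#   LimitNOFILE=65536
sudo systemctl daemon-reload
sudo systemctl restart k0scontroller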

cgroups: cgroup deleted: unknown

Failed to run kubelet failed to run Kubelet: mountpoint for cpu not found

# check which cgroup controllers are mounted
ls /sys/fs/cgroup
# on OpenRC-based systems (e.g. Alpine), start the cgroups service
service cgroups start
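
A quick check that the cpu controller is actually available (diagnostic only):

# cgroup v1 - the cpu controller should be listed and enabled
grep cpu /proc/cgroups
# cgroup v2 - cpu should appear in the enabled controllers
cat /sys/fs/cgroup/cgroup.controllers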

cni plugin not initialized

false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

Check whether the kube-proxy pod has started

no IP ranges specified
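
A quick check, assuming the default kube-system labels for kube-proxy and the kube-router CNI pods:

kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
kubectl -n kube-system get pods -l k8s-app=kube-router -o wide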

kube-router Failed to watch i/o timeout

Failed to watch *v1.Node: failed to list *v1.Node: Get "https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: i/o timeout
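
This usually means the node cannot reach the in-cluster API service IP. A rough check, assuming the default 10.96.0.1 service address:

# which route/table the service IP would use from this node
ip route get 10.96.0.1
# can the apiserver be reached over the service IP at all
curl -k --connect-timeout 5 https://10.96.0.1:443/version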

k0s controller --single vs --enable-worker

  • --single
    • Skips part of the leader-election logic
    • No additional workers can be joined
    • Storage defaults to kine - can be configured to use etcd
  • --enable-worker
    • A normal node
    • Additional workers can be joined later (see the sketch below)
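
A minimal sketch of the two invocations (config path is an example):

# single node - controller + worker in one, no further joins possible
sudo k0s controller --single
# controller that also runs a worker; other workers can still join with a token
sudo k0s controller --enable-worker --config=/etc/k0s/k0s.yaml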

konnectivity-server.sock: bind: address already in use

If the socket is not in use, it can be removed safely.

# check whether any process still has the socket open
sudo lsof -f -- /run/k0s/konnectivity-server/konnectivity-server.sock
sudo unlink /run/k0s/konnectivity-server/konnectivity-server.sock

If it is in use, check whether a previous process failed to exit cleanly, and kill it.
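
For example (matching by command line; verify with the lsof output above before killing anything):

sudo pkill -f konnectivity-server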

failed to get initial kubelet config with join token: failed to get kubelet config from API: Unauthorized"

  • Can happen after a failed start, a stop, and then another start
  • The kubelet bootstrap token is time-limited and may have expired; delete it and restart, and it will be recreated

k0s stop
rm /var/lib/k0s/kubelet-bootstrap.conf
k0s start

coredns fatal plugin/loop: Loop detected for zone

  • Loop detection for DNS resolution - usually caused by /etc/resolv.conf pointing at 127.0.0.1
  • Fix /etc/resolv.conf, then delete the coredns pods
  • https://coredns.io/plugins/loop/
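
After fixing the node's /etc/resolv.conf, recreate the pods; a sketch assuming coredns carries the usual k8s-app=kube-dns label:

kubectl -n kube-system delete pod -l k8s-app=kube-dns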

kill all

k0s stop
# should use pkill to kill the whole process tree
# killall /var/lib/k0s/bin/containerd-shim-runc-v2
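
One possible cleanup after k0s stop, matching the leftover shims by command line (check with ps first):

sudo pkill -f /var/lib/k0s/bin/containerd-shim-runc-v2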

Network configuration issues

The server has multiple LANs, with policy routing configured so that traffic arriving on the 10GbE interface goes back out over 10GbE.

# /etc/network/interfaces
auto eth4
iface eth4 inet static
address 192.168.1.10
netmask 255.255.252.0
gateway 192.168.1.1
mtu 9000
# register the tgbe routing table if missing, then send traffic from this IP through it
pre-up ip ro li tab tgbe &>/dev/null || echo '10 tgbe' >> /etc/iproute2/rt_tables
post-up ip ru add from 192.168.1.10 table tgbe
post-up ip ro add default via 192.168.76.1 dev eth4 table tgbe

This left kube unable to reach its own API: because the traffic fell into a different routing table, 192.168.1.10 could not reach 10.96.0.1.
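
A way to confirm this kind of routing-table mismatch (10.96.0.1 is the default service IP; adjust for your cluster):

ip rule show
ip route get 10.96.0.1 from 192.168.1.10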

After removing the rule, everything went back to normal.

ip ru del from 192.168.1.10

Error while reading from Writer: bufio.Scanner: token too long component=kubelet

ps aux | grep kubelet
  • k0sproject/k0s#2669
    • fixed in v1.25.6+
  • Possibly caused by an overly large configmap
  • k0s v1.23 - would not start again after a restart
  • Resolved after a system reboot
    • Restarting k0s again reproduced the same problem
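
A rough way to look for oversized configmaps (purely diagnostic, requires jq; prints data size per configmap):

kubectl -n kube-system get cm -o json \
  | jq -r '.items[] | "\(.data | tostring | length)\t\(.metadata.name)"' \
  | sort -n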

Manual start

# run in the foreground with dynamic config and an embedded worker
sudo k0s controller --config=/etc/k0s/k0s.yaml --enable-dynamic-config=true --enable-worker=true
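
The equivalent persistent setup, installing k0s as a service instead of running it in the foreground:

sudo k0s install controller --config=/etc/k0s/k0s.yaml --enable-dynamic-config --enable-worker
sudo k0s start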

failed to run kube-router: Failed to create network routing controller: failed to determine if ipset set kube-router-pod-subnets exists: ipset v7.15: Kernel error received: Invalid argument

failed to reserve container name is reserved for