etcd CPU limit customization
Categories: programming
Tags: kubernetes gagana

A Prometheus alarm fired because a host was under high system load. Looking at my Beelink Mini S12 Pro,
it turns out the root drive is a SATA M.2 SSD rather than an NVMe drive. Since this is a control plane node, etcd,
along with the other workloads, is saturating the disk.
It is acceptable for the other workloads to slow down, but etcd gets grumpy about disk saturation. To resolve this I
would like to customize the etcd static pod on this node to boost resources.requests.cpu. Based on the 1.35 k8s docs,
LocalEtcd
does not support modifying the static pod spec.
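etcd's grumpiness about slow disks shows up in its WAL fsync latency. Assuming Prometheus is already scraping the etcd metrics endpoint, a query along these lines (a sketch, using the metric name from etcd's standard instrumentation) surfaces the p99 fsync time per instance:

```promql
# p99 WAL fsync latency over the last 5 minutes; the etcd docs
# suggest this should generally stay in the low milliseconds
histogram_quantile(0.99,
  rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m]))
```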
Kubeadm issue #2195 discusses exactly this. Effectively, the default is tuned for a machine with 2 vCPUs and hardwired to a 100m CPU request. They recommend users manually override the static pod spec. Time to give it a whirl!
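As a sketch of the manual override, the resources stanza in the etcd static pod spec might be bumped like this (the 500m figure is my own choice for this node, not a kubeadm recommendation; everything else in the manifest stays as kubeadm generated it):

```yaml
# /etc/kubernetes/manifests/etcd.yaml (excerpt)
spec:
  containers:
  - name: etcd
    resources:
      requests:
        cpu: 500m  # bumped from the kubeadm default of 100m
```

The kubelet watches the manifests directory, so saving the file is enough to trigger a pod recreation.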
After updating the static pod manifest at /etc/kubernetes/manifests/etcd.yaml, the kubelet automatically
recreated the pod. The restart took approximately 2 minutes, bringing the control plane down for a brief period.
etcd was able to catch up and the cluster came back online. The slow restart may be a result of this instance no
longer being the cluster's leader; regardless, overall system load has dropped significantly.
Longer term, replacing the root drive with an NVMe drive may be the best solution. However, it appears the secondary M.2 slot on the Beelink Mini S12 Pro may not support NVMe, meaning I would need to rebuild the entire system.