Node maintenance
Available in Classic
Before use
Kubernetes is designed to be fault tolerant, so it continues operating fully or partially even if a worker node fails. The Kubernetes master node periodically receives health information about the underlying infrastructure, such as virtual machines, from worker nodes, and fails workloads over when one of them has a failure. However, this cannot cover every case; sometimes developers need to intervene and gracefully shut down the pods running on a specific worker node for maintenance purposes.
This document describes how to drain applications running on a node to other nodes when you need to intentionally take Ready nodes out of a Kubernetes cluster, for example for server termination or a maintenance shutdown.
Common misconception about Kubernetes
If there are enough resources, Kubernetes can re-schedule all the pods on a failed node to another one, so you don’t have to worry about failures that may occur in worker nodes. Moreover, Cluster Autoscaler can add new worker nodes if needed, so applications and clusters will continue to run without any problem.
These are common misconceptions about Kubernetes. Even if the kube-controller-manager on the Kubernetes master node detects that a worker node has a problem, it re-schedules the node's pods only when the pod eviction conditions are met. If you forcibly restart a running worker node while those conditions are not met, the pods scheduled on it stay down for a period of time (downtime). Therefore, before restarting a worker node, use the kubectl drain command to evict the running pods to other nodes (gracefully shutting them down) so that your service keeps running stably.
Check worker node status
The master node in Kubernetes periodically receives infrastructure health from worker nodes. Use the following command to get the status and details of a node. For more information, refer to Nodes.
$ kubectl --kubeconfig=$KUBE_CONFIG describe node $NODENAME
The conditions of a node are returned as JSON. The following example shows the conditions of a healthy node.
"conditions": [
  {
    "type": "Ready",
    "status": "True",
    "reason": "KubeletReady",
    "message": "kubelet is posting ready status",
    "lastHeartbeatTime": "2019-06-05T18:38:35Z",
    "lastTransitionTime": "2019-06-05T11:41:27Z"
  }
]
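As a quick check, the Ready status can be pulled out of this conditions JSON with standard shell tools. The following is a minimal sketch (the ready_status function name is ours, not a kubectl feature); in practice, kubectl's -o jsonpath output option can serve the same purpose.

```shell
# Minimal sketch: read conditions JSON on stdin and print the "status"
# value of the "Ready" condition. Uses only tr, grep, and sed.
ready_status() {
  tr -d ' \t\n' \
    | grep -o '"type":"Ready","status":"[^"]*"' \
    | sed 's/.*"status":"\([^"]*\)"/\1/'
}
```

For a healthy node like the example above, piping the conditions JSON through ready_status prints True.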
If the status of the Ready condition remains Unknown or False for longer than pod-eviction-timeout, an argument passed to the master node's kube-controller-manager, the node controller schedules all pods on that node for deletion. NAVER CLOUD PLATFORM's Kubernetes Service sets pod-eviction-timeout to 5 minutes.
Use kubectl drain to evict pods from a worker node
Before working on a node, such as for kernel updates or infrastructure maintenance, you can use kubectl drain to safely evict pods from the node. A safe eviction gracefully terminates the pods' containers.
The kubectl drain command ignores specific system pods that cannot be removed from a node. For more information, refer to kubectl drain.
Once kubectl drain completes successfully, all pods have been safely evicted from the worker node. You can then stop, restart, or terminate the virtual machine to safely take the node out of service.
① Check the node from which you want to evict pods. To get all worker nodes in the cluster, use the following command.
$ kubectl --kubeconfig=$KUBE_CONFIG get nodes
② Enter the name of the worker node you want to drain. This marks the worker node as unschedulable and gracefully terminates all pods on it.
$ kubectl --kubeconfig=$KUBE_CONFIG drain $NODENAME
This command does not work if any of the following conditions applies.
- When pods on the worker node use emptyDir to store local data: if a pod on the worker node stores local data in an emptyDir volume, kubectl drain does not work, because deleting the pod also deletes that data. If losing the data is acceptable, run drain with the --delete-local-data flag.
- When a DaemonSet is running on the worker node: if DaemonSet pods are running on the worker node, kubectl drain does not work. Even when the node is unschedulable, the DaemonSet controller can ignore that and still schedule pods onto the node. If there are DaemonSet pods, add the --ignore-daemonsets flag to exclude them from eviction.
- When there are pods not managed by a Kubernetes controller: if the worker node has pods that are not managed by a Kubernetes controller such as Deployment, StatefulSet, DaemonSet, ReplicaSet, or Job, kubectl drain does not work, in order to protect those pods. In this case, run kubectl drain with the --force flag to remove those pods from the cluster; they will not be re-scheduled.
Example
$ kubectl --kubeconfig=$KUBE_CONFIG drain $NODENAME --delete-local-data --ignore-daemonsets
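The flags above can be combined as needed for each node. As a small illustrative helper (build_drain_cmd is a hypothetical name, not part of kubectl), the choice of flags might be assembled like this:

```shell
# Illustrative sketch: assemble a kubectl drain command line from the
# per-node choices discussed above. The flag names are real kubectl drain
# options; the helper function itself is hypothetical.
build_drain_cmd() {
  node="$1"; keep_daemonsets="$2"; allow_emptydir="$3"; force_unmanaged="$4"
  cmd="kubectl drain $node"
  # DaemonSet pods cannot be evicted; skip them instead of failing.
  [ "$keep_daemonsets" = "yes" ] && cmd="$cmd --ignore-daemonsets"
  # Accept loss of emptyDir data on eviction.
  [ "$allow_emptydir" = "yes" ] && cmd="$cmd --delete-local-data"
  # Delete pods that no controller manages (they will not come back).
  [ "$force_unmanaged" = "yes" ] && cmd="$cmd --force"
  printf '%s\n' "$cmd"
}
```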
③ If kubectl drain completes without errors, you can restart the virtual machine or remove pods for maintenance purposes. After finishing the maintenance work, run the following command to make the worker node schedulable again. You do not need to run this command if you intend to terminate the worker node.
$ kubectl --kubeconfig=$KUBE_CONFIG uncordon $NODENAME
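Putting steps ① through ③ together, a maintenance run can be sketched as a single script. This is only an outline under assumptions: do_maintenance is a hypothetical placeholder for the actual server work (restart, kernel update, etc.), and $KUBE_CONFIG is set as in the commands above.

```shell
# Sketch of the full maintenance sequence: drain, do the work, uncordon.
node_maintenance() {
  node="$1"
  # 1. Evict pods and mark the node unschedulable (flags as in the example).
  kubectl --kubeconfig="$KUBE_CONFIG" drain "$node" \
    --delete-local-data --ignore-daemonsets || return 1
  # 2. Perform the maintenance work while the node carries no workloads.
  #    do_maintenance is a hypothetical placeholder for the real steps.
  do_maintenance "$node" || return 1
  # 3. Let the scheduler place pods on the node again.
  kubectl --kubeconfig="$KUBE_CONFIG" uncordon "$node"
}
```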