Available in VPC
Kubernetes provides two features for cluster autoscaling and resource management: Cluster Autoscaler and Vertical Pod Autoscaler. Cluster Autoscaler automatically adds or removes worker nodes in a cluster, enabling more flexible and efficient management of cluster resources.
This guide walks you through the features and usage of Cluster Autoscaler in Ncloud Kubernetes Service.
Cluster Autoscaler
Cluster Autoscaler automatically adjusts the size of a Kubernetes cluster to optimize resource usage. When a pod cannot be scheduled due to a lack of resources, the cluster is expanded. If some of the nodes in the cluster consistently show low resource usage, those nodes are removed to prevent resource waste. This process is performed automatically based on the requirements of the pods and the current resource status of the cluster.
To operate Cluster Autoscaler, pod resource requests and limits should be defined in advance. This information is essential for Cluster Autoscaler to analyze the cluster's resource usage and adjust its size as needed. Resource requests specify the minimum resources necessary for a pod to be scheduled, and resource limits restrict the maximum resources available to the pod.
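For reference, the following is a minimal sketch of a container spec that declares resource requests and limits; the pod name and values are illustrative only:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo              # illustrative name
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    resources:
      requests:        # minimum resources reserved for the container when it is scheduled
        cpu: 200m
        memory: 256Mi
      limits:          # maximum resources the container is allowed to use
        cpu: 500m
        memory: 512Mi
```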
Operation of the Cluster Autoscaler
Cluster Autoscaler analyzes the resource usage rate of a cluster and adjusts the size of the cluster as necessary. This process is automatically performed based on the requirements from the pods and the current resource status of the cluster.
- Scale-up conditions
  - If a newly created pod remains in the Pending state for longer than a certain period, and neither the maximum number of nodes allowed for the cluster nor the maxSize limit of the node pool has been reached, nodes are added.
  - Cluster Autoscaler determines whether to add nodes based on the resource requests (Requests) of the pending pod.
- Scale-down conditions
  - If some of the nodes in the cluster consistently show low resource usage, those nodes are removed.
  - A node is considered underutilized when the sum of the CPU/memory requests of its pods is less than the scale-down-utilization-threshold (default: 0.5) relative to the node's allocatable resources.
  - If the node remains unneeded for the duration of --scale-down-unneeded-time (default: 10 minutes), it is registered as a scale-down candidate.
  - Cluster Autoscaler then verifies that the pods on the node can be safely moved to another node before removing it.
- However, nodes are excluded from scale-down in the following cases:
  - When the node runs pods that are not controlled by a controller such as a Deployment or StatefulSet
  - When a pod on the node uses local storage
  - When a pod cannot be moved to another node
  - When a pod has the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict" set to "false" (see the example below)
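For example, the following sketch (the pod name and image are illustrative) shows the annotation that prevents Cluster Autoscaler from evicting a pod, which in turn keeps its node from being scaled down:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-evict-demo              # illustrative name
  annotations:
    # Tells Cluster Autoscaler not to evict this pod; the node it runs on
    # is therefore excluded from scale-down.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
```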
For more information on Cluster Autoscaler, see Cluster Autoscaler FAQs.
Enabling Cluster Autoscaler
For Ncloud Kubernetes Service clusters, Cluster Autoscaler is disabled by default. You can easily enable this function using the node pool settings on your console to automatically adjust the cluster size and optimize resource usage.
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Containers > Ncloud Kubernetes Service > Clusters.
- Select the node pool for which you want to enable Cluster Autoscaler.
- Click [Edit] at the top.
- Click [Settings] and enter the minimum and maximum number of nodes.
- The maximum number of nodes configured for Cluster Autoscaler counts toward the total number of worker nodes in the cluster, so manage the cluster with this in mind.
- When the feature is first configured, the number of nodes is not automatically increased even if the current number of nodes is smaller than the configured minimum. If necessary, adjust the number of nodes manually before completing the setup.
- Click [Edit] to enable Cluster Autoscaler.
- Use kubectl to check the operation of Cluster Autoscaler.
```
$ kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml
...
data:
  status: |
    Cluster-autoscaler status at 2024-02-20 07:04:01.141727455 +0000 UTC:
    Cluster-wide:
      Health:    Healthy (ready=5 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=5 longUnregistered=0)
        LastProbeTime:      2024-02-20 07:04:01.121214914 +0000 UTC m=+2227353.262936722
        LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
      ScaleUp:   NoActivity (ready=5 registered=5)
        LastProbeTime:      2024-02-20 07:04:01.121214914 +0000 UTC m=+2227353.262936722
        LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
      ScaleDown: NoCandidates (candidates=0)
        LastProbeTime:      2024-02-20 07:04:01.121214914 +0000 UTC m=+2227353.262936722
        LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
    NodeGroups:
      Name:      node
      Health:    Healthy (ready=2 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=2 longUnregistered=0 cloudProviderTarget=2 (minSize=1, maxSize=8))
        LastProbeTime:      2024-02-20 11:25:31.98194832 +0000 UTC m=+2243044.123670024
        LastTransitionTime: 2024-02-20 11:13:49.540592255 +0000 UTC m=+2242341.682313956
      ScaleUp:   NoActivity (ready=2 cloudProviderTarget=2)
        LastProbeTime:      2024-02-20 11:25:31.98194832 +0000 UTC m=+2243044.123670024
        LastTransitionTime: 2024-02-20 11:25:11.920272102 +0000 UTC m=+2243024.061993805
      ScaleDown: NoCandidates (candidates=0)
        LastProbeTime:      2024-02-20 11:25:31.98194832 +0000 UTC m=+2243044.123670024
        LastTransitionTime: 2024-02-20 11:25:11.920272102 +0000 UTC m=+2243024.061993805
...
```
Consider the following when using Cluster Autoscaler.
- Starting or stopping the Cluster Autoscaler feature takes approximately 1 to 5 minutes.
- The number of nodes cannot be changed manually while Cluster Autoscaler is in use. To change the number of nodes manually, set the feature to Not set.
- Cluster Autoscaler only applies to node pools within a cluster that have the feature enabled.
Cluster Autoscaler examples
Using Cluster Autoscaler and Horizontal Pod Autoscaler, you can conduct a load test in your Kubernetes cluster. The test allows you to check the dynamic adjustment of the number of pods and nodes in your cluster.
You can find more information on this example in the Horizontal Pod Autoscaler Walkthrough.
- Get the cluster ready for Cluster Autoscaler testing.
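One simple way to prepare (not required by the guide, but useful for comparison) is to record the current number of worker nodes before generating any load:

```
# Record the initial node count so you can compare it after scale-up and scale-down.
$ kubectl get nodes
```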
- Check if the Cluster Autoscaler and Metrics Server are operating normally.
- Check the Cluster Autoscaler
```
$ kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml
...
data:
  status: |
    Cluster-autoscaler status at 2024-02-20 07:04:01.141727455 +0000 UTC:
    Cluster-wide:
      Health:    Healthy (ready=5 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=5 longUnregistered=0)
        LastProbeTime:      2024-02-20 07:04:01.121214914 +0000 UTC m=+2227353.262936722
        LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
      ScaleUp:   NoActivity (ready=5 registered=5)
        LastProbeTime:      2024-02-20 07:04:01.121214914 +0000 UTC m=+2227353.262936722
        LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
      ScaleDown: NoCandidates (candidates=0)
        LastProbeTime:      2024-02-20 07:04:01.121214914 +0000 UTC m=+2227353.262936722
        LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
    NodeGroups:
      Name:      node
      Health:    Healthy (ready=2 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=2 longUnregistered=0 cloudProviderTarget=2 (minSize=1, maxSize=8))
        LastProbeTime:      2024-02-20 11:25:31.98194832 +0000 UTC m=+2243044.123670024
        LastTransitionTime: 2024-02-20 11:13:49.540592255 +0000 UTC m=+2242341.682313956
      ScaleUp:   NoActivity (ready=2 cloudProviderTarget=2)
        LastProbeTime:      2024-02-20 11:25:31.98194832 +0000 UTC m=+2243044.123670024
        LastTransitionTime: 2024-02-20 11:25:11.920272102 +0000 UTC m=+2243024.061993805
      ScaleDown: NoCandidates (candidates=0)
        LastProbeTime:      2024-02-20 11:25:31.98194832 +0000 UTC m=+2243044.123670024
        LastTransitionTime: 2024-02-20 11:25:11.920272102 +0000 UTC m=+2243024.061993805
...
```
- Check the Metrics Server
```
$ kubectl top pods -n kube-system
NAME                               CPU(cores)   MEMORY(bytes)
cilium-mct9p                       7m           96Mi
cilium-operator-846758784c-t2hpj   2m           21Mi
cilium-qqwdl                       6m           88Mi
cilium-sspl4                       6m           88Mi
...
```
- Deploy the php-apache deployment in the cluster.
```
$ kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
```
- Create and check the Horizontal Pod Autoscaler (HPA).
- Create HPA
```
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
```
- Check HPA
```
$ kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          144m
```
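For reference, the kubectl autoscale command above is equivalent to applying an autoscaling/v2 HorizontalPodAutoscaler manifest along these lines:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1                   # same as --min=1
  maxReplicas: 10                  # same as --max=10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50     # same as --cpu-percent=50
```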
- Generate a load to check an increase in the number of pods.
- Increase the load in a separate terminal
```
$ kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
```
- Observe HPA and confirm an increase in the number of replicas
```
$ kubectl get hpa
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   106%/50%   1         10        7          154m
```
- Check the Pending status of newly created pods and check the operation of the Cluster Autoscaler.
- Check the Pending status of pods
```
$ kubectl get pods -o wide | grep php-apache
...
php-apache-78c9f8cbf6-4n794   0/1   Pending   0   20s
php-apache-78c9f8cbf6-c8xdc   0/1   Pending   0   24s
php-apache-78c9f8cbf6-t75w2   0/1   Pending   0   24s
php-apache-78c9f8cbf6-wc98s   1/1   Running   0   24s
...
```
- Check the operation of the Cluster Autoscaler
```
$ kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml
...
Cluster-wide:
  Health:    Healthy (ready=3 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=3 longUnregistered=0)
    LastProbeTime:      2024-02-20 11:15:50.82048015 +0000 UTC m=+2242462.962201862
    LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
  ScaleUp:   InProgress (ready=3 registered=3)
    LastProbeTime:      2024-02-20 11:15:50.82048015 +0000 UTC m=+2242462.962201862
    LastTransitionTime: 2024-02-20 11:14:29.611869206 +0000 UTC m=+2242381.753590773
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2024-02-20 11:15:50.82048015 +0000 UTC m=+2242462.962201862
    LastTransitionTime: 2024-01-25 12:22:06.170599081 +0000 UTC m=+38.312320847
NodeGroups:
  Name:      node
  Health:    Healthy (ready=1 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=1 longUnregistered=0 cloudProviderTarget=2 (minSize=1, maxSize=8))
    LastProbeTime:      2024-02-20 11:15:50.82048015 +0000 UTC m=+2242462.962201862
    LastTransitionTime: 2024-02-20 11:13:49.540592255 +0000 UTC m=+2242341.682313956
  ScaleUp:   InProgress (ready=1 cloudProviderTarget=2)
    LastProbeTime:      2024-02-20 11:15:50.82048015 +0000 UTC m=+2242462.962201862
    LastTransitionTime: 2024-02-20 11:14:29.611869206 +0000 UTC m=+2242381.753590773
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2024-02-20 11:15:50.82048015 +0000 UTC m=+2242462.962201862
    LastTransitionTime: 2024-02-20 11:13:49.540592255 +0000 UTC m=+2242341.682313956
...
```
- Check the expansion of worker nodes and the scheduling of pods.
- View worker node expansion
```
$ kubectl get nodes
NAME     STATUS   ROLES    AGE    VERSION
node-1   Ready    <none>   13m    v1.27.9
node-2   Ready    <none>   2m8s   v1.27.9
node-3   Ready    <none>   88d    v1.27.9
node-4   Ready    <none>   15d    v1.27.9
```
- Check the Running status of pods
```
$ kubectl get pods | grep php-apache
```
- Once the load is stopped, the number of pods decreases. Cluster Autoscaler then observes the resource usage of the worker nodes and removes nodes as necessary to shrink the cluster.
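One way to observe the scale-down (assuming the load generator from the earlier step has been stopped, for example with Ctrl+C in its terminal) is to watch the HPA and the worker nodes until the replica and node counts decrease:

```
# Watch the HPA: CPU utilization and the replica count drop back toward the minimum.
$ kubectl get hpa php-apache --watch

# Watch the worker nodes: after the scale-down delay, unneeded nodes are removed.
$ kubectl get nodes --watch
```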
- Because the result may vary depending on the specifications of the nodes selected when the cluster was configured, factor in those specifications before the test.
- Even after the load test is stopped, certain conditions may prevent immediate node removal. For example, node removal may be limited if an important system pod is running on the node or if a pod is using local storage.
- Because cluster expansion and reduction during a load test may differ from behavior under real workloads and settings, review the test results carefully before applying them to production.
Troubleshooting
For issues you run into while using Cluster Autoscaler, see Ncloud Kubernetes Service troubleshooting - scalability and performance problems.