This is one of the posts I have written for Loft Labs, Inc.
One of the most powerful features of Kubernetes is autoscaling, as it’s vital to find the correct balance when scaling resources in our infrastructure. Scale up more than needed, and you pay for unused resources. Scale down more than required, and your application’s performance suffers.
Kubernetes brings three types of auto-scaling to the table:
- Cluster Autoscaler
- Horizontal Pod Autoscaler
- Vertical Pod Autoscaler
The Cluster Autoscaler scales the number of nodes up or down based on pods’ CPU and memory requests. If a pod cannot be scheduled because its resource requests cannot be satisfied, a node is created to accommodate it. On the other hand, if nodes are underutilized and their workloads can run elsewhere, they can be terminated.
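The requests the Cluster Autoscaler reacts to are the ones declared on a pod’s containers. A minimal sketch (the pod name and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: "500m"     # the scheduler uses these requests; if no
          memory: "256Mi" # existing node can satisfy them, the Cluster
                          # Autoscaler provisions a new node
```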
The Horizontal Pod Autoscaler scales the number of pods of an application based on resource metrics such as CPU or memory usage, or on custom metrics. It can target replication controllers, deployments, replica sets, or stateful sets. Custom and external metrics are also supported, so other autoscalers within the cluster can consume them as well.
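A typical HPA definition using the `autoscaling/v2` API might look like this sketch; the names are hypothetical, and it assumes a Deployment called `demo-app` already exists:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app-hpa     # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app       # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU exceeds 70%
```

Note that resource-based targets like this require the metrics server to be running in the cluster.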
The Vertical Pod Autoscaler is responsible for adjusting the CPU and memory requests and limits of a pod’s containers to match their actual usage.
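Unlike the HPA, the VPA is a separate add-on that must be installed in the cluster. Assuming it is, a minimal sketch (names are hypothetical) looks like:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: demo-app-vpa     # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app       # hypothetical Deployment whose requests to tune
  updatePolicy:
    updateMode: "Auto"   # evict and recreate pods with updated requests
```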