Auto scaling
Automated scaling of the computational resources DiSSCo uses is an essential part of the architecture. Within DiSSCo we expect fluctuations in the usage of the system: during weekdays, when most users are active and requesting Machine Annotation Services, utilization will be high, while on weekends it may be significantly lower. By creating a flexible computational platform, DiSSCo's resource consumption can easily scale up and down based on demand. This has several advantages:
- Less waiting time for users, as we can scale up under high demand
- No overconsumption of resources, leading to lower energy consumption
- No overconsumption of resources, lowering the overall cost of the infrastructure
- Regularly restarting servers improves the security and efficiency of the machines
The automated scaling of the cluster consists of two components:
- Application scaling
- Infrastructure scaling
For scaling the application we use a tool called Kubernetes Event-Driven Autoscaling (KEDA). KEDA uses so-called "Scalers" to check whether an application needs to run and how many instances are required. Within DiSSCo we use it in combination with Kafka, an event streaming platform. In most instances, KEDA checks a Kafka topic to see if there are any messages. If there are none, it scales the consuming application to 0 instances. When an event is published to the queue, KEDA recognizes that the message needs to be processed and scales the application to 1 instance. If there are many events on the queue, KEDA can scale the application further to any number of instances. When the events have been processed and the Kafka topic has no more messages, KEDA will, after a couple of minutes, scale the application back to 0. This simple but effective mechanism enables DiSSCo to scale based on demand: the system can process large amounts of events by automatically scaling up, and scales back down to conserve resources.
KEDA is deployed using ArgoCD with a Helm chart. An example of a KEDA Scaler can be found in the digital-specimen-processor.
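As an illustration, the sketch below shows what a KEDA ScaledObject for a Kafka-driven consumer could look like. The deployment name, topic, consumer group, broker address, and thresholds are hypothetical placeholders, not the actual DiSSCo configuration.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: digital-specimen-processor-scaler   # hypothetical name
spec:
  scaleTargetRef:
    name: digital-specimen-processor        # the Deployment to scale
  minReplicaCount: 0                        # scale to zero when the topic is empty
  maxReplicaCount: 5                        # upper bound under heavy load
  cooldownPeriod: 300                       # seconds to wait before scaling back to 0
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.kafka.svc.cluster.local:9092  # placeholder address
        consumerGroup: digital-specimen-processor-group       # placeholder group
        topic: digital-specimen                               # placeholder topic
        lagThreshold: "5"     # unprocessed messages per replica before scaling out
```

With minReplicaCount set to 0, KEDA removes all replicas once the topic lag drops to zero and the cooldown period has passed.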
Kube-green is a Kubernetes add-on that automatically shuts down resources according to a set schedule. It is intended to reduce wasted compute power and carbon emissions. In DiSSCo, we use kube-green to automatically scale down most of the test environment outside working hours (8:00-18:00, Monday-Friday).
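A minimal sketch of a kube-green SleepInfo resource implementing such a schedule is shown below; the name, namespace, and time zone are assumptions for illustration.

```yaml
apiVersion: kube-green.com/v1alpha1
kind: SleepInfo
metadata:
  name: working-hours          # hypothetical name
  namespace: test              # hypothetical test-environment namespace
spec:
  weekdays: "1-5"              # Monday to Friday
  sleepAt: "18:00"             # scale down after working hours
  wakeUpAt: "08:00"            # scale back up in the morning
  timeZone: "Europe/Amsterdam" # assumed time zone
```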
Besides scaling the application, we also need to automatically scale the underlying infrastructure: the virtual machines supporting our cluster. Applications can scale down to 0, but if the machines keep running we won't reap the benefits. To automatically scale the infrastructure, we use Karpenter. Karpenter continuously checks whether there are applications that don't fit the cluster. It does this based on the resource requests of the applications. Each application should have a resources section in its Kubernetes yaml, in which we define the expected resource usage of the application (see the sketch below). Based on this expected usage, Kubernetes checks if it can deploy the application on one of the existing nodes. If this is not possible, Karpenter will recognize this and add additional resources to the cluster. When the cluster is underutilized, meaning there is more compute than needed, Karpenter will also recognize this and scale down. Karpenter selects the EC2 instance types that best fit the applications deployed.
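As a minimal sketch, a container's resources section could look like the following; the values are illustrative and should be tuned per application.

```yaml
# Fragment of a Deployment's container spec (illustrative values)
resources:
  requests:
    cpu: 250m        # expected CPU usage; Karpenter provisions capacity based on this
    memory: 512Mi    # expected memory usage
  limits:
    memory: 512Mi    # hard cap to protect co-located workloads
```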
It is important to note that using an auto-scaling cluster does mean applications will restart often. To avoid downtime during these restarts, it is vital that liveness and readiness endpoints are used: traffic should only be routed to a newly started application when the application is ready.
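A minimal sketch of such probes in a container spec is shown below, assuming the application exposes health endpoints on port 8080; the paths and timings are placeholders.

```yaml
# Fragment of a container spec (illustrative paths and timings)
livenessProbe:
  httpGet:
    path: /health/liveness    # restart the container if this fails
    port: 8080
  initialDelaySeconds: 10
readinessProbe:
  httpGet:
    path: /health/readiness   # only route traffic once this succeeds
    port: 8080
  periodSeconds: 5
```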
Besides the nodes managed by Karpenter, we also have two nodes in the cluster that belong to a managed node group. These are added on startup and are stable. The two nodes in the managed group can be used to deploy applications that are vital to the functioning of the cluster, such as the DNS or the Karpenter application itself.
Karpenter is deployed by ArgoCD using a Helm chart. A NodePool object is then added to configure Karpenter; a simplified sketch of what such a configuration can look like is shown below.
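The following NodePool is a sketch rather than the actual DiSSCo configuration; the instance requirements, limits, and consolidation settings are assumptions for illustration.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default                             # hypothetical pool name
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]             # assumed capacity type
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                       # hypothetical EC2NodeClass
  limits:
    cpu: "64"                               # assumed cap on total provisioned CPU
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # remove nodes with spare capacity
    consolidateAfter: 1m
```

The consolidation settings allow Karpenter to remove or replace nodes when the cluster has more compute than the deployed applications request.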