Stability in Kubernetes Node Pools: The Challenge with CronJobs
CronJobs in Kubernetes can destabilize node pools. We applied several optimization strategies, but some issues remain open. How do you deal with this?
11 May 2023
In a Kubernetes environment, the container model means we can no longer use crontab for recurring processes. Instead, Kubernetes provides a dedicated CronJob resource for this purpose, which works well as long as only a few CronJobs want to start at the same time. We, however, have CronJobs in the double-digit range that all want to start at the same moment, every minute. This can make nodes unstable. So how can we remedy this?
Our SaaS product runs in a Kubernetes cluster. We have a separate Deployment for each of our customers (which we call Workspaces), and each of these Workspaces has three CronJobs:
One at night
One at noon
One every minute
In total, this results in more than 100 Deployments in one node pool. Our PROD node pool has three nodes. In principle, this should not be a problem, as our Deployments do not consume many resources, neither CPU nor RAM. However, the per-minute CronJobs create churn on the nodes: when a double-digit number of CronJobs starts at the same time to do its work, it can disrupt a node and affect the other containers running on it. At first glance, the minute CronJob looks harmless:
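A minimal sketch of what such a per-minute CronJob manifest looks like. The name, image, command, and resource values here are illustrative, not our actual configuration:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: workspace-minute-cron   # illustrative name
spec:
  schedule: "* * * * *"         # every minute
  concurrencyPolicy: Forbid     # skip a run if the previous one is still going
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: cron
              image: example/workspace:latest              # illustrative image
              command: ["php", "bin/console", "app:cron"]  # illustrative Symfony command
              resources:
                requests:
                  cpu: 50m      # illustrative: the jobs request very little
                  memory: 64Mi
```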
The CronJobs do not reserve many resources, and they only start two Symfony commands, which usually do nothing at all. Nevertheless, we receive many error messages that CronJobs could not be started, or that Deployments needed multiple attempts to start. We also see gaps in the nodes' load metrics: the affected nodes become unreachable to the cluster and can no longer be monitored. So what do we do?
Optimization 1: Custom Node Pool for CronJobs
So we gave the CronJobs their own node pool, separating them from all other applications. We did this using nodeSelectors, which ensure that pods are only scheduled onto nodes where they are allowed to run.
A nodeSelector matches against node labels. So we set the label "cronjob: allowed" on a new node pool, and gave the CronJobs the following nodeSelector:
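The relevant part of the CronJob spec looks roughly like this (a sketch; only the nodeSelector matters, the surrounding structure is abbreviated):

```yaml
spec:
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            cronjob: allowed   # matches the label set on the cronjob node pool
```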
The CronJobs are now only scheduled onto nodes that carry the label "cronjob: allowed", and nothing else runs there. Why? Because we gave all of our applications nodeSelectors, forcing each onto its assigned node pool: one pool for the CronJobs, one for the customer Workspaces, and one for the applications that need things like IAM.
This helped initially and eased our conscience. But it quickly became apparent that it didn't really solve anything: the nodes in the CronJob pool still had the same problems and remained unstable. The only thing that helped was to clean up these nodes, delete them, and let the node pool automatically create new ones.
To avoid this manual work, we had to think of something else. The conclusion was obvious: we have to do away with these CronJobs!
Optimization 2: No More CronJobs!
Having no CronJobs at all would be nice. Unfortunately, we haven't found a solution that abolishes them: I would have liked to tell you that CronJob "x" is no longer needed, but we still need it. Instead, we found a way to take the CronJobs out of the cluster's hands, and with them the instability. We have a central application that knows all Workspaces; this central management has existed from the beginning. Now this management notifies every Workspace that a cron task is to be executed: each Workspace has an API, which then starts the task in its own shell. This puts a bit more load on the Workspace, but usually nothing happens, because there is nothing to do. We achieve the same effect as with separate CronJob resources, while sparing the cluster the constant churn of starting CronJob pods.
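A minimal sketch of the dispatcher idea, not our actual code; the endpoint path and function names are made up for illustration:

```python
# Central dispatcher sketch: one scheduler knows all Workspaces and
# triggers their cron endpoint once a minute, instead of letting
# Kubernetes spawn a CronJob pod per Workspace.

def run_minute_cron(workspaces, trigger):
    """Trigger the minute cron on every Workspace sequentially.

    `workspaces` is a list of base URLs; `trigger` is a callable that
    performs the HTTP request (injected so it can be swapped or tested).
    Returns the list of Workspaces whose trigger call failed.
    """
    failed = []
    for base_url in workspaces:
        try:
            trigger(base_url + "/api/cron/minute")  # hypothetical endpoint
        except Exception:
            # A single unreachable Workspace must not stop the round.
            failed.append(base_url)
    return failed
```

In production, `trigger` would be an HTTP POST with a short timeout; injecting it keeps the scheduling logic independent of the HTTP client.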
Even more potential for optimization
Unfortunately, this solution also has a catch: it only works as long as we have fewer than 300 customers. A request to a Workspace to start a cron task takes about 200 ms, so we can trigger at most around 300 per minute sequentially. Beyond that, triggering all of them takes longer than a minute, the schedule lags, and the "every minute" cron no longer runs every minute, but somewhat less often.
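The limit above is simple arithmetic (the 200 ms per request is the measured figure from the text; the rest follows from the 60-second interval):

```python
# Sequential triggering: each request costs ~200 ms, and a full round
# must finish within the 60 s cron interval.
REQUEST_MS = 200
INTERVAL_MS = 60_000

max_workspaces = INTERVAL_MS // REQUEST_MS
print(max_workspaces)  # 300
```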
But that can be handled too: the requests to the Workspaces can run in parallel processes. We'll write about that another time.
Are we alone with unstable nodes due to CronJobs?
How do you handle CronJobs when many applications want to start one at the same time?