Context
The node family used (`t2d-standard-1`) is configured with:
- Memory: 4 GB
- vCPU: 1
Architecture concept
Without the placeholder/balloon technique
As you can see from this animation, the problem we encounter is mainly the time it takes for new nodes (`Node 2` and `Node 3`) to arrive. In short, when a node is fully occupied (`Node 1`), new nodes are created (`Node 2` and `Node 3`), accepting new pods only once their scaling requires it (`Pod 5` and `Pod 6`).
With the placeholder/balloon technique
In our case, we use placeholders, or balloons, to prepare for the arrival of new pods. In this way, we ensure that we can accommodate them. The placeholder pods guarantee that there is sufficient capacity available for the load of the other pods, and work pods take precedence over placeholder pods. To do this, we use two Kubernetes `PriorityClass` resources.
Implementation
Creation of priority class resources
To implement this solution, we need to create two [PriorityClass](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/) resources.
The first will control placeholder/balloon pods.
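A manifest for this class could look like the following sketch, with `value` set to `-10`, `preemptionPolicy` set to `Never`, and `globalDefault` disabled (the `description` text is illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: placeholder-priority
# Negative value so placeholder pods rank below every regular workload.
value: -10
# Placeholder pods must never evict other pods themselves.
preemptionPolicy: Never
# This class must be requested explicitly, never applied by default.
globalDefault: false
description: "Priority class for placeholder/balloon pods."
```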
Read more about the `value` field set to `-10` in the Google Kubernetes Engine (GKE) Autopilot context. Read more about the `preemptionPolicy` field set to `Never`. In our case, the `globalDefault` field is disabled.
The second will be used for classic default global workload pods.
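A sketch of this second class; the exact `value` is an assumption here (any value above the placeholder's `-10` works), as is enabling `globalDefault` so regular pods pick it up by default:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: default-priority
# Higher than the placeholder class (-10), so work pods win.
value: 0
# Work pods may evict lower-priority (placeholder) pods.
preemptionPolicy: PreemptLowerPriority
# Assumption: applied by default to pods without a priorityClassName.
globalDefault: true
description: "Default priority class for workload pods."
```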
Example of workload
When you want to deploy your workloads, you can do it this way. As you can see, we explicitly pass the `default-priority` value in the `priorityClassName` field. In this way, these workloads take precedence over placeholders/balloons and will replace them when needed.
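A minimal example of such a workload; the Deployment name, labels, image, and resource sizing are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      # Explicitly reference the workload priority class.
      priorityClassName: default-priority
      containers:
        - name: my-app
          image: nginx:1.25
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 250m
              memory: 256Mi
```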
Read more about the `preemptionPolicy` field set to `PreemptLowerPriority`.
Creation of placeholder/balloon deployment resources
Here’s how we’ll configure the deployment to have 3 placeholders. The most important part here is the `priorityClassName` field, configured with the `placeholder-priority` value from our `PriorityClass` resource seen earlier.
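A sketch of such a deployment, assuming the `pause` container (a common choice for balloon pods, since it does nothing but hold its reservation) and resource requests sized against the `t2d-standard-1` nodes described above; the exact sizing is an assumption you should tune to your workloads:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: placeholder
spec:
  replicas: 3
  selector:
    matchLabels:
      app: placeholder
  template:
    metadata:
      labels:
        app: placeholder
    spec:
      # Negative-priority class: these pods are evicted first.
      priorityClassName: placeholder-priority
      # Let work pods take the slot immediately on preemption.
      terminationGracePeriodSeconds: 0
      containers:
        - name: placeholder
          # The pause image only sleeps; it exists to reserve capacity.
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: 700m
              memory: 2Gi
            limits:
              cpu: 700m
              memory: 2Gi
```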
Note the importance of always defining the requests and limits of your resources.
Conclusion
As you can see, this system is very simple to set up.
The advantage is that you can use this system in managed solutions such as Google Kubernetes Engine (GKE) in Autopilot operating mode, where you don’t manage the nodes yourself.