Conference paper

CPU-Limits kill Performance: Time to rethink Resource Control

Abstract

Management of compute resources for cloud-native microservices relies heavily on autoscalers to deal with time-varying workloads. Autoscalers are typically built atop the fundamental mechanism of adjusting CPU limits, which restrict the amount of CPU resources a service is allowed to use. The autoscalers then innovate on scaling policies within the constraints of such limits (e.g., by setting or changing limits). We show that such usage of CPU limits causes resource wastage and Service Level Objective (SLO) violations in autoscalers, and complicates their design. In fact, we find that many administrators disable limits altogether, thus defeating the original purpose of limits and leaving systems vulnerable. Our findings open up new opportunities to design an entirely new class of limitless autoscalers, which set resource allocations in flexible ways, without using limits, in order to automatically satisfy latency SLOs in multi-tenant clusters.