Abstract
Microservice applications are commonly co-located with other services to improve resource utilization, but this practice also causes notable resource contention. Existing studies primarily focus on scaling the critical microservices responsible for performance degradation, in order to mitigate violations of service-level agreements (SLAs) on end-to-end latency in highly interfered environments. They often overlook the potential of scaling non-critical microservices to improve resource efficiency.
In this paper, we introduce Grad, an intelligent microservice scaling framework that harnesses resource fungibility between critical and non-critical microservices. To address the challenges posed by the dynamic nature of resource fungibility during scaling, Grad incorporates three key components. First, Grad employs a modular learning approach to profile individual microservice latency in relation to environmental conditions. Second, using gradients extracted from this profile, Grad designs a scalable optimization module that dynamically selects the optimal set of microservices for scaling. Third, to rapidly mitigate SLA violations, Grad deploys an accurate end-to-end latency predictor that serves as a simulator to obtain real-time feedback. We evaluate Grad in our cluster using real microservice benchmarks and production traces, demonstrating that it reduces resource usage by 49.1% and lowers the probability of SLA violations by 3.7× compared to state-of-the-art solutions.
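To make the idea of gradient-guided selection concrete, the following is a minimal sketch and not Grad's actual algorithm. It assumes a sequential call chain whose end-to-end latency is the sum of per-service latencies, and it assumes each service's latency is a known function of its CPU allocation. The names `LatencyModel`, `latency_gradient`, and `greedy_scale`, the toy latency curves, and all numeric values are hypothetical. The sketch repeatedly gives a small amount of CPU to whichever service shows the steepest latency reduction, which need not be the service with the highest latency.

```python
from typing import Callable, Dict

# Hypothetical per-service latency profile: CPU cores -> latency in ms.
LatencyModel = Callable[[float], float]


def latency_gradient(model: LatencyModel, cpu: float, eps: float = 1e-3) -> float:
    """Central finite-difference estimate of d(latency)/d(cpu)."""
    return (model(cpu + eps) - model(cpu - eps)) / (2 * eps)


def greedy_scale(
    models: Dict[str, LatencyModel],
    alloc: Dict[str, float],
    sla_ms: float,
    step: float = 0.1,
    max_steps: int = 1000,
) -> Dict[str, float]:
    """Add CPU in small steps to the service with the most negative gradient
    until the summed latency (assumed sequential chain) meets the SLA."""
    alloc = dict(alloc)  # do not mutate the caller's allocation
    for _ in range(max_steps):
        total = sum(models[name](alloc[name]) for name in models)
        if total <= sla_ms:
            break
        # The most negative gradient yields the largest latency drop per core.
        best = min(models, key=lambda n: latency_gradient(models[n], alloc[n]))
        alloc[best] += step
    return alloc


# Toy profiles (assumed): "db" has the highest latency but barely responds
# to extra CPU, while "frontend" is lower latency but highly sensitive.
models = {
    "db": lambda c: 30.0 + 2.0 / c,
    "frontend": lambda c: 5.0 + 20.0 / c,
}
result = greedy_scale(models, {"db": 1.0, "frontend": 1.0}, sla_ms=46.0)
```

In this toy setup, the extra CPU goes to `frontend` even though `db` contributes more latency, because `frontend` has the steeper gradient. A real system would replace the closed-form curves with learned profiles and the summed latency with an end-to-end predictor.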
Materials
PDF available here
Cite this work
@INPROCEEDINGS{10946765,
  author={Chen, Liao and Lin, Chenyu and Luo, Shutian and Xu, Huanle and Xu, Chengzhong},
  booktitle={2025 IEEE International Symposium on High Performance Computer Architecture (HPCA)},
  title={Grad: Intelligent Microservice Scaling by Harnessing Resource Fungibility},
  year={2025},
  volume={},
  number={},
  pages={474-486},
  keywords={Degradation;Accuracy;Microservice architectures;Production;Computer architecture;Benchmark testing;Dynamic scheduling;Real-time systems;Resource management;Optimization;resource fungibility;microservice scaling},
  doi={10.1109/HPCA61900.2025.00044}
}