Caspian: A Carbon-aware Workload Scheduler in Multi-Cluster Kubernetes Environments. Tayebeh Bahreini, Asser Tantawi, et al. 2024. MASCOTS 2024.
Optimizing GPU Multiplexing for Efficient and Cost-Effective Access to Diverse Large Language Models in GPU Clusters. Yue Zhu, Chen Wang, et al. 2024. MASCOTS 2024.
Best-Effort Power Model Serving for Energy Quantification of Cloud Instances. Sunyanan Choochotkaew, Tatsuhiro Chiba, et al. 2024. MASCOTS 2024.