DEFT: SLO-Driven Preemptive Scheduling for Containerized DNN Serving. Yitian Hao, Wenqing Wu, et al. NSDI 2023.
Towards Optimal Preemptive GPU Time-Sharing for Edge Model Serving. Zhengxu Xia, Yitian Hao, et al. MIDDLEWARE 2023.