Publication
CLOUD 2024
Conference paper

S2TAR-Cloud: Shared Secure Trusted Accelerators with Reconfiguration for Machine Learning in the Cloud

Abstract

The demand for hardware accelerators such as Tensor Processing Units (TPUs) and Graphics Processing Units (GPUs) is rapidly increasing due to growing Machine Learning (ML) workloads. As with any shared computing resource, there is a growing need to dynamically adjust and scale accelerator services while ensuring data privacy and confidentiality, especially in cloud environments. We propose a secure and reconfigurable TPU design with confidential computing support, achieved through a Trusted Execution Environment (TEE) framework tailored for reconfigurable TPUs in a multi-tenant cloud. Our contributions include a novel TPU design based on switchbox-enabled systolic arrays to support rapid dynamic partitioning. We evaluate our TPU design with TEEs in shared environments, achieving up to 42.1% higher performance for realistic ML inference workloads. Our remote attestation protocol extends to sub-device partitions, providing trustworthiness at a fine-grained level, and decouples host and accelerator TEEs into separate attestation reports without degrading security guarantees. Our work presents a new TEE framework for secure and reconfigurable ML accelerators in a multi-tenant cloud environment.
