Chen-chia Chang, Wan-hsuan Lin, et al.
ICML 2025
In this paper, we present several algorithms for performing all-to-many personalized communication on distributed-memory parallel machines. We assume that each processor sends a different message (of potentially different size) to a subset of all the processors involved in the collective communication. The algorithms are based on decomposing the communication matrix into a set of partial permutations. We study the effectiveness of our algorithms from the standpoint of both static scheduling and runtime scheduling. © 1995 Academic Press, Inc.
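To make the idea of decomposing a communication matrix into partial permutations concrete, here is a minimal Python sketch. The abstract does not spell out the paper's algorithms, so this is not the authors' method: the greedy largest-message-first heuristic, the function name `decompose_into_partial_permutations`, and the example matrix `comm` are all illustrative assumptions. Each phase is a partial permutation, meaning every processor sends at most one message and receives at most one message in that phase.

```python
def decompose_into_partial_permutations(comm):
    """Greedily split a communication matrix into partial permutations.

    comm[i][j] > 0 means processor i sends a message of that size to
    processor j. Each returned phase maps sender -> receiver, with every
    processor sending at most once and receiving at most once per phase.
    NOTE: hypothetical greedy heuristic, not the algorithm from the paper.
    """
    n = len(comm)
    # Pending (sender, receiver) pairs, largest messages first (an assumption).
    pending = sorted(
        ((i, j) for i in range(n) for j in range(n) if comm[i][j] > 0),
        key=lambda pair: -comm[pair[0]][pair[1]],
    )
    phases = []
    while pending:
        phase = {}              # sender -> receiver for this phase
        busy_receivers = set()  # receivers already claimed in this phase
        leftover = []           # messages deferred to a later phase
        for sender, receiver in pending:
            if sender not in phase and receiver not in busy_receivers:
                phase[sender] = receiver
                busy_receivers.add(receiver)
            else:
                leftover.append((sender, receiver))
        phases.append(phase)
        pending = leftover
    return phases


# Illustrative 4-processor matrix (made-up message sizes).
comm = [
    [0, 5, 0, 2],
    [3, 0, 0, 0],
    [0, 4, 0, 1],
    [0, 0, 6, 0],
]
phases = decompose_into_partial_permutations(comm)
for step, phase in enumerate(phases):
    print(step, sorted(phase.items()))
```

A static scheduler could compute such phases once ahead of time, while a runtime scheduler would have to build them as the communication pattern becomes known. The greedy pass keeps the sketch short but does not guarantee the minimum number of phases.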
Aditya Malik, Nalini Ratha, et al.
CAI 2024
Hannaneh Hajishirzi, Julia Hockenmaier, et al.
UAI 2011
Arnold L. Rosenberg
Journal of the ACM