ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training. Chia-Yu Chen, Jiamin Ni, et al. NeurIPS 2020.