On the Adversarial Robustness of Vision Transformers
Rulin Shao, Zhouxing Shi, et al.
NeurIPS 2022
This paper proposes a class of well-conditioned neural networks in which a unit amount of change in the inputs causes at most a unit amount of change in the outputs or in any internal layer. We develop the known methodology of controlling Lipschitz constants to its full potential for maximizing robustness, with a new regularization scheme for linear layers, new ways to adapt nonlinearities, and a new loss function. With MNIST and CIFAR-10 classifiers, we demonstrate a number of advantages. Without needing any adversarial training, the proposed classifiers exceed the state of the art in robustness against white-box L2-bounded adversarial attacks. They generalize better than ordinary networks from noisy data with partially random labels. Their outputs are quantitatively meaningful and indicate levels of confidence and generalization, among other desirable properties.
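To make the Lipschitz-control idea in the abstract concrete, here is a minimal PyTorch sketch of an approximately nonexpansive (1-Lipschitz in the L2 norm) network. It uses off-the-shelf spectral normalization and ReLU rather than the paper's own regularization scheme, adapted nonlinearities, or loss function; the `nonexpansive_mlp` helper and the dimensions are illustrative assumptions, not the authors' architecture.

```python
# A minimal, generic sketch of the Lipschitz-control idea: constrain each
# linear layer to spectral norm <= 1 (via spectral normalization) and use
# 1-Lipschitz activations, so the composed network is approximately
# nonexpansive in L2. This is NOT the paper's regularization scheme,
# nonlinearity adaptation, or loss function; it only illustrates the
# property that a unit input change causes at most a unit output change.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import spectral_norm


def nonexpansive_mlp(in_dim: int, hidden: int, out_dim: int) -> nn.Sequential:
    """Stack of (approximately) 1-Lipschitz layers; names are illustrative."""
    return nn.Sequential(
        spectral_norm(nn.Linear(in_dim, hidden)),  # ||W||_2 normalized to ~1
        nn.ReLU(),                                 # ReLU is 1-Lipschitz
        spectral_norm(nn.Linear(hidden, hidden)),
        nn.ReLU(),
        spectral_norm(nn.Linear(hidden, out_dim)),
    )


if __name__ == "__main__":
    torch.manual_seed(0)
    net = nonexpansive_mlp(784, 256, 10)

    # The spectral norm is estimated by power iteration during training-mode
    # forward passes; a short warm-up tightens the estimate before testing.
    net.train()
    with torch.no_grad():
        for _ in range(50):
            net(torch.zeros(1, 784))
    net.eval()

    x = torch.randn(8, 784)
    delta = F.normalize(torch.randn(8, 784), dim=1)  # unit-norm perturbations
    out_shift = (net(x + delta) - net(x)).norm(dim=1)
    # Each entry should be at most ~1, up to power-iteration approximation.
    print(out_shift)
```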