Yuya Jeremy Ong, Jay Pankaj Gala, et al.
IEEE CISOSE 2024
In this paper, we investigate how concept-based models (CMs) respond to out-of-distribution (OOD) inputs. CMs are interpretable neural architectures that first predict a set of high-level concepts (e.g., "stripes", "black") and then predict a task label from those concepts. In particular, we study the impact of concept interventions (i.e., operations where a human expert corrects a CM’s mispredicted concepts at test time) on CMs' task predictions when inputs are OOD. Our analysis reveals a weakness in current state-of-the-art CMs, which we term leakage poisoning, that prevents them from properly improving their accuracy when intervened on for OOD inputs. To address this, we introduce MixCEM, a new CM that learns to dynamically exploit leaked information missing from its concepts only when this information is in-distribution. Our results across tasks with and without complete sets of concept annotations demonstrate that MixCEMs outperform strong baselines, significantly improving accuracy for both in-distribution and OOD samples, with and without concept interventions.
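The following is a minimal Python sketch of the general concept-based-model setup described in this abstract: an encoder predicts concepts, a second head predicts the task label from those concepts, and a test-time concept intervention overrides selected concept predictions with expert-provided values. All class and variable names, layer sizes, and the intervention masking scheme are illustrative assumptions, not the paper's MixCEM architecture.

```python
# Sketch of a concept bottleneck model with test-time concept interventions.
# Names, dimensions, and architecture are hypothetical placeholders.
import torch
import torch.nn as nn


class ConceptBottleneckModel(nn.Module):
    def __init__(self, n_features: int, n_concepts: int, n_classes: int):
        super().__init__()
        # Input features -> predicted concepts (e.g., "stripes", "black").
        self.concept_encoder = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, n_concepts)
        )
        # Predicted concepts -> task label.
        self.label_predictor = nn.Linear(n_concepts, n_classes)

    def forward(self, x, intervention_mask=None, true_concepts=None):
        c_hat = torch.sigmoid(self.concept_encoder(x))
        if intervention_mask is not None:
            # A human expert corrects the masked concepts at test time,
            # replacing the model's predictions with ground-truth values.
            m = intervention_mask.float()
            c_hat = m * true_concepts + (1.0 - m) * c_hat
        return self.label_predictor(c_hat), c_hat


# Usage: intervene on the first two concepts of a single sample.
model = ConceptBottleneckModel(n_features=32, n_concepts=8, n_classes=5)
x = torch.randn(1, 32)
mask = torch.zeros(1, 8, dtype=torch.bool)
mask[:, :2] = True
true_c = torch.randint(0, 2, (1, 8)).float()
logits, concepts = model(x, intervention_mask=mask, true_concepts=true_c)
```

Because the label head sees only the (possibly corrected) concept vector, each intervention directly changes the task prediction; the abstract's point is that information "leaking" around the concepts can poison this mechanism for OOD inputs.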
Masaki Ono, Takayuki Katsuki, et al.
MIE 2020
Daiki Kimura, Naomi Simumba, et al.
AGU Fall 2023
Radu Marinescu, Junkyu Lee, et al.
NeurIPS 2024