Localizing Persona Representations in LLMs
- 2025
- AIES 2025
Hi! I am a Research Scientist at IBM Research Africa and a final-year PhD student at Saarland University, Germany. I am interested in trustworthy ML, especially interpretability of large generative models and regulation. In my PhD, I focus on algorithmic fairness and feedback loops.
A tale of adversarial attacks & out-of-distribution detection stories in the activation space