Publications
1 result for Idan Shenfeld
Curiosity-driven Red-teaming for Large Language Models
Zhang-wei Hong, Idan Shenfeld, et al.
2024
ICLR 2024