Membership Inference Attacks Against Time-Series Models. Noam Koren, Abigail Goldsteen, et al. ACML 2024.
MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks. Giandomenico Cornacchia, Kieran Fraser, et al. AIES 2024.
On Robustness-Accuracy Characterization of Language Models using Synthetic Datasets. Ching-yun Ko, Pin-Yu Chen, et al. COLM 2024.
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts. Zhi-yi Chin, Chieh-ming Jiang, et al. ICML 2024.
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised Learning. Zhiyuan He, Yijun Yang, et al. ICML 2024.
What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian Benchmarks. Irene Ko, Pin-Yu Chen, et al. ICML 2024.
Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation. Tomas Bueno Momcilovic, Beat Buesser, et al. xAI 2024.
Improving Membership Inference Attacks against Classification Models. Shlomit Shachor, Natalia Razinkov, et al. KES-IDT 2024.
Overload: Latency Attacks on Object Detection for Edge Devices. Erh-Chung Chen, Pin-Yu Chen, et al. CVPR 2024.
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing. Jiabao Ji, Bairu Hou, et al. NAACL 2024.