Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models. Shengyun Peng, Pin-Yu Chen, et al. NeurIPS 2024.
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models. Yuchen Hu, Chen Chen, et al. NeurIPS 2024.
NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes. Hao-lun Sun, Lei Hsiung, et al. NeurIPS 2024.
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models. Chia-yi Hsu, Yu-Lin Tsai, et al. NeurIPS 2024.
GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models. Zhaitang Li, Pin-Yu Chen, et al. NeurIPS 2024.
Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs. Megh Thakkar, Yash More, et al. NeurIPS 2024.
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI. Ambrish Rawat, Stefan Schoepf, et al. NeurIPS 2024.
On Robustness-Accuracy Characterization of Language Models using Synthetic Datasets. Ching-yun Ko, Pin-Yu Chen, et al. COLM 2024.
SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing. Changchang Yin, Pin-Yu Chen, et al. KDD 2024.
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques. Megh Thakkar, Quentin Fournier, et al. ACL 2024.