A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning. Dong Ki Kim, Miao Liu, et al. ICML 2021.
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques. Megh Thakkar, Quentin Fournier, et al. ACL 2024.
Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs. Megh Thakkar, Yash More, et al. NeurIPS 2024.
Learning in Factored Domains with Information-Constrained Visual Representations. Tyler Malloy, Tim Klinger, et al. NeurIPS 2022.
Efficient Black-box Planning using Macro Actions with Focused Effects. Cameron Allen, Michael Katz, et al. IJCAI 2021.