Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Fahim Tajwar*,
Anikait Singh*,
Archit Sharma,
Rafael Rafailov,
Jeff Schneider,
Tengyang Xie,
Stefano Ermon,
Chelsea Finn, and
Aviral Kumar
International Conference on Machine Learning (ICML), 2024
[Paper],
[Code],
[Project Website]
Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Yoonho Lee*,
Annie S. Chen*,
Fahim Tajwar,
Ananya Kumar,
Huaxiu Yao,
Percy Liang, and
Chelsea Finn
International Conference on Learning Representations (ICLR), 2023
[Paper],
[Code]
When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning
Annie Xie*,
Fahim Tajwar*,
Archit Sharma*, and
Chelsea Finn
Neural Information Processing Systems (NeurIPS), 2022
[Paper],
[Code],
[Project Website]
Do Deep Networks Transfer Invariances Across Classes?
Allan Zhou*,
Fahim Tajwar*,
Alexander Robey,
Tom Knowles,
George J. Pappas,
Hamed Hassani, and
Chelsea Finn
International Conference on Learning Representations (ICLR), 2022
[Paper],
[Code]
Conservative Prediction via Data-Driven Confidence Minimization
Caroline Choi*,
Fahim Tajwar*,
Yoonho Lee*,
Huaxiu Yao,
Ananya Kumar, and
Chelsea Finn
Transactions on Machine Learning Research (TMLR), 2024
[Paper],
[Code]
Scalable Deep Learning to Identify Brick Kilns and Aid Regulatory Capacity
Jihyeon Lee*,
Nina R. Brooks*,
Fahim Tajwar,
Marshall Burke,
Stefano Ermon,
David B. Lobell,
Debashish Biswas, and
Stephen Luby
Proceedings of the National Academy of Sciences (PNAS), 2021
[Paper],
[Code]
Preprints/Workshop Publications
Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias
Max Sobol Mark*,
Archit Sharma*,
Fahim Tajwar,
Rafael Rafailov,
Sergey Levine, and
Chelsea Finn
Preprint, 2023
[Paper],
[Code]
No True State-of-the-Art? OOD Detection Methods are Inconsistent across Datasets
Fahim Tajwar,
Ananya Kumar*,
Sang Michael Xie*, and
Percy Liang
ICML Workshop on Uncertainty & Robustness in Deep Learning (UDL), 2021
[Paper],
[Code]