Maximum Likelihood Reinforcement Learning
Fahim Tajwar*,
Guanning Zeng*,
Yueer Zhou,
Yuda Song,
Daman Arora,
Yiding Jiang,
Jeff Schneider,
Ruslan Salakhutdinov,
Haiwen Feng, and
Andrea Zanette
International Conference on Machine Learning (ICML), 2026 (Oral)
Workshop on Scaling Post-training for LLMs (SPOT) @ ICLR, 2026 (Best Paper Award)
[Paper],
[Code],
[Project Website]
|
Expanding the Capabilities of Reinforcement Learning via Text Feedback
Yuda Song*,
Lili Chen*,
Fahim Tajwar,
Rémi Munos,
Deepak Pathak,
Drew Bagnell,
Aarti Singh, and
Andrea Zanette
International Conference on Machine Learning (ICML), 2026
Workshop on Lifelong Agents: Learning, Aligning, Evolving (LLA) @ ICLR, 2026 (Outstanding Paper Award)
[Paper],
[Code],
[Project Website]
|
Reasoning as an Adaptive Defense for Safety
Taeyoun Kim,
Fahim Tajwar,
Aditi Raghunathan, and
Aviral Kumar
Neural Information Processing Systems (NeurIPS), 2025
[Paper],
[Code],
[Project Website]
|
Training a Generally Curious Agent
Fahim Tajwar*,
Yiding Jiang*,
Abitha Thankaraj,
Sumaita Sadia Rahman,
J Zico Kolter,
Jeff Schneider, and
Ruslan Salakhutdinov
International Conference on Machine Learning (ICML), 2025 (Oral)
[Paper],
[Code],
[Project Website]
|
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Fahim Tajwar*,
Anikait Singh*,
Archit Sharma,
Rafael Rafailov,
Jeff Schneider,
Tengyang Xie,
Stefano Ermon,
Chelsea Finn, and
Aviral Kumar
International Conference on Machine Learning (ICML), 2024
[Paper],
[Code],
[Project Website]
|
Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Yoonho Lee*,
Annie S Chen*,
Fahim Tajwar,
Ananya Kumar,
Huaxiu Yao,
Percy Liang, and
Chelsea Finn
International Conference on Learning Representations (ICLR), 2023
[Paper],
[Code]
|
When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning
Annie Xie*,
Fahim Tajwar*,
Archit Sharma*, and
Chelsea Finn
Neural Information Processing Systems (NeurIPS), 2022
[Paper],
[Code],
[Project Website]
|
Do Deep Networks Transfer Invariances Across Classes?
Allan Zhou*,
Fahim Tajwar*,
Alexander Robey,
Tom Knowles,
George J Pappas,
Hamed Hassani, and
Chelsea Finn
International Conference on Learning Representations (ICLR), 2022
[Paper],
[Code]
|
Conservative Prediction via Data-Driven Confidence Minimization
Caroline Choi*,
Fahim Tajwar*,
Yoonho Lee*,
Huaxiu Yao,
Ananya Kumar, and
Chelsea Finn
Transactions on Machine Learning Research (TMLR), 2024
[Paper],
[Code]
|
Scalable deep learning to identify brick kilns and aid regulatory capacity
Jihyeon Lee*,
Nina R. Brooks*,
Fahim Tajwar,
Marshall Burke,
Stefano Ermon,
David B. Lobell,
Debashish Biswas, and
Stephen Luby
Proceedings of the National Academy of Sciences (PNAS), 2021
[Paper],
[Code]
|