Direct Preference Optimization
Overview

Download the presentation slides

Today in our AI/ML seminar, we were pleased to have Maliha Zahan Chowdhury as the presenter. Maliha is a first-year PhD student in Dr. Zhishuai Guo’s research group. Her talk focused on Direct Preference Optimization (DPO), primarily based on the original 2023 DPO paper. Although the paper was published only three years ago, it has already been cited nearly 8,000 times, reflecting its significant impact on LLM alignment and optimization research. ...