LLM Alignment: DPO

Feb 9, 2024

A brief read on Direct Preference Optimization (DPO), a method for alignment fine-tuning of LLMs.