Tri Nguyen

Fourth-year Ph.D. student

LLM Alignment

DPO, alignment, preference

A brief reading on the Direct Preference Optimization (DPO) method for alignment fine-tuning of LLMs.

Community Detection with Matrix Decomposition

matrix factorization, community detection

An introduction to the community detection problem from a matrix factorization perspective.

BibMan - A Keyboard-based Bibliography Manager

ncurses python vim

I wanted software to manage and organize collections of papers my way. Mendeley is closed source, so it is out of the picture. Zotero is pretty nice and open source, and it can be extended through extensions. However, as a Vim and Ranger user, I crave a purely keyboard-driven interface, and I could not find anything like that. So I wrote one.

Some random papers on LLM

llm attention

We have been reading about the trendy topic of LLMs. I collected some papers myself here. The list is still being updated.

My take on ICML2023

diffusion representation-learning

These are a few papers I found interesting, either for the work itself or for the concepts/techniques they use, although those concepts/techniques might be old.

Reinforcement Learning is So Confusing

RL

Many algorithms in RL are presented in a very intuitive way, but they look a bit heuristic. While re-reading Reinforcement Learning in an attempt to get rid of that heuristic feeling, I tried to digest it from an optimization perspective. And, well, I realized I couldn’t make any connection whatsoever between my understanding of optimization and any algorithm presented in RL.

So this is an attempt to make things more concrete from a somewhat first-principles view.

The note is currently very unorganized. Link