#Rlhf — blogs.social

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

LoRA (Low-Rank Adaptation) is a fine-tuning method introduced by Hu et al. at Microsoft in 2021. Instead of updating all billions of parameters in a large model, LoRA freezes the original weights and injects trainable low-rank matrices into each transformer layer. The insight: weight updates during fine-tuning have low "intrinsic rank", most of the useful signal lives in a much smaller…