Since nobody is actually recommending papers, here's an incomplete reading list that I sent out to some master's students I work with so they can understand the current (academic) research my little team is doing:

Paper reference / main takeaways / link
InstructGPT / main concepts of instruction tuning / https://proceedings.neurips.cc/paper_files/paper/2022/hash/b...
Self-Instruct / bootstrap off the model's own generations / https://arxiv.org/pdf/2212.10560.pdf
Alpaca / how Alpaca was trained / https://crfm.stanford.edu/2023/03/13/alpaca.html
Llama 2 / probably the best chat model we can train on; focus on the training method / https://arxiv.org/abs/2307.09288
LongAlpaca / one of many ways to extend context, and a useful dataset / https://arxiv.org/abs/2309.12307
PPO / important training method / idk, just watch a YouTube video

Obviously these are specific to my work and are out of date by ~3-4 months, but I think they do capture the spirit of "how do we train LLMs on a single GPU with no annotation team", and each is frequently referenced simply by what I put in the "paper reference" column.