	by MichaelZuo 826 days ago
	> Aside from the minuscule context length, it also lacks the instruction tuning and reinforcement learning from human feedback (RLHF) that turn a large language model into a chatbot.

	Is RLHF even strictly necessary?

2 comments

ianand 826 days ago

Strictly necessary? Maybe not. I wrote that before URIAL [1][2]. I haven't actually tried URIAL with GPT-2 small yet, but I need to give it a whirl. It might be too small a model for it to work?

Even if URIAL works with GPT-2 small, the really small context length in the Excel file as currently implemented will make it hard to leverage. I've considered a more flexible implementation to support a longer context length (e.g. using macros to build the layout of the sheet), but I have prioritized the teaching videos first.

[1] https://allenai.github.io/re-align/index.html

[2] Summary: https://twitter.com/intuitmachine/status/1732089266883141856
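
To make the context-length concern concrete, here is a rough sketch of the general idea behind in-context alignment: prepend a few example exchanges so that a plain text-completion model continues in the same style, then check whether the result still fits in the model's context window. Everything here is an assumption for illustration: the function names are hypothetical, the `# Query:` / `# Answer:` markers are arbitrary and not necessarily URIAL's exact template, and whitespace word counting is only a crude stand-in for a real tokenizer.

```python
def build_icl_prompt(examples, query, preamble=""):
    """Assemble a few-shot prompt that a text-completion model can continue.

    examples: list of (user_text, assistant_text) pairs (hypothetical format).
    """
    parts = [preamble] if preamble else []
    for user_text, assistant_text in examples:
        parts.append(f"# Query:\n{user_text}\n\n# Answer:\n{assistant_text}")
    # Leave the final answer open so the model completes it.
    parts.append(f"# Query:\n{query}\n\n# Answer:\n")
    return "\n\n".join(parts)


def rough_token_count(text):
    # Assumption: whitespace-separated words approximate tokens.
    # A real tokenizer would usually report a different (often larger) count.
    return len(text.split())


def fits_context(prompt, context_length, reserve_for_reply=0):
    """True if the prompt plus reserved reply space fits in the context window."""
    return rough_token_count(prompt) + reserve_for_reply <= context_length


if __name__ == "__main__":
    examples = [("What is 2+2?", "4")]
    prompt = build_icl_prompt(examples, "Name a color.")
    for limit in (10, 1024):
        print(limit, fits_context(prompt, limit))
```

Even a single short example exchange eats a noticeable share of a very small window, which is why a longer context length matters for this kind of prompting.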


warkdarrior 825 days ago

> https://news.ycombinator.com/item?id=39700256

Holy color use, Batman! Someone take the crayons away from that web designer.


littlestymaar 826 days ago

By default it's just going to be a text-completion model; you want an additional round of training to make it behave like a chatbot. You could probably get away with just fine-tuning on chatbot conversations, but everybody uses RLHF, so I guess it must be much more efficient for that.
