Hacker News
by upghost, 560 days ago
Great point about the meaningful datasets; this makes perfect sense, especially with regard to SFT and RLHF. Although I suppose it would be somewhat easier to do pretraining on really long contexts (books, I assume?).