| HN Mirror

	by casercaramel144 911 days ago
	I'm sorry, I don't understand the exact contribution here. There are many tutorials on how to train a language model. If it's a repository of SOTA techniques for training, it will be outdated in 3 months at most, and the ground shifts under you in this field anyway, so you might as well read arXiv all day if your intention is to keep up with SOTA.

2 comments

chuckhend 911 days ago

It looks like this team gave us everything we need to reproduce their models: the actual artifacts, not just a description of what they did. As far as I can tell, they share the data and every step along the way to the final model.


tkellogg 911 days ago

Researchers don't read tutorials; they cross-check each other's work. You need details to do that.


casercaramel144 911 days ago

What do you mean by cross-checking each other's work? Surely just reporting the final loss is good enough if that's the intention. The end goal is lower loss anyway, so it's not even a bad metric.
