| HN Mirror



	by Eridrus 1805 days ago
	Being able to tell if a model has been trained enough without reference to a separate dev set seems like a useful capability, but how can you actually turn these plots into a decision criterion? Why is a modal alpha of 4 high, but an alpha of 3.5 ok?

1 comment

charleshmartin 1805 days ago

Great question. 4 is at the high edge of the fat-tailed universality class. Most high-performing models have alpha approaching 2, or at least below 3. See Figure 8(a) in the Nature paper, and our upcoming JMLR paper: https://arxiv.org/abs/1810.01075
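
For readers who want to try the thresholds above, here is a minimal sketch, not the authors' method. It estimates a power-law exponent alpha with the standard continuous maximum-likelihood formula, alpha = 1 + n / sum(ln(x_i / xmin)), and then buckets it using the cutoffs mentioned in the reply (below 3, up to 4, above 4). The function names, the bucket labels, and the idea of passing in a list of layer eigenvalues with a hand-picked `xmin` are all assumptions made for illustration; the papers use their own fitting procedure.

```python
import math
import random


def sample_power_law(alpha, xmin, n, seed=0):
    """Draw n synthetic values with density proportional to x**(-alpha) for x >= xmin.

    Uses inverse-transform sampling. Only here to make the demo self-contained.
    """
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the power below is always defined.
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(values, xmin):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(x / xmin)) over the tail x >= xmin."""
    tail = [v for v in values if v >= xmin]
    if len(tail) < 2:
        raise ValueError("need at least two values at or above xmin")
    log_sum = sum(math.log(v / xmin) for v in tail)
    if log_sum <= 0.0:
        raise ValueError("all tail values equal xmin; alpha is undefined")
    return 1.0 + len(tail) / log_sum


def describe_alpha(alpha):
    """Hypothetical labels based only on the cutoffs quoted in the comment above."""
    if alpha < 3.0:
        return "below 3, typical of high-performing models"
    if alpha <= 4.0:
        return "inside the fat-tailed class"
    return "beyond the fat-tailed class"


if __name__ == "__main__":
    # Stand-in for the eigenvalues of one layer's correlation matrix (assumption).
    eigenvalues = sample_power_law(alpha=2.5, xmin=1.0, n=20000)
    alpha_hat = fit_alpha(eigenvalues, xmin=1.0)
    print(round(alpha_hat, 2), describe_alpha(alpha_hat))
```

Under this reading, 3.5 is still inside the fat-tailed range while 4 sits right at its edge, which is one way to interpret "3.5 ok, 4 high". The estimate is sensitive to the choice of `xmin`, so treat the output as a rough indicator only.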
