| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by TuringTest 1201 days ago
	Fortunately, there still are some possibilities to improve training efficiency and reducing model size by doing more guided attentional learning. This will make feasible to train models at least as good as the current batch (though probably the big players will use those same optimizations to create much better large models).