| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by avi_vallarapu 773 days ago
	Theoretically this sounds great. I would worry about scalability issues with the Bayesian learning models practical implementation when dealing with the vast parameter space and data requirements of state of the-art models like GPT-3 and beyond. Would love to see practical implementations on large-scale datasets and in varied contexts. I Liked the use of Dirichlet distributions to approximate any prior over multinomial distributions.