


	by sdenton4 660 days ago
	Perhaps; in a lot of cases the architecture barely matters. Transformers took a lot of extra tricks to get working well; the ConvNeXt paper showed that applying those same tricks to convolutional networks can fully close the gap. https://arxiv.org/abs/2201.03545