| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by bigmadshoe 7 days ago
	I agree, but since we're talking about imagine understanding with text output, clearly a CNN is unsuitable. My previous comment was overly reductive and CNNs can still be SoTA depending on your performance metrics. I spent the earlier part of my career training CNNs, and they are very pleasant to work with.

1 comments

You can run a CNN and use the downsampled feature map the same way as patch tokens.