| HN Mirror

	by blt 764 days ago
	The simplicity of the transformer is quite refreshing, especially in vision, where the Vision Transformer with linear patch embeddings replaces complex, intertwined decisions about filter size, striding, pooling, number of filters, depth, etc., with the simpler decision of how to allocate your FLOPs between dimensionality, number of heads, and number of layers.
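To make that trade-off concrete, here is a rough back-of-the-envelope sketch in Python. The function name `vit_budget`, its defaults (224-pixel images, 16-pixel patches, width 768, 12 layers, 12 heads), and the simplified cost model (multiply-adds only, ignoring the class token, biases, layer norms, and softmax) are assumptions for illustration, not something taken from the comment above.

```python
def vit_budget(image_size=224, patch_size=16, channels=3,
               dim=768, depth=12, heads=12, mlp_ratio=4):
    """Rough multiply-add count for one forward pass of a ViT-style encoder.

    Simplified model (an assumption): no class token, biases, norms, or softmax.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if dim % heads != 0:
        raise ValueError("dim must be divisible by heads")

    # Non-overlapping square patches become the token sequence.
    tokens = (image_size // patch_size) ** 2
    # Each patch is flattened to a vector and linearly projected to `dim`.
    patch_dim = channels * patch_size ** 2
    embed_macs = tokens * patch_dim * dim

    # Q, K, V, and output projections: four dim x dim matmuls per token.
    proj_macs = 4 * tokens * dim * dim
    # Q @ K^T and attention @ V, summed over all heads.
    # Splitting `dim` across more heads shrinks head_dim but leaves
    # this total unchanged under this simple count.
    attn_macs = 2 * tokens * tokens * dim
    # Two-layer MLP with hidden width mlp_ratio * dim.
    mlp_macs = 2 * tokens * dim * (mlp_ratio * dim)

    layer_macs = proj_macs + attn_macs + mlp_macs
    return {
        "tokens": tokens,
        "patch_dim": patch_dim,
        "head_dim": dim // heads,
        "embed_macs": embed_macs,
        "layer_macs": layer_macs,
        "total_macs": embed_macs + depth * layer_macs,
    }


if __name__ == "__main__":
    base = vit_budget()
    wider = vit_budget(dim=1024, heads=16)
    deeper = vit_budget(depth=24)
    for name, cfg in [("base", base), ("wider", wider), ("deeper", deeper)]:
        print(name, cfg["tokens"], cfg["head_dim"], cfg["total_macs"])
```

Under these assumptions, depth scales the cost linearly and width scales the projection and MLP terms quadratically, while the head count only changes the per-head dimension. That is one way to read "allocating FLOPs": a handful of integers, rather than a per-layer menu of kernel sizes and strides.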