| HN Mirror



	by swyx 1023 days ago
	interesting. but there should be physical limits to that, and we can estimate them to put bounds on the speculation. for example, FLOP/s has an upper bound, and you can make latency estimates for 1B/10B/100B-parameter models. this would put reasonable bounds on statements like "a hundred entire responses in the time it takes for one token to be shown"
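
A rough sketch of the kind of estimate described above, in Python. The hardware figures (`PEAK_FLOPS`, `MEM_BANDWIDTH`, `BYTES_PER_PARAM`) are assumed round numbers for a hypothetical accelerator, not measurements, and `min_token_latency` is an illustrative name. It uses the common rule of thumb that a dense transformer forward pass costs about 2 FLOPs per parameter per token, and it adds a memory-bandwidth bound as a second physical limit that the comment does not mention explicitly.

```python
# Back-of-envelope lower bounds on per-token decode latency at batch size 1.
# All hardware numbers below are assumed round figures, not measurements.

PEAK_FLOPS = 1e15      # assumed peak FLOP/s of a hypothetical accelerator
MEM_BANDWIDTH = 3e12   # assumed memory bandwidth in bytes/s
BYTES_PER_PARAM = 2    # assumed 16-bit weights


def min_token_latency(n_params: float) -> dict:
    """Return lower bounds (in seconds) for generating one token."""
    # Rule of thumb: ~2 FLOPs per parameter per generated token (dense model).
    compute_bound = 2 * n_params / PEAK_FLOPS
    # Every weight has to be read from memory at least once per token.
    memory_bound = n_params * BYTES_PER_PARAM / MEM_BANDWIDTH
    return {
        "compute_s": compute_bound,
        "memory_s": memory_bound,
        "lower_bound_s": max(compute_bound, memory_bound),
    }


if __name__ == "__main__":
    for n in (1e9, 10e9, 100e9):
        bounds = min_token_latency(n)
        ms_per_token = bounds["lower_bound_s"] * 1e3
        max_tokens_per_s = 1 / bounds["lower_bound_s"]
        print(f"{n / 1e9:>5.0f}B params: >= {ms_per_token:.3f} ms/token, "
              f"<= {max_tokens_per_s:.0f} tokens/s")
```

These are floors for a single device under the stated assumptions. Batching, multiple devices, quantization, or speculative decoding would change the numbers, so treat the output as a sanity check on claims rather than a prediction.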