| HN Mirror



	by discobot 919 days ago
	Regarding prompt processing and token generation, you are correct: it makes sense to benchmark them independently, since prompt processing is done in parallel across all prompt tokens and is compute-bound, while token generation is sequential and bound by memory bandwidth.
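
To put rough numbers on why the two phases behave so differently, here is a back-of-the-envelope sketch. It assumes a dense model that needs about 2 FLOPs per parameter per token and that every generated token has to read all the weights from memory once. The function names and the hardware figures (7B parameters, fp16 weights, 100 TFLOP/s, 1 TB/s) are made up for illustration and are not measurements of any real device. The results are upper bounds that ignore attention cost, KV cache traffic, and any overhead.

```python
def prompt_tokens_per_second(n_params: float, peak_flops: float) -> float:
    """Compute-bound ceiling for prompt processing.

    Assumes ~2 FLOPs per parameter per token (one multiply, one add) and
    that prompt tokens are batched well enough to keep the compute units busy.
    """
    return peak_flops / (2.0 * n_params)


def generation_tokens_per_second(
    n_params: float, bytes_per_param: float, mem_bandwidth: float
) -> float:
    """Memory-bound ceiling for token generation.

    Assumes each new token requires streaming every weight from memory once,
    so throughput is bandwidth divided by model size in bytes.
    """
    return mem_bandwidth / (n_params * bytes_per_param)


if __name__ == "__main__":
    # Hypothetical model and hardware, chosen only to show the arithmetic.
    n_params = 7e9          # 7B parameters
    bytes_per_param = 2.0   # fp16 weights
    peak_flops = 100e12     # 100 TFLOP/s
    mem_bandwidth = 1e12    # 1 TB/s

    pp = prompt_tokens_per_second(n_params, peak_flops)
    tg = generation_tokens_per_second(n_params, bytes_per_param, mem_bandwidth)
    print(f"prompt processing ceiling: {pp:.0f} tokens/s")
    print(f"token generation ceiling:  {tg:.1f} tokens/s")
```

With these assumed figures the two ceilings differ by a factor of 100, and each one depends on a different hardware number, which is why a single combined tokens/s figure hides most of the picture.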