| > This isn't latency in the same sense I was using the word. But it is. If you add more factories for producing advanced chips, you can produce a chemical science pack from start to finish in 24 seconds (assuming producing an advanced circuit takes 16 seconds). Otherwise it takes 48 seconds, because you’re waiting sequentially for 3 advanced circuits to be completed. It doesn’t matter that the latency of producing an advanced circuit didn’t decrease. The relevant metric is the latency to produce a chemical science pack, which _did_ decrease, by fanning out the production of a sub-component. Edit: actually, my numbers are measuring reciprocal throughput, but the statement still holds true when talking about latency. You can expect to complete a science pack in 72 seconds (24+16*3) with no parallelism, and 40 seconds (24+16) with. > Yes, a CPU core is able to break instructions down into micro-operations and parallelize and reorder those micro-operations, such that instructions are retired in a non-linear manner. Which is why you don't measure latency at the instruction level. That’s what I’m saying about Factorio, though. You can measure latency for individual components, and you can measure latency for a whole pipeline. Adding parallelism can decrease latency for a pipeline, even though it didn’t decrease latency for a single component. That’s why the idea that serial performance = latency breaks down. |
That's still reciprocal throughput. 1/(science packs/second). You're measuring the time delta delta between the production of two consecutive science packs, but this measurement implicitly hides all the work the rest of the factory did in parallel. If the factory is completely inactive, how soon can it produce a single science pack? That time is the latency.
>You can measure latency for individual components, and you can measure latency for a whole pipeline. Adding parallelism can decrease latency for a pipeline, even though it didn’t decrease latency for a single component. That’s why the idea that serial performance = latency breaks down.
Suppose instead of producing science packs, your factory produces colored cars. A customer can come along, press a button to request a car of a given color, and the factory gives it to them after a certain time. You want to answer customer requests as quickly as possible, so you always have ready black, white, and red cars, which are 99% of the requests, and your factory continuously produces cars in a red-green-blue pattern, at a rate 1 car per hour. Unfortunately your manufacturing process is such that the color must be set very easy in the pipeline and this changes the entire rest of the production sequence. If a customer comes along and presses a button, how long do they need to wait until they can get their car? That measurement is the latency of the system.
The best case is when there's a car already ready, so the minimum latency is 0 seconds. If two customers request the same color one after the other, the second one may need to wait up to three hours for the pipeline to complete a three-color cycle. But what if a customer wants a blue car? I've only been talking about throughput. Nothing of what I've said so far tells you how deep the pipeline is. It's entirely possible that even though your factory produces a red car every three hours, producing a blue car takes you three months. If you add an exact copy of the factory you can produce two red cards every three hours, but producing a single blue car still takes three months.
Adding parallelism can only affect the optimistic paths through a system, but it has no effect on the maximum latency. The only way to reduce maximum latency is to move more quickly through the pipeline (faster processor) or to shorten the pipeline (algorithmic optimization). You can't have a baby in one month by impregnating nine women.