| HN Mirror



	by j_coder 2482 days ago
	I can imagine that in a few decades a company's entire data center will fit on a single square-meter chip :)

2 comments

mises 2482 days ago

...and the rest of the space being taken by a colossal cooling system.


olliej 2482 days ago

That would be colossally inefficient - essentially the size of the chip means that signals would take multiple cycles to get from one side to the other. The solution would be to localize processing into distinct processing units on the one die. At that point you've reinvented multiple cores, and it starts becoming cost effective to split them into separate chips to improve yields :)
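A rough sketch of the "multiple cycles to cross the die" point. The 3 GHz clock, the two die sizes, and the signal speed of half the speed of light are illustrative assumptions, not figures from the thread; real on-chip wires are RC-limited and usually much slower, so this is an optimistic lower bound.

```python
# Back-of-the-envelope estimate: how many clock cycles pass while a signal
# crosses a die? All inputs are illustrative assumptions, not measurements.

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def cycles_to_cross(distance_m: float, clock_hz: float,
                    velocity_fraction: float = 0.5) -> float:
    """Clock cycles elapsed while a signal travels distance_m.

    velocity_fraction is the assumed signal speed as a fraction of c.
    """
    travel_time_s = distance_m / (velocity_fraction * C_M_PER_S)
    return travel_time_s * clock_hz


if __name__ == "__main__":
    for label, distance_m in [("2 cm die", 0.02), ("1 m die", 1.0)]:
        cycles = cycles_to_cross(distance_m, clock_hz=3e9)
        print(f"{label}: {cycles:.1f} cycles at 3 GHz")
```

Under these assumptions, crossing the 1 m die takes roughly 20 cycles, against well under one cycle for the 2 cm die.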


deepnotderp 2482 days ago

The advantage would be interconnect energy and performance.

To an extent this has already happened with wafer-scale integration, e.g. Cerebras.


olliej 2481 days ago

The problem is the increased power usage of the additional caches that become necessary - modern CPUs already need a bunch of physically local caches in addition to the large L1/2/3/n caches because of the time it takes signals to get from A to B. At some point the benefit of a larger single die becomes minimal. The moment that happens you benefit from making separate chips because of the increased yield.


pixl97 2482 days ago

That is only true on a chip locked to a single clock. There are designs where different parts of the chip run on clocks with different frequencies.


olliej 2481 days ago

Most modern chips already use numerous clocks (aside from anything else, propagation delay for the clock signal is already a problem).

The problem is not simply "because clock cycle", it is "if a signal takes X ns to get from one execution unit to the next, then that's X ns of functionally idle time". At best that means additional latency. The more latency involved in computing a result, the more predictive logic you need - for dependent operations the latency matters.

An asynchronous chip does not avoid the problems encountered by a multistage pipelined processor; it's purely a different way to manage varying instruction execution times.

But this doesn't answer the killer problem of yield. The larger a single chip is, the more likely any given chip is to have defects, and therefore the fewer working chips you get out of a given wafer after the multiple weeks/months that wafer has been trundling through a fab. Modern chips put in a lot of redundancy to maximize the chance that enough of a given core survives manufacture for the complete chip to function, e.g. more cache and execution units are fabricated than necessary, and at the end of manufacture any components that have errors are in effect lasered out. If at that point a chip doesn't have enough remaining cache/execution units, or a defect lands somewhere that has no redundancy, the entire chip is dead.

The larger a given die is, the greater the chance that the entire die will be written off.
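To put rough numbers on the die-size argument, here is a minimal sketch using the simple Poisson yield model, Y = exp(-A * D), which assumes defects land independently and any single defect kills the die (no redundancy). The defect density of 0.1 per cm² and the die sizes are made-up illustrative values, and edge loss on the round wafer is ignored.

```python
import math


def poisson_yield(die_area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under the Poisson yield model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


def good_dies_per_wafer(wafer_area_cm2: float, die_area_cm2: float,
                        defects_per_cm2: float) -> float:
    """Expected number of defect-free dies, ignoring wafer edge loss."""
    dies = int(wafer_area_cm2 // die_area_cm2)
    return dies * poisson_yield(die_area_cm2, defects_per_cm2)


if __name__ == "__main__":
    wafer_area = math.pi * 15.0 ** 2  # 300 mm wafer, about 707 cm^2
    density = 0.1                     # assumed defects per cm^2
    for die_area in (1.0, 7.0, 700.0):
        y = poisson_yield(die_area, density)
        good = good_dies_per_wafer(wafer_area, die_area, density)
        print(f"{die_area:6.1f} cm^2 die: yield {y:.3g}, "
              f"good silicon {good * die_area:.3g} cm^2 per wafer")
```

With these assumed inputs a 1 cm² die yields about 90% and a 7 cm² die about 50%, while a die covering nearly the whole wafer essentially never comes out defect-free unless it has redundancy.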

That massive ML chip from a few days ago worked by massively overprovisioning execution units. I suspect they end up losing a much greater share of a given wafer's area than you would with many small chips, which directly contributes to the actual cost.
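The effect of fabricating spare units can be sketched with a binomial model: if each unit independently survives manufacture with some probability, the chip works when at least the required number of units are good. The 90% per-unit yield and the unit counts below are hypothetical values chosen for illustration, and independence between units is itself an assumption.

```python
from math import comb


def prob_at_least(required: int, fabricated: int, unit_yield: float) -> float:
    """Probability that at least `required` of `fabricated` units are good,
    assuming each unit survives independently with probability unit_yield."""
    return sum(
        comb(fabricated, k) * unit_yield ** k * (1 - unit_yield) ** (fabricated - k)
        for k in range(required, fabricated + 1)
    )


if __name__ == "__main__":
    required = 8
    for fabricated in (8, 9, 10):
        p = prob_at_least(required, fabricated, unit_yield=0.9)
        print(f"need {required}, fabricate {fabricated}: chip yield {p:.3f}")
```

Under these assumptions, two spare units lift the chance of a usable chip from about 43% to about 93%, at the price of die area spent on units that are never used.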


deepnotderp 2482 days ago

Most complex chips today already have multiple clock domains, so I don't see why that would be a problem.


verall 2482 days ago

Yea, why even use synchronous clocks? Asynch designs have been around since the 80s /s
