by rendaw 1703 days ago
Is there anything out there that exposes a better or tighter abstraction? Something not flat?
3 comments

In practice you want memory reordering because it is what lets you reorder instructions that touch memory (at compile time, and also at runtime by the processor), and that reordering enables a large part of the latency hiding that's going on.
There are systems like the PS3 SPEs and DSPs where you only have normal access to local on-chip memory and have to explicitly initiate DMA to access external memory.

But that's a poor fit for general-purpose software, which can need more memory than is available locally: you then have to do the cache management in software, which is going to be much slower than letting the hardware do it.
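To make the "cache management in software" point concrete, here is a rough sketch of a read-only, direct-mapped software cache sitting in front of a slow external memory. Everything in it is hypothetical (the `SoftwareCache` class, `_dma_get`, the line sizes); it is not the Cell SDK API, just a toy model of the bookkeeping that hardware would otherwise do for you.

```python
class SoftwareCache:
    """Toy direct-mapped, read-only software cache over a small local store."""

    def __init__(self, external, num_lines=4, line_size=4):
        self.external = external            # stands in for off-chip memory
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines      # which external line each slot holds
        self.lines = [[0] * line_size for _ in range(num_lines)]
        self.dma_transfers = 0              # how many explicit transfers were issued

    def _dma_get(self, line_no):
        """Model an explicit DMA copy of one line from external to local memory."""
        slot = line_no % self.num_lines
        start = line_no * self.line_size
        self.lines[slot] = self.external[start:start + self.line_size]
        self.tags[slot] = line_no
        self.dma_transfers += 1
        return slot

    def read(self, addr):
        line_no, offset = divmod(addr, self.line_size)
        slot = line_no % self.num_lines
        if self.tags[slot] != line_no:      # tag check runs in software on every access
            slot = self._dma_get(line_no)
        return self.lines[slot][offset]


external = list(range(100))
cache = SoftwareCache(external)
values = [cache.read(addr) for addr in range(32)]
```

Note that every single `read` pays for the divide, the tag compare, and the branch, even on a hit. That per-access overhead is the part a hardware cache hides.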

A dataflow architecture ISA would. It's been tried before, but working out the entire software stack from scratch is a moonshot.
High-performance processors are data flow processors, which infer the data flow graph from the instruction stream using Tomasulo's algorithm.
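As a rough illustration of "inferring the data flow graph from the instruction stream", the sketch below walks a linear instruction list and records true (read-after-write) dependencies by tracking the last writer of each register. This is only the dependency-tracking idea, not Tomasulo's algorithm itself (no reservation stations, no common data bus), and the `Instr` type, register names, and unit-latency assumption are all made up for the example.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str      # register written
    srcs: tuple    # registers (or memory names) read


def dataflow_edges(program):
    """Return (producer, consumer) index pairs for read-after-write dependencies."""
    last_writer = {}   # register name -> index of the latest instruction writing it
    edges = []
    for i, ins in enumerate(program):
        for src in ins.srcs:
            if src in last_writer:
                edges.append((last_writer[src], i))
        # Later readers of this name depend on this instruction, not older writers,
        # which is the effect register renaming has in hardware.
        last_writer[ins.dest] = i
    return edges


def earliest_cycles(program):
    """Earliest start cycle per instruction, assuming unit latency and no resource limits."""
    cycles = [0] * len(program)
    for producer, consumer in dataflow_edges(program):
        cycles[consumer] = max(cycles[consumer], cycles[producer] + 1)
    return cycles


program = [
    Instr("r1", ("a",)),         # 0: r1 = load a
    Instr("r2", ("b",)),         # 1: r2 = load b
    Instr("r3", ("r1", "r2")),   # 2: r3 = r1 + r2
    Instr("r1", ("c",)),         # 3: r1 = load c (reuses the name r1)
    Instr("r4", ("r1", "r3")),   # 4: r4 = r1 * r3
]
edges = dataflow_edges(program)
cycles = earliest_cycles(program)
```

Here instruction 3 reuses `r1` but has no true dependency on anything earlier, so it can start in cycle 0 alongside instructions 0 and 1, even though it comes fourth in program order.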
The backend is very dataflow-like, yes. But the dependencies are only tracked within a tight instruction window, retirement (esp. of stores) is ordered to implement the memory ordering constraints of the processor, and last but not least: the ISA itself is not dataflow.