by ethbr1
762 days ago
If you can pipeline upcoming requests and tie state to a specific request, doesn't that allow you to change how you design physical memory (at least for inference)? Stupid question, but why wouldn't {extremely large, slow-write, fast-read memory} + {smaller, very fast-write memory} be a feasible hardware architecture, given that you know many, many cycles ahead what you'll need to have loaded at a specific time? Or hell, maybe it's time to go back to memory bank switching.
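To make the intuition concrete, here is a rough toy model, not a claim about any real hardware. It assumes the access schedule is known in advance, one prefetch can be issued per cycle, and each block takes a fixed number of cycles to move from the big tier into the small fast tier. The function name `count_stalls` and its parameters are made up for illustration.

```python
from collections import deque


def count_stalls(num_blocks, load_latency, fast_slots):
    """Count compute stalls in a toy two-tier memory model.

    Assumptions (all hypothetical):
    - blocks are consumed in a known order, one per cycle when ready;
    - one prefetch may be issued per cycle, arriving load_latency cycles later;
    - the fast tier holds at most fast_slots blocks (in flight or resident).
    """
    ready_at = deque()  # arrival cycle of each block issued but not yet consumed
    issued = 0
    consumed = 0
    cycle = 0
    stalls = 0
    while consumed < num_blocks:
        # Prefetch the next block in the schedule if a fast slot is free.
        if issued < num_blocks and len(ready_at) < fast_slots:
            ready_at.append(cycle + load_latency)
            issued += 1
        # Consume the oldest block if it has arrived; otherwise stall.
        if ready_at and ready_at[0] <= cycle:
            ready_at.popleft()
            consumed += 1
        else:
            stalls += 1
        cycle += 1
    return stalls


if __name__ == "__main__":
    print(count_stalls(num_blocks=100, load_latency=4, fast_slots=8))
    print(count_stalls(num_blocks=100, load_latency=4, fast_slots=2))
```

In this model, once the fast tier is deep enough to cover the load latency, the only stalls are the warm-up cycles at the start. Shrink the fast tier below that and the compute side starts waiting on loads, which is roughly the trade-off the question is asking about.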