|
|
|
|
|
by raphlinus
1101 days ago
|
|
The most complete documentation is in the applegpu repo[1] by dougallj showing a great deal of recent activity (including by alyssarosenzweig). Last I checked, the documentation of barrier instructions wasn't complete enough to tell whether these device-scoped barriers are possible. (Note: on RDNA2, they're accomplished by DLC and GLC flags on memory accesses, combined with cache flush instructions such as S_GL1_INV). There's also a lot of great material, accessibly written, on Alyssa's blog[2], see in particular the posts titled "Dissecting the Apple M1 GPU, part ${I}". [1]: https://github.com/dougallj/applegpu [2]: https://rosenzweig.io/ |
|