by nl
810 days ago
This (the device='mps' version) already uses unified memory and the GPU on M-series Macs. MLX may have some additional micro-optimizations, but in general, most people who have compared it against hand-written MPS-based training implementations haven't found large speedups yet.
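For anyone who wants to check which path they are actually on, here is a rough sketch of device selection, assuming the code in question is PyTorch. `pick_device` is a hypothetical helper name of mine, not something from the project being discussed; `torch.backends.mps.is_available()` is the standard PyTorch check.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch can use the Apple GPU, else fall back.

    Hypothetical helper for illustration only.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is no accelerator to pick.
        return "cpu"

    # torch.backends.mps is missing on older PyTorch builds, so probe for it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(device)
```

If this prints `mps`, tensors and modules moved with `.to(device)` should run on the Apple GPU.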