by storus
159 days ago
I have a Mac Studio with 512GB RAM, 2x DGX Spark, and an RTX 6000 Pro WS (planning to buy a few of those in the Max-Q version next). I wonder whether we will ever again see local inference as "cheap" as it is right now, given RAM/SSD price trends.
|
What kind of experiments are you doing? Did you try out exo with a DGX doing prefill and the Mac doing decode?
I'm also really interested in hearing what you have learned working with all this gear. Did you buy all of it out of pocket?