by chessgecko 806 days ago
I thought MTIA v2 would use the MX formats (https://arxiv.org/pdf/2302.08007.pdf); guess they were too far along in the process to get them in this time.

Still, this looks like it would make for an amazing prosumer home AI setup. You could probably fit 12 accelerators on a wall outlet with power to spare for a CPU, and that would give enough memory to serve a 2T-parameter model at 4-bit, plus reasonable dense performance for small training runs and image stuff. It potentially wouldn't cost too much to make either, since there's no CoWoS or HBM to pay for.

I'd definitely buy one if they ever decided to sell it and could keep the price under like $800/accelerator.
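
For anyone who wants to check the napkin math, here is a rough sketch. The per-accelerator figures (90 W TDP, 128 GB of LPDDR5) and the outlet budget (a 15 A / 120 V circuit derated to 80%) are assumptions that are not stated in this thread, and the function name `budget` is made up for illustration. It only counts weight storage, not KV cache or activations.

```python
# Back-of-envelope power and memory check for a multi-accelerator home box.
# ASSUMPTIONS (not from this thread): 90 W TDP and 128 GB LPDDR5 per
# accelerator; a 15 A / 120 V outlet derated to 80% for continuous load.

TDP_WATTS = 90                      # assumed per-accelerator TDP
MEM_GB = 128                        # assumed per-accelerator memory (decimal GB)
OUTLET_WATTS = 120 * 15 * 80 // 100  # 1440 W continuous budget


def budget(n_accel, params, bits_per_weight):
    """Return power and memory totals for n_accel accelerators.

    params is the model's parameter count; only weight storage is
    counted (no KV cache, activations, or host overhead).
    """
    power_w = n_accel * TDP_WATTS
    total_mem_gb = n_accel * MEM_GB
    weights_gb = params * bits_per_weight / 8 / 1e9
    return {
        "power_w": power_w,
        "headroom_w": OUTLET_WATTS - power_w,  # what is left for CPU, fans, PSU loss
        "total_mem_gb": total_mem_gb,
        "weights_gb": weights_gb,
        "fits": weights_gb <= total_mem_gb,
    }


if __name__ == "__main__":
    # 12 accelerators, 2T parameters, 4 bits per weight
    print(budget(12, 2 * 10**12, 4))
```

Under those assumptions, 12 accelerators draw about 1080 W and hold 1536 GB, against roughly 1000 GB of 4-bit weights for a 2T-parameter model, so the claim is at least plausible on paper.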


I suppose it might. There aren't a lot of details about what they mean by INT8 support (what kind of sparsity, for example?), so it could be MXINT8 or something else.

Glad someone was thinking the same thing I was though!

It's gotta be that 2:4 sparsity that everyone has but that I haven't seen used anywhere, right? If they put it in they must be using it, but I'm not sure for what. And without details I think it's a good bet that int8 is the standard int8.

Wishful thinking, but maybe they'll announce they're selling it alongside the giant Llama 3, because there's no good, cheap way to run inference on something like that at home at the moment, and this could change that.
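
For reference, the 2-out-of-4 sparsity guessed at above usually means the structured pattern NVIDIA introduced with Ampere: in every group of 4 consecutive weights, at most 2 are nonzero. Whether MTIA v2 uses this pattern is only a guess from this thread. Below is a minimal pure-Python sketch of magnitude-based 2:4 pruning; the function name `prune_2_of_4` is hypothetical, and real toolchains typically fine-tune the model after pruning.

```python
def prune_2_of_4(row):
    """Zero the 2 smallest-magnitude values in each consecutive group of 4.

    Illustrative only: this is plain magnitude pruning into a 2:4
    pattern, not any vendor's actual implementation.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    out = []
    for i in range(0, len(row), 4):
        group = row[i:i + 4]
        # Indices of the two largest magnitudes. sorted() is stable,
        # so ties are broken in favor of the lower index.
        keep = sorted(range(4), key=lambda j: -abs(group[j]))[:2]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out


if __name__ == "__main__":
    weights = [0.1, -0.9, 0.4, 0.05, 1.0, 1.0, 1.0, 1.0]
    print(prune_2_of_4(weights))
```

Because exactly half of each group is zero, hardware that supports the pattern can store only the kept values plus a small index per group and skip the zeroed multiplies, which is where the advertised sparse throughput numbers come from.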