by zozbot234 14 days ago
Have you tried it? It would be slow for sure, but the main limitation AIUI would actually be storing the context in RAM - models like Kimi and GLM have high demands there which limit your ability to get meaningful aggregate throughput via large batches.

No need to try, really. With 1100B weights and 256GB of RAM, that's less than 1.8 bits per weight if you want to leave room for a little bit of context.
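Back-of-the-envelope version of that arithmetic, as a rough sketch. It assumes decimal GB (1e9 bytes) and ignores OS and runtime overhead; `bits_per_weight` is just a made-up helper name, and the 16GB reserved for context is an arbitrary illustrative figure, not a measured requirement for any model.

```python
def bits_per_weight(ram_gb, n_params_b, context_gb=0.0):
    """Bits of RAM available per weight after reserving context_gb for context.

    ram_gb and context_gb are decimal gigabytes (1e9 bytes, an assumption);
    n_params_b is the parameter count in billions.
    """
    usable_bits = (ram_gb - context_gb) * 1e9 * 8
    return usable_bits / (n_params_b * 1e9)


# All 256GB spent on weights, nothing left for context.
print(round(bits_per_weight(256, 1100), 2))  # 1.86

# Reserve a hypothetical 16GB for context.
print(round(bits_per_weight(256, 1100, context_gb=16), 2))  # 1.75
```

So even with zero context you top out around 1.86 bits per weight, and any realistic reservation pushes that below 1.8.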

How is that supposed to give good results?