Hacker News
by denz88 699 days ago
I'm glad to see the nice incremental gains on the benchmarks for the 8B and 70B models as well.

Some of those benchmarks show quite significant gains. Going from Llama-3 to Llama-3.1, MMLU scores for 8B are up from 65.3 to 73.0, and 70B are up from 80.9 to 86.0. These scores should always be taken with a grain of salt, but this is encouraging.

405B is hopelessly out of reach for running in a homelab without spending thousands of dollars. For most people wanting to try out the 405B model, the best option is to rent compute from a datacenter. Looking forward to seeing what it can accomplish.

How far can you quantize that down to run on a Mac Studio with 192GB? Is it possible? Feels like it would have to be 2-bit…
Less than 2-bit, I think. There's this IQ2 quant that fits
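
For anyone who wants to sanity-check the numbers, here is a rough back-of-the-envelope sketch in Python. It counts weights only and assumes exactly 405e9 parameters and decimal gigabytes. The 0.75 usable-memory fraction is purely an illustrative assumption, not a measured limit, and the helper names are made up for this example. KV cache, activations, and per-block quantization metadata are all ignored, so real files will come out larger.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Weights-only size in decimal GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def max_bits_per_weight(n_params: float, budget_gb: float) -> float:
    """Largest average bits per weight that fits in budget_gb (weights only)."""
    return budget_gb * 1e9 * 8 / n_params


N_PARAMS = 405e9        # assumed parameter count for the 405B model
MEMORY_GB = 192         # Mac Studio unified memory from the question
USABLE_FRACTION = 0.75  # assumption: share of memory left for the weights

if __name__ == "__main__":
    # Footprint at a few common precisions.
    for bits in (16, 8, 4, 3, 2):
        size = weight_footprint_gb(N_PARAMS, bits)
        print(f"{bits:>2} bits/weight: {size:7.2f} GB")

    # Ceiling on average bits per weight for the two memory budgets.
    full = max_bits_per_weight(N_PARAMS, MEMORY_GB)
    usable = max_bits_per_weight(N_PARAMS, MEMORY_GB * USABLE_FRACTION)
    print(f"ceiling with all {MEMORY_GB} GB: {full:.2f} bits/weight")
    print(f"ceiling with {USABLE_FRACTION:.0%} usable: {usable:.2f} bits/weight")
```

Under these assumptions the weights alone are 810 GB at 16 bits and about 101 GB at 2 bits, and the ceiling works out to roughly 3.8 bits per weight with all 192 GB or roughly 2.8 with the assumed 75% budget. How much headroom is really left after context and runtime overhead depends on the setup, so treat this as an upper bound rather than a guarantee that a given quant will load.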