Hacker News
by cubefox 1072 days ago
It's not just inference time; RAM size is another bottleneck. Apple, being Apple, probably wouldn't want to offer anything less than GPT-3.5-level intelligence, which I would estimate at 220 billion parameters (one eighth of the rumored GPT-4 MoE). That would require 220 GB of RAM at 8-bit quantization, i.e. one byte per parameter, for the weights alone.
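For anyone who wants to check the 220 GB figure, here is a rough back-of-the-envelope sketch. The function name `weight_memory_gb` is made up for illustration, and it only counts the weights (no KV cache, activations, or runtime overhead), using decimal GB. The 220 billion parameter count is the estimate from the comment above, not a confirmed number.

```python
def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal GB (1e9 bytes)."""
    bytes_total = params * bits_per_param / 8  # 8 bits per byte
    return bytes_total / 1e9


# Assumed parameter count: the 220B estimate from the comment (a rumor, not a spec).
PARAMS = 220e9

for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(PARAMS, bits):.0f} GB")
```

So 8-bit gives about 220 GB, and even 4-bit quantization would still need about 110 GB, which is far beyond what a phone or laptop ships with.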
1 comment

apple probably has the attention to detail to train the absolute shit out of their models. they will not need 8x220B parameters to do what GPT-4 does, if they ever get to that point. see LLaMA2 7b and 13b being (subjectively) far better than LLaMA1 even with the same number of parameters, just by having been trained more

apple is known to care a lot about stuff like this. like, a lot. they are pedantic as heck