by cubefox
1072 days ago
It's not just inference time; RAM size is another bottleneck. Apple, being Apple, probably wouldn't want to offer anything less than GPT-3.5-level intelligence. I would estimate that at 220 billion parameters (one eighth of the rumored GPT-4 mixture-of-experts model), which would require 220 GB of RAM at 8-bit quantization, since each parameter then takes one byte.
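For anyone who wants to check the arithmetic, here is a rough sketch. The function name `weight_memory_gb` is made up for illustration, and the 220 billion figure is the rumor-based estimate from the comment, not a confirmed number. It counts the weights only, so it ignores the KV cache and activations, which need additional memory.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Assumption: 220 billion parameters, the rumor-based estimate from the comment.
PARAMS = 220e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gb(PARAMS, bits):.0f} GB")
```

At 8 bits this gives the 220 GB figure; halving the precision to 4 bits halves the requirement, and 16-bit weights double it.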
Apple is known to care a lot about stuff like this. Like, a lot. They are pedantic as heck.