Hacker News
by bigyabai 54 days ago
You won't be caching much of anything in RAM when the experts add up to 220B parameters' worth of layers.
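
For a rough sense of scale, here is a back-of-the-envelope sketch of what 220B parameters cost in weight storage alone. The precisions listed are assumptions for illustration, not something stated in the thread, and the helper name `weights_gib` is made up for this example. It ignores activations, KV cache, and any runtime overhead.

```python
def weights_gib(params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB (weights only, no activations or KV cache)."""
    return params * bits_per_param / 8 / 2**30


EXPERT_PARAMS = 220e9  # the 220B figure from the comment above

# Assumed storage precisions, for illustration only.
for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weights_gib(EXPERT_PARAMS, bits):.0f} GiB")
```

Even at an assumed 4 bits per parameter that is on the order of 100 GiB, which is more RAM than most desktop machines have.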