Hacker News
by stingraycharles, 84 days ago
OK, I am by no means an expert on this, and I'll gladly stand corrected. But as I understand it, to estimate how much memory is actively required, it's more accurate to go by the ~82B number, right?
1 comment
by zozbot234, 84 days ago
The ~82B figure is an attempt to compare performance to an equivalent dense model. The number of active parameters is the ~17B figure.
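For illustration only, here is a rough sketch of how a dense-equivalent number like ~82B could be derived. It assumes the figure comes from the common geometric-mean rule of thumb, sqrt(total × active), and it assumes a total parameter count of 400B. Neither the rule nor the total is stated in this thread, and `dense_equivalent` is a hypothetical helper name.

```python
import math


def dense_equivalent(total_params_b: float, active_params_b: float) -> float:
    """Geometric-mean rule of thumb for a mixture-of-experts model.

    This heuristic is an assumption for illustration; the thread does not
    say how the ~82B figure was actually computed.
    """
    return math.sqrt(total_params_b * active_params_b)


ACTIVE_B = 17.0   # active parameters per token, from the thread (~17B)
TOTAL_B = 400.0   # ASSUMED total parameter count, not given in the thread

estimate = dense_equivalent(TOTAL_B, ACTIVE_B)
print(f"dense-equivalent estimate: ~{estimate:.0f}B")
```

Under those assumptions the estimate lands near 82B, which is a statement about expected quality rather than about how many parameters are touched per token.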