Hacker News
swiftcoder 59 days ago
Does this sort of thing scale? Would a 30B or higher model see similar performance/memory gains under this scheme?