by Tepix, 12 days ago
No need to try, really. 1100B weights in 256 GB of RAM: that's less than 1.8 bits per weight if you want a little bit of context.
How is that supposed to give good results?
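
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes "1100B" means 1.1 trillion weights and that all of the RAM is usable for the model. The helper name `bits_per_weight` and the 16 GB set aside for context are made up for illustration, not figures from any real setup.

```python
def bits_per_weight(ram_bytes: float, n_weights: float, reserved_bytes: float = 0.0) -> float:
    """Average bits available per weight after setting aside reserved_bytes."""
    return (ram_bytes - reserved_bytes) * 8 / n_weights


N_WEIGHTS = 1100e9        # assumption: "1100B" = 1.1e12 weights
RAM_GB = 256e9            # 256 GB, decimal
RAM_GIB = 256 * 2**30     # 256 GiB, binary

# All RAM spent on weights, nothing left for context
print(round(bits_per_weight(RAM_GB, N_WEIGHTS), 2))   # 1.86
print(round(bits_per_weight(RAM_GIB, N_WEIGHTS), 2))  # 2.0

# Hypothetical 16 GB reserved for context (KV cache, activations)
print(round(bits_per_weight(RAM_GB, N_WEIGHTS, reserved_bytes=16e9), 2))  # 1.75
```

So the ceiling is about 1.86 bits per weight with everything devoted to weights, and it drops below 1.8 as soon as a few GB go to context.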