by grungegun, 815 days ago
So the transformer architecture doesn't enable any performance gain from quantization? It seems very strange that quantization works so well, since in most of my experiments the internal weights of MLPs look random.
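One rough way to probe that intuition is to quantize weights that really are random and see how much is lost. The sketch below is only an illustration under assumptions: it uses i.i.d. Gaussian values as a stand-in for "random-looking" MLP weights, symmetric per-tensor int8 quantization, and hypothetical helper names (`quantize_int8`, `dequantize`, `relative_weight_error`). It is not taken from any particular model or library.

```python
import random


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: returns (integer codes, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range used here.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def relative_weight_error(n=4096, seed=0):
    """Relative L2 error ||w - w_hat|| / ||w|| for random Gaussian weights."""
    rng = random.Random(seed)
    # Assumption: i.i.d. standard normal values stand in for MLP weights.
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    codes, scale = quantize_int8(w)
    w_hat = dequantize(codes, scale)
    num = sum((a - b) ** 2 for a, b in zip(w, w_hat)) ** 0.5
    den = sum(a * a for a in w) ** 0.5
    return num / den


if __name__ == "__main__":
    print(f"relative weight error: {relative_weight_error():.4f}")
```

Under these assumptions the per-weight rounding error is at most half a quantization step, so the relative error stays small even though the weights have no visible structure. That says nothing about how real trained weights or activations behave, where outliers can make the per-tensor scale much coarser.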