by grungegun
815 days ago
Does anyone know if this works on vanilla deep networks? These quantization articles always seem to target LLMs, which leads me to wonder if there's something special about the LLM architecture vs. a vanilla deep architecture.
|
(There are also positional/token embeddings and normalization, but those are a tiny minority of the parameters.)