by fc417fc802 421 days ago
> very simple, dense and highly optimised already

Simple and dense, sure. Highly optimized in a low-level math and hardware sense, but not in a higher-level information-theoretic sense when you consider the model as a whole.

Consider that quantization and compression techniques can achieve on the order of 50% size reduction. That strongly suggests to me that current models aren't structured in a very efficient manner.
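
To make the 50% figure concrete, here is a rough sketch of one way it can arise: storing weights as 8-bit integers instead of 16-bit floats. Everything in it is an assumption for illustration (the random Gaussian "weights", the symmetric per-tensor scheme, and the hypothetical helper names `quantize_int8` and `dequantize`); it is not how any particular model or library does it.

```python
import random
import struct


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Clamp after rounding so floating-point noise cannot push a value out of range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the shared scale."""
    return [q * scale for q in quantized]


# Stand-in for a weight tensor: small Gaussian values (an assumption, not real model data).
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

# Baseline storage: IEEE 754 half precision, 2 bytes per weight.
fp16_bytes = struct.pack(f"<{len(weights)}e", *weights)

# Quantized storage: 1 byte per weight (the single shared scale is not counted here).
quantized, scale = quantize_int8(weights)
int8_bytes = struct.pack(f"<{len(quantized)}b", *quantized)

reduction = 1 - len(int8_bytes) / len(fp16_bytes)
max_error = max(abs(w - r) for w, r in zip(weights, dequantize(quantized, scale)))

print(f"fp16: {len(fp16_bytes)} bytes, int8: {len(int8_bytes)} bytes")
print(f"size reduction: {reduction:.0%}")
print(f"max reconstruction error: {max_error:.6f} (scale {scale:.6f})")
```

The byte count halves by construction, and each weight moves by at most half a quantization step. Whether a real model keeps its accuracy after that kind of rounding is an empirical question this toy doesn't answer.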