by bee_rider
58 days ago
Were there any downsides or difficulties? It would be sort of surprising if an SVD-based opportunity was missed (since it is such a familiar tool). But, your entropy and least-squares ideas are necessary to set that up, so I guess it makes sense that you’d find some new territory here.
On downsides: definitely a few. The biggest one is latency: SVD is fairly heavy, so even though it’s amortized (runs periodically, not per token), it still adds noticeable overhead. It’s also more complex than simple pruning, and I haven’t validated how well this holds up on real downstream tasks yet.
This is very much a research prototype right now, more about exploring a different tradeoff space than something ready for production.
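To make the amortization point concrete, here is a rough sketch of the arithmetic, not the actual implementation. The function name `per_token_latency` and every number in it (20 ms per decode step, 400 ms per SVD pass, one pass every 100 tokens) are hypothetical placeholders, not measurements. It shows why a periodic pass can look cheap on average and still be noticeable: the mean barely moves, but the steps where the pass runs take a large hit.

```python
def per_token_latency(step_ms, svd_ms, interval, n_tokens):
    """Per-token latencies when a heavy pass runs once every `interval` tokens.

    All arguments are hypothetical illustration values, not measured costs.
    """
    latencies = []
    for t in range(1, n_tokens + 1):
        cost = step_ms
        if t % interval == 0:
            # The periodic (amortized) pass lands entirely on this token.
            cost += svd_ms
        latencies.append(cost)
    return latencies


# Assumed numbers, chosen only to illustrate the shape of the tradeoff.
lat = per_token_latency(step_ms=20.0, svd_ms=400.0, interval=100, n_tokens=1000)
mean_ms = sum(lat) / len(lat)   # average cost per token, pass included
worst_ms = max(lat)             # cost of a token that triggers the pass
print(f"mean {mean_ms:.1f} ms/token, worst {worst_ms:.1f} ms")
```

With these made-up values the mean rises from 20 ms to 24 ms, while every 100th token takes 420 ms, which is the kind of stall a user would notice even though the average overhead looks modest.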