by soulofmischief 28 days ago
Each model needs to be a separate copy, or at least have those particular weights be interchangeable, for every single user.

Remember Microsoft Tay.

https://en.wikipedia.org/wiki/Tay_(chatbot)#Initial_release

1 comment

Yes, and since the weights being updated are a small subset of the total, it's manageable. Just as each conversation currently needs its own KV cache, you'd store the fast weights separately per conversation. Both the KV cache and the fast-weight store have to be conversation-specific, so setting aside a bit of extra RAM for "memory" isn't really a new ask, just a different format for an old problem.
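
To make that concrete, here's a rough sketch of what per-conversation storage could look like. Everything in it is hypothetical: `SessionStore`, `ConversationState`, and the toy additive update are made up for illustration and don't correspond to any real serving stack or to a specific fast-weight method. The only point is the layout: one shared copy of the slow weights, plus a KV cache and a small fast-weight buffer keyed by conversation.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationState:
    # Grows by one entry per processed token, like a KV cache.
    kv_cache: list = field(default_factory=list)
    # Fixed-size buffer of per-conversation weight deltas (the "fast weights").
    fast_weights: list = field(default_factory=list)


class SessionStore:
    """Hypothetical store: shared slow weights, per-conversation state."""

    def __init__(self, shared_weights, fast_size):
        self.shared_weights = shared_weights  # one copy for all users, never written here
        self.fast_size = fast_size
        self.sessions = {}

    def get(self, conv_id):
        # Allocate state lazily the first time a conversation is seen.
        if conv_id not in self.sessions:
            self.sessions[conv_id] = ConversationState(
                fast_weights=[0.0] * self.fast_size
            )
        return self.sessions[conv_id]

    def step(self, conv_id, token_vec, lr=0.1):
        state = self.get(conv_id)
        state.kv_cache.append(list(token_vec))
        # Toy stand-in for a fast-weight update rule (an assumption, not a
        # real algorithm): nudge this conversation's deltas only.
        for i, x in enumerate(token_vec[: self.fast_size]):
            state.fast_weights[i] += lr * x
        return state


store = SessionStore(shared_weights=[1.0, 2.0, 3.0, 4.0], fast_size=2)
store.step("alice", [1.0, 2.0])
store.step("alice", [1.0, 0.0])
store.step("bob", [0.0, 5.0])
```

Whatever "alice" does to her fast weights never touches "bob"'s copy or the shared weights, which is the isolation you'd want to avoid a Tay-style situation where one user's input changes the model for everyone.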