by SeanAppleby 846 days ago

I am quite confident that at least some use cases for injecting context at inference time are going to stay for the foreseeable future, regardless of model performance and scaling improvements, because IME those aren't the primary problems the pattern solves for me.

If you are dealing with high-cardinality permission models (even just a large number of users who each own their own data, though the problem compounds if you have overlapping permissions), then tuning a separate set of layers for every permission set is always going to be wasteful. Trusting a model to have some kind of "understanding" of its permissioning seems plausible if you assume some kind of omniscient and perfectly aligned machine, but it is unrealistic in the foreseeable future and definitely not going to cut it for data regs.
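To make the permissioning point concrete, here is a minimal sketch of filtering context by access before it ever reaches the prompt. Everything in it (`Doc`, `allowed_users`, `visible_docs`, `build_prompt`, the sample data) is made up for illustration; a real system would apply the filter in its retrieval layer or database rather than over an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    text: str
    allowed_users: frozenset  # hypothetical ACL: user ids allowed to read this doc


def visible_docs(docs, user_id):
    """Return only the docs this user may read, enforced outside the model."""
    return [d for d in docs if user_id in d.allowed_users]


def build_prompt(question, docs, user_id):
    """Inject permitted context only; the model never sees anything else."""
    context = "\n".join(d.text for d in visible_docs(docs, user_id))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Illustrative data with overlapping permissions.
DOCS = [
    Doc("Alice's Q3 notes", frozenset({"alice"})),
    Doc("Shared roadmap", frozenset({"alice", "bob"})),
    Doc("Bob's salary letter", frozenset({"bob"})),
]

if __name__ == "__main__":
    print(build_prompt("What is on the roadmap?", DOCS, "alice"))
```

The access check here is ordinary code that can be audited, and one set of model weights serves every permission set.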

Also, in the current status quo I don't believe there is a solution on the horizon for continuous, rapid incremental training in prod, so any data sources that change often are also going to be best addressed this way. That will most likely be solved at some point, but it doesn't seem imminent. Regardless, there will likely be some balancing of cost/performance where injecting context from after the watermark at inference time might still make sense, to keep training costs manageable rather than having to iterate training on literally every single interaction.
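As a rough sketch of the watermark idea: anything recorded after the model's last training cutoff gets injected at inference time, newest first. The names (`TRAINING_WATERMARK`, `fresh_context`), the cutoff date, and the `(timestamp, text)` record shape are all assumptions for illustration, not any particular system's API.

```python
from datetime import datetime, timezone

# Hypothetical cutoff: the model has seen nothing after this instant.
TRAINING_WATERMARK = datetime(2023, 1, 1, tzinfo=timezone.utc)


def fresh_context(records, watermark=TRAINING_WATERMARK, limit=3):
    """Pick the newest records created after the training watermark.

    `records` is an iterable of (timestamp, text) pairs.
    """
    newer = [r for r in records if r[0] > watermark]
    newer.sort(key=lambda r: r[0], reverse=True)  # newest first
    return [text for _, text in newer[:limit]]


if __name__ == "__main__":
    records = [
        (datetime(2022, 6, 1, tzinfo=timezone.utc), "old pricing page"),
        (datetime(2023, 3, 1, tzinfo=timezone.utc), "new pricing page"),
        (datetime(2023, 5, 1, tzinfo=timezone.utc), "outage postmortem"),
    ]
    for text in fresh_context(records):
        print(text)
```

The `limit` parameter is where the cost tradeoff shows up: retraining moves the watermark forward, and until then only a bounded slice of recent data rides along in the prompt.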

But yeah, if you're just using it because you have a single collection of context for many users which is too large to fit into the prompt, that seems like it will be subject to the problem you're describing. Although there might still be some cost/performance benefit to keeping the prompt short (for cost) and focused (for performance).