Hacker News
by nl 1117 days ago
Thinking of it as "the training dataset" vs "the context window" is the wrong way of looking at it.

There's a bunch of prior art for adaptation techniques that get new data into a trained model (fine-tuning, RLHF, etc.). There's no real reason to think there won't be more techniques that turn what we now think of as the context window into something that alters the weights in the model and is serialized back to disk.

1 comments

It's a reasonable way to look at it given that's how pretty much all 'deployed' versions of LLMs work?