| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by sebzim4500 1144 days ago
	I haven't tried this myself, but it is my understanding that finetuning does not work well in practice as a way of acquiring new knowledge. There may be a middle ground between these two approaches though. If every query used the same prompt prefix (because you only update the codebase + docs occasionally) then you could put it into the model once and cache the keys and values from the attention heads. I wonder if OpenAI does this with whatever prefix they use for ChatGPT?