Hacker News new | ask | show | jobs
by simonw 608 days ago
I have yet to see a convincing demo of fine-tuning being used to teach an LLM how to answer questions about a set of documentation.

The OpenAI docs have a list of things that fine-tuning CAN help with, and answering questions about documentation is notable absent: https://platform.openai.com/docs/guides/fine-tuning/common-u...

There are two methods that are a better bet for what you're trying to do here.

The first, if your documentation is less than maybe 50 pages of text, is to dump the entire documentation into the prompt each time. This used to be prohibitively expensive but all of the major model providers have prompt caching now which can make this option a lot cheaper.

Google Gemini can go up to 2 million tokens, so you can fit a whole lot of documentation in there.

The other, more likely option, is RAG - Retrieval Augmented Generation. That's the trick where you run searches against your documentation for pages and examples that might help answer the user's question and stuff those into the prompt.

RAG is easy to build an initial demo of and challenging to build a really GOOD implementation, but it's a well trodden path at this point to there are plenty of resources out there to help.

Here are my own notes on RAG so far: https://simonwillison.net/tags/rag/

You can quickly prototype how well these options would work using OpenAI GPTs, Claude Projects and Google's NotebookLM - each of those will let you dump in a bunch of documentation and then ask questions about it. Projects includes all of that source material in every prompt (so has a strict length limit) - GPTs and NotebookLM both implement some form of RAG, though the details are frustratingly undocumented.