
Does that mean you tested on specific questions? Get 1-5 typical queries and test them with a properly configured LlamaIndex setup.

If your documents repeat the same information in several different ways, then you might actually get something out of LoRA on raw documents. But you need a way to measure it, and you have to verify with real tests first that RAG won't work.

To train effectively with LoRA, though, and expect it to pick up most of the information reliably, you need to cover the knowledge and skills with multiple question-answer pairs for each item you expect it to learn. You can then use QA pairs to validate that it learned those things.

But it's a lot of QA pair generation.
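
For the "way to measure it" part, here is a minimal sketch of a QA-pair check in plain Python. It is not tied to any library: `answer_fn` is a stand-in for whatever you are testing (a RAG query function or a LoRA-tuned model), and the QA pairs, `stub_system`, and the crude "expected text appears in the answer" scoring are all made up for illustration. A real setup would probably want a better scorer.

```python
import re
from typing import Callable, Iterable


def normalize(text: str) -> str:
    """Lowercase and keep only alphanumeric tokens, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def evaluate(answer_fn: Callable[[str], str],
             qa_pairs: Iterable[tuple[str, str]]) -> float:
    """Return the fraction of questions whose answer contains the expected text.

    Matching is on whole tokens after normalization, which is a crude
    assumption: it misses paraphrases and rewards lucky keyword hits.
    """
    pairs = list(qa_pairs)
    if not pairs:
        raise ValueError("need at least one QA pair")
    hits = 0
    for question, expected in pairs:
        got = f" {normalize(answer_fn(question))} "
        if f" {normalize(expected)} " in got:
            hits += 1
    return hits / len(pairs)


# Hypothetical QA pairs: several phrasings per fact you expect to be learned.
QA_PAIRS = [
    ("What port does the service listen on?", "8080"),
    ("Which port is the service bound to?", "8080"),
    ("Who owns the billing module?", "Dana"),
]


def stub_system(question: str) -> str:
    """Placeholder system; swap in your RAG query or fine-tuned model call."""
    if "port" in question.lower():
        return "It listens on port 8080."
    return "I don't know."


if __name__ == "__main__":
    print(f"accuracy: {evaluate(stub_system, QA_PAIRS):.2f}")
```

Running the same `QA_PAIRS` through both the RAG setup and the LoRA-tuned model gives you two numbers to compare instead of a gut feeling.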