Having done some of this myself, I’m curious about your results with fine-tuning vs. embeddings. I’ve found the latter much more performant, but perhaps I’m thinking about fine-tuning wrong.
I used fine-tuning to approximate my style. That matters most for my logging style, since I tend to break entries down into sections and write stream-of-thought; there is an example of what I mean here: https://www.github.com/justinlloyd/retro-chores. In my logging and journals I'll crank out anywhere between a couple of hundred and a few thousand words per day. I used embeddings for adding new knowledge.
I also did a little work on letting it scan through my notebooks via keyword search (they are in OneNote, which you can access and search via a Python API), so it can point directly to something I've written in the past and not just rely on model weights.
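Roughly, the keyword-lookup idea looks like the sketch below. It is a minimal illustration and not my actual setup: `Note`, `keyword_search`, and `build_context` are hypothetical names, the notes are assumed to be already exported to plain text, and no OneNote API call is shown. The point is only that each hit carries its notebook and page, so the model can cite the source instead of guessing from its weights.

```python
from dataclasses import dataclass


@dataclass
class Note:
    notebook: str  # hypothetical: name of the notebook the page lives in
    page: str      # hypothetical: page title or date
    text: str      # plain-text body, assumed already exported


def keyword_search(notes, query, max_hits=5):
    """Rank notes by how often the query terms occur (case-insensitive)."""
    terms = query.lower().split()
    scored = []
    for note in notes:
        haystack = note.text.lower()
        # Simple substring counting; no stemming or tokenization.
        score = sum(haystack.count(term) for term in terms)
        if score > 0:
            scored.append((score, note))
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in scored[:max_hits]]


def build_context(hits):
    """Format hits so each line names the notebook and page it came from."""
    return "\n".join(f"[{n.notebook} / {n.page}] {n.text}" for n in hits)
```

The string returned by `build_context` would then be pasted into the prompt ahead of the question, which keeps the answer tied to a specific page I can go back and check.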