Retrieval-Augmented Generation for Large Language Models: A Survey
https://arxiv.org/abs/2312.10997
The images in this post also give a good high-level overview:
https://twitter.com/dotey/status/1738400607336120573/photo/2
From the various posts I have seen, people claim that phi-2 is a good model to start from.
If you just want to do embeddings, there are various tutorials on using pgvector for that.
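
In case it helps, here is a rough sketch of what the pgvector route boils down to. The SQL strings use pgvector's `vector` column type and its `<=>` cosine-distance operator; the table name `docs`, the column names, and the 3-dimensional toy vectors are my own assumptions for illustration (real embeddings have hundreds of dimensions). The Python function is only an in-memory stand-in that ranks rows the way that `ORDER BY` would, so you can run it without a database.

```python
import math

# SQL you would run against Postgres with the pgvector extension installed.
# "docs", "body" and "embedding" are hypothetical names for this sketch.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE docs (id bigserial PRIMARY KEY, body text, embedding vector(3));
"""
# <=> is pgvector's cosine-distance operator (smaller means more similar).
QUERY_SQL = "SELECT id, body FROM docs ORDER BY embedding <=> '[1,0,0]' LIMIT 2;"


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(rows, query, k):
    """In-memory stand-in for: ORDER BY embedding <=> query LIMIT k."""
    ranked = sorted(rows, key=lambda row: cosine_distance(row[1], query))
    return [row[0] for row in ranked[:k]]


# Toy (id, embedding) rows; in practice the vectors come from an embedding model.
docs = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]

print(nearest(docs, [1.0, 0.0, 0.0], k=2))
```

The retrieved rows are what you would then paste into the model's prompt as context, which is the "retrieval" half of RAG.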