by hobofan 883 days ago
You can "use RAG" with Ollama, in the sense that you can put RAG chunks into a completion prompt.
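
To make "put RAG chunks into a completion prompt" concrete, here is a minimal sketch. The function names, the prompt wording, and the model name are all made up for illustration; the payload shape (`model`, `prompt`, `stream`) is what I'd assume you'd send to Ollama's completion endpoint, so check it against the API docs before relying on it.

```python
def build_rag_prompt(question, chunks):
    """Place the retrieved chunks ahead of the question in one prompt string."""
    # Number the chunks so the model (and the reader) can tell them apart.
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def build_generate_payload(model, question, chunks):
    """Assumed JSON body for a completion request; field names are an assumption."""
    return {
        "model": model,  # placeholder model name, supplied by the caller
        "prompt": build_rag_prompt(question, chunks),
        "stream": False,
    }


payload = build_generate_payload(
    "llama2",  # hypothetical model name
    "What port does the server listen on?",
    ["The server listens on port 8080.", "Logs are written to /var/log/app."],
)
```

Retrieval itself (deciding which chunks go in) is outside this snippet; the point is only that the "RAG" part is plain string assembly as far as the completion call is concerned.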

To index documents for RAG, Ollama also offers an embedding endpoint where you can use LLMs to generate embeddings; however, AFAIK that is very inefficient. You'd usually want to use a much smaller, dedicated embedding model like JINA v2[0], which is currently not supported by Ollama[1].
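
Whichever model produces the embeddings, the lookup side of the index is the same: embed the query, then rank the stored chunks by similarity. A rough sketch, assuming the vectors are already computed and held in memory as plain lists (the helper names and the toy 2-d vectors are made up; a real index would use a vector store or at least NumPy):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=3):
    """Return the texts of the k chunks most similar to the query.

    `indexed` is a list of (text, embedding) pairs built ahead of time.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy 2-d "embeddings", purely illustrative.
index = [("chunk a", [1.0, 0.0]), ("chunk b", [0.0, 1.0]), ("chunk c", [1.0, 1.0])]
best = top_k([1.0, 0.0], index, k=2)
```

The chunks returned here are what you would then paste into the completion prompt.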

[0]: https://huggingface.co/jinaai/jina-embeddings-v2-base-en

[1]: https://github.com/ollama/ollama/issues/327