LangChain supports local LLMs like Llama 2 with Ollama (https://github.com/jmorganca/ollama) as of this morning, in both their Python and Javascript versions:
This can be a great option if you'd like to keep your data local versus submitting it to a cloud LLM, with the added benefit of saving costs if you're submitting many questions in a row (e.g. in batches)
Thanks. How would this differ from running Llama2 through the Huggingface-Langchain integration? I haven't tried it but it looked like the way to go until you shared this.