|
|
|
|
|
by d4rkp4ttern
1171 days ago
|
|
Thanks for clarifying. The support for local LLMs seems very interesting — would a haystack agent call out to a separately “running” self-hosted LLM via an API (REST, etc) or would it need to actually load up the model and directly query it (e.g model.generate(<prompt>) ) ? Also it seems like the functionality of haystack subsumes those of langchain and llama-index (fka GPT-index) ? |
|
So back to your question: We will enable both ways in Haystack: 1) Loading a local model directly via Haystack AND 2) quering self-hosted models via REST (e.g. Huggingface running on AWS SageMaker). Our philosophy here: The model provider should be independent from your application logic and easy to switch.
In the current version, we support for local models only option 1. This works for many of the provided models provided by HuggingFace, e.g. flan-t5. We are already working on adding support for more open-source models (e.g. alpaca) as models like Flan-T5 don't perform great when used in Agents. The support for sagemaker endpoints is also on our list. Any options you'd like to see here?