|
|
|
|
|
by SOLAR_FIELDS
1042 days ago
|
|
So I tried getting Longchat running (a 32k context llama 2 7b model released a few days ago) with FastChat and I was able to successfully get it running. It seems what I was trying to use it for (Langchain SQL agent) it is not good enough out of the box. Part of this is that I think Langchain is kind of biased towards OpenAi’s models and perhaps Llamaindex would perform better. However Llamaindex uses a newer version of sqlalchemy that a bunch of data warehouse clients don’t support yet. Unfortunately with all of the hype it seems that unless you have a REALLY beefy machine the better 70B model feels out of reach for most to run locally leaving the 7B and 13B as the only viable options outside of some quantization trickery. Or am I wrong in that? I want to focus more on larger context windows since it seems like RAG has a lot of promise so it seems like the 7B with giant context window is the best path to explore rather than focusing on getting the 70B to work locally |
|