

by SOLAR_FIELDS 1089 days ago

So I tried getting LongChat running (a 32k-context Llama 2 7B model released a few days ago) with FastChat, and I was able to get it running. It seems it's not good enough out of the box for what I was trying to use it for (the LangChain SQL agent). Part of this, I think, is that LangChain is kind of biased towards OpenAI's models, and perhaps LlamaIndex would perform better. However, LlamaIndex uses a newer version of SQLAlchemy that a bunch of data warehouse clients don't support yet.

Unfortunately, with all of the hype, it seems that unless you have a REALLY beefy machine, the better 70B model is out of reach for most people to run locally, leaving the 7B and 13B as the only viable options outside of some quantization trickery. Or am I wrong about that?
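For a rough sense of scale, here is a back-of-the-envelope sketch of the memory needed just to hold the weights at different precisions. It assumes memory is simply parameter count times bits per parameter, and it ignores the KV cache, activations, and any quantization overhead, so real usage will be higher. The function names are made up for illustration.

```python
# Rough weight-only memory estimate. Assumption: memory = params * bits / 8.
# Ignores KV cache, activations, and per-block quantization overhead.

MODEL_SIZES = {"7B": 7e9, "13B": 13e9, "70B": 70e9}  # nominal parameter counts


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def weight_memory_table(bit_widths=(16, 8, 4)):
    """Map each model name to {bit width: approximate weight memory in GB}."""
    return {
        name: {bits: weight_memory_gb(n, bits) for bits in bit_widths}
        for name, n in MODEL_SIZES.items()
    }


if __name__ == "__main__":
    for name, row in weight_memory_table().items():
        cells = ", ".join(f"{bits}-bit: {gb:.1f} GB" for bits, gb in row.items())
        print(f"{name}: {cells}")
```

By this estimate the 70B weights alone are about 140 GB at 16-bit and about 35 GB at 4-bit, while the 7B comes to about 14 GB at 16-bit.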

I want to focus more on larger context windows, since RAG seems to have a lot of promise. So the 7B with a giant context window looks like the best path to explore, rather than focusing on getting the 70B to work locally.

2 comments

cube2222 1089 days ago

In the Llama 2 paper's benchmarks they did mention that Llama 2 is much worse at any kind of code generation than the OpenAI models; they were optimizing for conversational / natural language use cases.


SOLAR_FIELDS 1089 days ago

Interesting, what other openly licensed models are better at codegen? Or perhaps there is a version of Llama 2 already fine-tuned for codegen? There is StarCoder, but I also haven't had great results with that one in my brief experiments.


lhl 1089 days ago

WizardCoder-15B (an Evol-Instruct StarCoder fine-tune) is probably the best-performing open model atm: https://github.com/nlpxucan/WizardLM/tree/main/WizardCoder


spmurrayzzz 1089 days ago

I haven't tested the newest implementations of every large-context-window model, so I'm not sure how prevalent this issue still is, but generally speaking, performance across the context window tends to be U-shaped. In other words, the model uses what's at the beginning and end of the context but seems to forget/ignore everything in the middle. So YMMV if you're trying to implement RAG-esque methods with them.

More reading on that problem if you're curious: https://arxiv.org/pdf/2307.03172.pdf
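One mitigation sometimes suggested for this is to reorder retrieved chunks so the most relevant ones sit at the edges of the prompt and the least relevant land in the middle. The sketch below is only an illustration of that idea, not something taken from the paper. It assumes the chunks arrive already sorted most-relevant-first, and the function name is hypothetical.

```python
# Sketch: push the most relevant chunks to the start and end of the context.
# Assumption: the input list is already sorted by relevance, best first.

def reorder_for_edges(chunks_by_relevance):
    """Return a new list with top-ranked chunks at the edges, weakest in the middle."""
    front, back = [], []
    for rank, chunk in enumerate(chunks_by_relevance):
        # Alternate: even ranks fill from the front, odd ranks from the back.
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the 2nd most relevant chunk ends up last.
    return front + back[::-1]


if __name__ == "__main__":
    ranked = ["chunk_1", "chunk_2", "chunk_3", "chunk_4", "chunk_5"]
    print(reorder_for_edges(ranked))
```

With five ranked chunks, the best one ends up first, the second best last, and the lowest ranked in the middle. Whether this actually helps depends on the model, so it is worth measuring on your own retrieval setup.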
