by valine
1115 days ago
We need something like this on Linux, maybe powered by Vicuna. I’m not sure if the current batch of LLaMA variants is coherent enough to work as a digital assistant, but my gut feeling is that a little fine-tuning on tool use might be all that’s needed.
There is also the performance issue. Right now the energy and memory usage of LLaMA implementations is very high, and it takes some time to load the model into RAM and/or VRAM. It seems Microsoft is getting around this with cloud inference and eating the hosting cost (for now).
> little fine-tuning on tool use might be all that’s needed.
Maybe I am interpreting this wrong, but LoRA fine-tuning is extremely resource-intensive right now. There are practical alternatives, though, like the embedding databases people are just now setting up.