by brucethemoose2
858 days ago
> the more technical users will leverage llama.cpp to run whatever models they are interested in.

Llama.cpp is much slower, and does not have built-in RAG. TRT-LLM is a finicky, deployment-grade framework, and TBH having it packaged into a one-click install with LlamaIndex is very cool. The RAG in particular is beyond what most local LLM UIs do out of the box.