|
|
|
|
|
by drillsteps5
481 days ago
|
|
By not building tooling in Python. llamacpp is a high performative open source solution capable of inference of large number of published LL models both in CPU and in GPU. It's written in C++. It's easy to download and build in Windows or Linux. It can be used as a command line tool, linked and used as a library from a variety of languages, including Python, or communicated with through a simple REST service which is also part of the same repo. It even has a simple Web frontend (built with React I believe) which allows you to use it for simple conversations (no bells and whistles). And yet the author is using Ollama which itself is a wrapper around llamacpp (as most of them are) written in Python. We're creating the problems that need soling. |
|