Y
Hacker News
new
|
ask
|
show
|
jobs
by
talldayo
791 days ago
Most inference servers are OpenAI-compatibile. Even the "official" llama-cpp server should work fine:
https://github.com/ggerganov/llama.cpp/blob/master/examples/...