Hacker News new | ask | show | jobs
by talldayo 791 days ago
Most inference servers are OpenAI-compatibile. Even the "official" llama-cpp server should work fine: https://github.com/ggerganov/llama.cpp/blob/master/examples/...