Hacker News
by EGreg 412 days ago
I basically want to interface with llama.cpp via an API from Node.js.

What are some of the best coding models that run locally today? Do they have prompt caching support?