Hacker News
by ankitmathur 1157 days ago
Hey there! I'm one of the folks working on Dolly. Dolly-V2 is based on the GPT-NeoX architecture. llama.cpp is a really cool library that was built to optimize execution of Facebook's Llama architecture on CPUs, and as such, from what I understand, it doesn't support this other architecture at this time. Llama also features most of the tricks used in GPT-NeoX (and probably more), so I can't imagine it's a super heavy lift to add support for NeoX and GPT-J in the library.

We couldn't use Llama because we wanted a model that could be used commercially, and the Llama weights aren't available for that kind of use.