Hacker News new | ask | show | jobs
by Eisenstein 58 days ago
If the card supports vulkan and the model has gguf weights. llamacpp has excellent vulkan support that is being actively developed and is not that far behind CUDA where speed is concerned.

* https://github.com/ggml-org/llama.cpp/releases