by brucethemoose2
1055 days ago
Prompt ingestion is too slow on the Oracle VMs. Also, it's really tricky to even build llama.cpp with a BLAS library to make prompt ingestion less slow. The Oracle Linux OpenBLAS build isn't detected out of the box, and it doesn't perform well compared to x86 for some reason. LLVM/GCC have some kind of issue identifying the Ampere ARM architecture (-march=native doesn't really work), so maybe this could be improved with the right compiler flags?
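
On the compiler flag point, here is a rough sketch of one way to pick an explicit `-mcpu` value instead of relying on `-march=native`. It assumes the VM exposes the usual arm64 `/proc/cpuinfo` fields and that the Ampere cores report as an Arm-designed Neoverse part (implementer `0x41`). The function name `suggest_mcpu` and the small lookup table are made up for illustration and only cover a few cores. I haven't measured whether this actually speeds up prompt ingestion.

```python
import re

# Part numbers for a few Arm-designed server cores (implementer 0x41).
# Illustrative, not exhaustive; the values are GCC/Clang -mcpu names.
ARM_PARTS = {
    0xD0C: "neoverse-n1",
    0xD40: "neoverse-v1",
    0xD49: "neoverse-n2",
}


def suggest_mcpu(cpuinfo_text):
    """Guess a -mcpu value from the text of /proc/cpuinfo, or return None."""
    impl = re.search(r"^CPU implementer\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.M)
    part = re.search(r"^CPU part\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.M)
    if not impl or not part:
        return None  # not an arm64-style cpuinfo
    if int(impl.group(1), 16) != 0x41:
        return None  # not an Arm-designed core; this table does not apply
    return ARM_PARTS.get(int(part.group(1), 16))
```

Feeding it the contents of `/proc/cpuinfo` on the VM gives either a core name to try as `-mcpu=<name>` in `CFLAGS`/`CXXFLAGS` when building, or `None` if the core isn't in the table.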