Hacker News
by
xrd
1115 days ago
Looks like there are no quantized options with llama.cpp?
https://github.com/ggerganov/llama.cpp/issues/1602
dvilasuero
1115 days ago
We're very much looking forward to seeing Falcon-40B support on llama.cpp. For production use cases, this is also highly relevant:
https://huggingface.co/blog/sagemaker-huggingface-llm