HN Mirror



	by stuaxo 40 days ago
	Nice. What did you use to do this: something standard like llama.cpp, something else like vLLM, or your own contraption?

1 comment

pianopatrick 40 days ago

llama.cpp

It has now spit out about 40 tokens after maybe 18 hours, and it still has not finished the "thinking" stage of responding to the prompt. I'll let it keep running to see what happens.
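
For a sense of scale, here is a quick back-of-the-envelope calculation in Python. It assumes the rough figures above (about 40 tokens, maybe 18 hours) are close enough to be useful; both are estimates, so treat the result as an order of magnitude and not a measurement.

```python
# Approximate figures from the comment above; neither is an exact measurement.
tokens = 40
hours = 18

seconds = hours * 3600
tokens_per_second = tokens / seconds
minutes_per_token = (hours * 60) / tokens

print(f"{tokens_per_second:.5f} tokens/s")
print(f"{minutes_per_token:.0f} minutes per token")
```

That works out to roughly one token every 27 minutes.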
