| HN Mirror

	by thriw63748 876 days ago
	Why not normal RAM? A Ryzen 5600 with 128GB of DDR4 is perfectly fine for running Mixtral at 8-bit, and costs less than $1000. GPUs are only needed if you cannot wait 5 minutes for an answer, or for training.

7 comments

snowfield 876 days ago

Or if you want multiple sessions at the same time. Or if you want to do anything else with your machine while it's running.

But realistically, 5 minutes is too long. It should be conversational, and for that you need at least 5 tokens per second. Which your Ryzen just can't do.

MPSimmons 876 days ago

> It should be conversational, and for that you need at least 5 tokens per second.

To be fair, a lot of people are using this for non-interactive work, like batching document analysis or offline processing of user generated content.

Diti 876 days ago

This particular thread we are commenting on is about Dolphin Mixtral, which is mostly used for offline code completion (à la Microsoft GitHub Copilot). You don’t want to have to wait 5 minutes at every keystroke to get code suggestions.

Gracana 876 days ago

In my experience, it takes some experimentation to figure out a good prompt. I don’t think I would have gotten very far if I had to wait that long for each result.

irusensei 876 days ago

Why not both? Llama.cpp allows layering GGUF models between GPU and CPU memory.
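
A rough sketch of how that split could be sized, assuming every layer takes the same amount of memory. The helper `split_layers`, the 1.5 GB reserve, and the example numbers (32 layers, a 48 GB file, a 24 GB card) are illustrative assumptions, not anything from llama.cpp itself. The resulting GPU layer count is what you would pass to llama.cpp's `-ngl` / `--n-gpu-layers` option.

```python
# Hypothetical helper: estimate how many layers fit in a VRAM budget.
# Assumes all layers are the same size, which is only approximately true.

def split_layers(n_layers: int, model_gb: float, vram_gb: float,
                 reserve_gb: float = 1.5) -> tuple[int, int]:
    """Return (gpu_layers, cpu_layers) for a given VRAM budget."""
    per_layer_gb = model_gb / n_layers
    # Keep some VRAM back for the KV cache and scratch buffers (assumed figure).
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    gpu_layers = min(n_layers, int(usable_gb // per_layer_gb))
    return gpu_layers, n_layers - gpu_layers

# Assumed example: 32 layers, ~48 GB of 8-bit weights, one 24 GB GPU.
gpu, cpu = split_layers(n_layers=32, model_gb=48.0, vram_gb=24.0)
print(f"offload {gpu} layers to the GPU, keep {cpu} in system RAM")
```

Whatever stays in system RAM still runs at CPU speed, so a partial offload speeds things up roughly in proportion to the share of layers that fit on the GPU.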

dragonwriter 876 days ago

> GPUs are only needed if you can not wait 5 minutes for an answer

Yeah, but that's generally true (or at least, “5 minutes for an answer is very suboptimal”, even if “can’t” isn’t quite true) for interactive use cases, which are... a lot of LLM use cases.

juliangoldsmith 876 days ago

Not sure why you're getting downvoted. It performs decently enough on my Ryzen 3600X with 64GB of RAM. It definitely wouldn't be usable for production or fine-tuning, but it's fine for experimenting.

brucethemoose2 876 days ago

> perfectly fine

Only for very short context and responses.

Beyond that, the performance is painful.

rhdunn 876 days ago

That was what I was referring to with the 32/64 GB systems.

SkyMarshal 876 days ago

What's the bandwidth between the Ryzen and that DDR4?
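
For a back-of-the-envelope figure, the theoretical peak can be worked out from the memory speed. The sketch below assumes dual-channel DDR4-3200 with a 64-bit (8-byte) bus per channel, and assumes Mixtral reads about 12.9B active parameters per token at one byte each. Those inputs are assumptions rather than measurements, and four DIMMs at 128GB may well run slower than 3200 MT/s.

```python
# Back-of-the-envelope: peak DRAM bandwidth and a bandwidth-bound ceiling
# on generation speed. All inputs are assumptions, not measurements.

def peak_bandwidth_gbs(mega_transfers_per_s: float, channels: int = 2,
                       bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers/s x bytes per transfer x channels."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9

def max_tokens_per_s(bandwidth_gbs: float, active_params_billions: float,
                     bytes_per_param: float) -> float:
    """Upper bound if every active weight is read from RAM once per token."""
    gb_read_per_token = active_params_billions * bytes_per_param
    return bandwidth_gbs / gb_read_per_token

bw = peak_bandwidth_gbs(3200)  # assumed DDR4-3200, dual channel
tps = max_tokens_per_s(bw, active_params_billions=12.9, bytes_per_param=1.0)
print(f"{bw:.1f} GB/s peak, at most {tps:.1f} tokens/s")
```

Under these assumptions the ceiling comes out a bit under 4 tokens per second, and real throughput will be lower once prompt processing and compute overhead are counted.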
