by sonzohan 14 days ago
What are your settings and tokens/second? Even with 2 GPUs (MI100, RX 6600 XT 8GB) and 32GB of RAM, it was running at a snail's pace for me.

I didn't try a sched_spread with a 3090 and the MI100, which would provide 56GB of VRAM.

1 comment

It's not speedy. I get 1-3 tokens per second.

The machine:

CPU: 24 × AMD Ryzen 9 9900X 12-Core Processor

RAM: 128GB

GPU: NVIDIA GeForce RTX 4060 Ti 16GB (I typo'd the GPU above)

(This is via Ollama on Ubuntu.)

But 1-3 tokens per second is much faster than a lot of other high-end models I've tried, so I was pretty pleased with it. Obviously, other models run much faster on this hardware.
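
For anyone who wants to compare numbers the same way, here is a rough sketch of how a tokens-per-second figure can be computed from an Ollama API response. It assumes the final response carries `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), as Ollama's API documentation describes. The helper name `tokens_per_second` and the sample values are made up for illustration; they are not measurements from this machine.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed from the final JSON of an Ollama generate/chat call.

    Assumes the response has `eval_count` (tokens generated) and
    `eval_duration` (time spent generating, in nanoseconds).
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    # Convert nanoseconds to seconds before dividing.
    return eval_count / (eval_duration_ns / 1e9)


# Hypothetical sample: 120 tokens generated over 60 seconds.
sample = {"eval_count": 120, "eval_duration": 60_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

Note this covers generation only. Prompt processing is reported separately, so the figure can look better than the wall-clock experience on long prompts.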