The machine:
CPU: 24 × AMD Ryzen 9 9900X 12-Core Processor
RAM: 128 GB
GPU: NVIDIA GeForce RTX 4060 Ti 16 GB (I typo'd the GPU above)
(This is via Ollama on Ubuntu.)
But 1-3 tokens per second is much faster than a lot of other high-end models I've tried, so I was pretty pleased with it. Obviously, other models run much faster on this hardware, though.
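
For anyone who wants to check their own numbers, here is a rough sketch of how a tokens-per-second figure can be worked out. It assumes the response from Ollama's /api/generate endpoint, which reports eval_count (tokens generated) and eval_duration (in nanoseconds). The helper name and the sample values are made up for illustration; they are not a measurement from this machine.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed from an Ollama response's eval_count / eval_duration fields."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    # eval_duration is reported in nanoseconds, so convert to seconds first.
    return eval_count / (eval_duration_ns / 1e9)


# Hypothetical response values (120 tokens generated over 60 seconds).
sample = {"eval_count": 120, "eval_duration": 60_000_000_000}
print(tokens_per_second(sample["eval_count"], sample["eval_duration"]))
```

With those made-up values the result falls inside the 1-3 tokens per second range mentioned above.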