by pants2 3 days ago
With a throughput figure (tok/s) and a token price you can calculate the approximate price per hour of running the model!

$2.61/M tokens * 1,000 tok/s * 3,600 s/hr = $9.40/hr

That would be pretty cheap for an 8-GPU node which would typically run around $45/hr or more. Guess this depends on how many parallel streams it can handle.
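
For anyone who wants to plug in their own numbers, here is a rough Python sketch of the same arithmetic. It assumes the 1,000 tok/s figure is per stream and that parallel streams scale linearly with no throughput loss, neither of which is established above. The function names are made up for illustration.

```python
SECONDS_PER_HOUR = 3600


def revenue_per_hour(price_per_million_tokens, tokens_per_second, streams=1):
    """Dollars billed per hour at a given token price and throughput.

    Assumption: every stream sustains tokens_per_second independently.
    """
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR * streams
    return price_per_million_tokens * tokens_per_hour / 1_000_000


def breakeven_streams(node_cost_per_hour, price_per_million_tokens, tokens_per_second):
    """Parallel streams needed for billed revenue to cover the node's hourly cost."""
    single_stream = revenue_per_hour(price_per_million_tokens, tokens_per_second)
    return node_cost_per_hour / single_stream


if __name__ == "__main__":
    # Figures from the comment: $2.61/M tokens, 1,000 tok/s, ~$45/hr node.
    print(f"${revenue_per_hour(2.61, 1000):.2f}/hr")          # $9.40/hr
    print(f"{breakeven_streams(45, 2.61, 1000):.1f} streams")  # 4.8 streams
```

Under those assumptions a $45/hr node would pay for itself at roughly 5 concurrent streams, so the economics hinge on batch size.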