by throwaway287391 3596 days ago
Yes to all of this. I'm really surprised they didn't compare costs in this blog post. Ignore the DGX-1 row of their table; the really damning comparison is between the 2nd and 4th rows.

With a single 4x GPU server costing around $7k in total (row 4), you get nearly double the performance you get from spending $28k on four Xeon Phi servers (row 2).
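
To put a rough number on that gap, here is a back-of-the-envelope sketch. The dollar figures are the ones quoted above; treating "nearly double" as exactly 2.0x is an assumption made only for illustration, so the result is an upper bound rather than a measured value.

```python
# Costs quoted in the comment above.
gpu_cost_usd = 7_000      # one 4x GPU server (row 4)
phi_cost_usd = 28_000     # four Xeon Phi servers (row 2)

# Assumption: "nearly double" is rounded up to 2.0, with the Xeon Phi
# setup as the 1.0 baseline. The true ratio is somewhat lower.
gpu_relative_perf = 2.0
phi_relative_perf = 1.0


def perf_per_dollar(relative_perf: float, cost_usd: float) -> float:
    """Relative performance obtained per dollar spent."""
    return relative_perf / cost_usd


advantage = (perf_per_dollar(gpu_relative_perf, gpu_cost_usd)
             / perf_per_dollar(phi_relative_perf, phi_cost_usd))

print(f"GPU perf-per-dollar advantage: up to ~{advantage:.1f}x")
```

Under that assumption the GPU server comes out at up to roughly 8x the performance per dollar: 4x cheaper and about 2x faster.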

And that's assuming you've either spent the time and disk space replicating your data on all four of those Xeon Phi servers, or put in what is likely a fairly large amount of engineering effort to ensure that network IO doesn't bottleneck training.