HN Mirror

	by groups 4593 days ago
	I genuinely don't understand what "'at best' 'rather' 'unexpected'" means.

tonydiv 4593 days ago

I'm not sure either! Either way, implementing a clustering algorithm on a GPU was thinking out of the box for us :)

sanskritabelt 4593 days ago

What it means is I'm hesitant to believe an 18000x speedup.

enos_feedler 4593 days ago

This was my reaction as well. I think there should be a whitepaper explaining the comparison as many GPU/FPGA application acceleration companies tend to do. I would say a typical GPU speedup would be in the 10-20x range with 100x being possible for highly "regular" data parallelism. I have no doubt the GPU wallclock time is correct so I am guessing the CPU implementation is just very poor.

tonydiv 4593 days ago

We used parallel Python to implement the first version of the clustering algorithm. Although the CPUs weren't particularly beefy, the implementation was by no means poor. It's worth mentioning that the laptop GPU wasn't powerful either.

sanskritabelt 4593 days ago

In that case, how much of your 18000x speedup is due to the GPU and how much of it is due to re-implementing the algorithm in a compiled language? Python->C/C++/etc is a 100x-1000x speedup right off the bat.

That's not a fair CPU/GPU comparison at all.
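
To put rough numbers on that question, here is a minimal sketch. It assumes the language-change factor and the GPU factor simply multiply to give the total, which is a simplification and not something measured in this thread. The function name `residual_speedup` is made up for illustration.

```python
def residual_speedup(total: float, language_speedup: float) -> float:
    """Speedup left over once the language-change factor is divided out.

    Assumes total = language_speedup * gpu_speedup (a simplification).
    """
    return total / language_speedup


total = 18_000  # the claimed overall speedup

# The 100x and 1000x bounds are the Python -> compiled range quoted above.
for lang in (100, 1_000):
    gpu = residual_speedup(total, lang)
    print(f"Python->compiled at {lang}x leaves about {gpu:.0f}x for the GPU")
```

Under that assumption the GPU's own share would land somewhere between 18x and 180x, which is much closer to the 10-100x range mentioned earlier in the thread.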

tonydiv 4593 days ago

You're right, it isn't a perfectly fair comparison. However, even if we had coded it in C++, the run time difference would still have been significant.

We're just sharing our story at this point. Next, we'll be sharing our whitepaper and data. We'd rather start gathering feedback now than after writing 100k LoC for the compiler.

aktau 4593 days ago

Ah, Python... That explains it.
