by hack_ml 748 days ago
There is dask-cudf, which gets a lot of the way there.

https://docs.rapids.ai/api/dask-cudf/stable/

3 comments

Last time I used it, Dask was a lot worse than simple manual batching.
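
For anyone wondering what "manual batching" could look like, here is a minimal sketch of the idea rather than the exact approach used above: process a fixed number of items at a time so that only one batch of inputs is held in memory. The names `batched` and `process_in_batches` are made up for illustration, and the example sums plain integers instead of touching a GPU.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def process_in_batches(
    items: Iterable[T],
    batch_size: int,
    process_batch: Callable[[List[T]], R],
) -> List[R]:
    """Apply process_batch to one batch at a time and collect the results.

    Only one batch of inputs is held at a time; the per-batch results
    are assumed to be small (for example, a partial aggregate).
    """
    return [process_batch(batch) for batch in batched(items, batch_size)]


# Toy stand-in for real work: partial sums per batch, combined at the end.
partial_sums = process_in_batches(range(10), 4, sum)
total = sum(partial_sums)
```

With cuDF, `items` would presumably be a list of file paths and `process_batch` would load each batch (for example with `cudf.read_parquet`), transform it, and write the result out before the next batch is read.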
This is huge, this was my only gripe with cudf!
I converted 500GB of data using dask_cudf on a GTX 1060 with 6GB of VRAM, and it was faster than a 20-node m3.xlarge cluster.

What you can do on even consumer GPUs is mind-blowing.

How does it perform when it comes to plotting datasets this large? Can I use matplotlib?
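
One possible approach, offered as a sketch and not something benchmarked in this thread: matplotlib draws on the CPU, so the usual trick is to shrink the data first and plot only the reduced result (cuDF objects can be copied to the host with `.to_pandas()`). The reduction below keeps a min/max envelope per bin; it is written in plain Python to show the logic, and `minmax_downsample` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def minmax_downsample(
    values: Sequence[float], n_bins: int
) -> List[Tuple[int, float, float]]:
    """Split values into at most n_bins contiguous bins.

    Returns one (start_index, min, max) tuple per bin, which preserves
    the visual envelope of the series with far fewer points.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be >= 1")
    n = len(values)
    if n == 0:
        return []
    bin_size = -(-n // n_bins)  # ceiling division
    envelope = []
    for start in range(0, n, bin_size):
        chunk = values[start:start + bin_size]
        envelope.append((start, min(chunk), max(chunk)))
    return envelope


series = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
envelope = minmax_downsample(series, 3)
```

The resulting tuples are small enough to hand to matplotlib directly, for example by plotting the min and max columns against the start index. On a GPU, the same per-bin reduction would presumably be done with a groupby before copying anything to the host.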