Hacker News
by jhartmann 4795 days ago
Actually, you forget that performance is critical when you need to train for days at a time. If I used Octave/Matlab/R, my current project might take months to train instead of weeks. All my ML code is high-performance threaded C++. I recommend using a good templated linear algebra library like Eigen; you can do plenty of experimentation in C++. I find that with a few modern libraries and the required experience, a C++ programmer is just as efficient as a Python/R/Matlab programmer, if not more so. It comes down to the skill of the programmer and the proper choice of libraries.
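To make "threaded C++" a bit more concrete, here is a minimal sketch of a dense matrix-vector product with the rows split across `std::thread` workers. It is an illustration only and not the commenter's code: the function name `matvec_threaded`, the row-major layout, and the chunking scheme are all assumptions, and in practice a library such as Eigen or a BLAS implementation would normally do this work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: y = A * x for a row-major dense matrix A (rows x cols).
// Rows are divided into contiguous chunks, one chunk per worker thread.
std::vector<double> matvec_threaded(const std::vector<double>& a,
                                    std::size_t rows, std::size_t cols,
                                    const std::vector<double>& x,
                                    unsigned num_threads) {
    assert(a.size() == rows * cols);
    assert(x.size() == cols);

    std::vector<double> y(rows, 0.0);
    if (rows == 0) return y;

    // Use at least one thread and never more threads than rows.
    num_threads = std::max(1u, num_threads);
    num_threads = static_cast<unsigned>(std::min<std::size_t>(num_threads, rows));

    const std::size_t chunk = (rows + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = std::min(rows, begin + chunk);
        if (begin >= end) break;
        // Each worker writes a disjoint slice of y, so no locking is needed.
        workers.emplace_back([&a, &x, &y, cols, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                double sum = 0.0;
                for (std::size_t j = 0; j < cols; ++j) {
                    sum += a[i * cols + j] * x[j];
                }
                y[i] = sum;
            }
        });
    }
    for (auto& w : workers) w.join();
    return y;
}
```

A call might look like `matvec_threaded(a, rows, cols, x, std::thread::hardware_concurrency())`. Because every thread owns its own range of output rows, the workers never write to the same element, which is what keeps the sketch free of locks.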

True, Matlab, Octave and R are all rubbish for performance. I use Python + NumPy, which delegates to BLAS for the hardcore linear algebra stuff, so I don't normally find C++ gains me all that much. You can also do GPU acceleration pretty easily using Theano (e.g. http://deeplearning.net/software/theano/tutorial/using_gpu.h...)

So I reckon my GPU-accelerated Python still beats a C++ pthreads approach, and is a lot faster to develop in.

Your mileage may vary. From what you said, you probably know what you are doing, and maybe GPU is not applicable. I was really replying to the initial comments from people who said they want to start learning machine learning on a C++ system. Training for days suggests you are doing something hardcore like MCMC/DBN/Gaussian processes; learners should not start there, though...

I'm doing deep belief networks with dropout, and don't have access to GPUs with good double-precision performance. I used to write graphics device drivers, so GPU computing has a special place in my heart, and I definitely agree with you there performance-wise. It is funny, though, that my little laptop is hitting training times similar to some papers where people are using low-end GPUs. It's amazing what you can do when you pay attention to performance.
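For readers unfamiliar with dropout, the following is a minimal C++ sketch of applying a dropout mask to a layer's activations during training. It is not the commenter's implementation: the name `apply_dropout` is hypothetical, and the "inverted" scaling of kept units by 1/(1 - p) is an assumption made here for simplicity (the original formulation instead rescales weights at test time).

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: zero each activation with probability drop_prob and
// scale the survivors by 1 / (1 - drop_prob) so the expected value is unchanged.
// Returns the keep mask so a backward pass could reuse it.
std::vector<bool> apply_dropout(std::vector<double>& activations,
                                double drop_prob, std::mt19937& rng) {
    assert(drop_prob >= 0.0 && drop_prob < 1.0);

    std::bernoulli_distribution keep(1.0 - drop_prob);
    const double scale = 1.0 / (1.0 - drop_prob);

    std::vector<bool> mask(activations.size());
    for (std::size_t i = 0; i < activations.size(); ++i) {
        mask[i] = keep(rng);
        activations[i] = mask[i] ? activations[i] * scale : 0.0;
    }
    return mask;
}
```

The generator is passed in by reference so that a caller can seed it once (for example `std::mt19937 rng(42);`) and get reproducible masks across a training run.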

I suspect my tuned C++ code will work quite well on an Intel MIC, and that is probably where I'm going to go when I have more resources to throw at the problem. I do know that Theano uses Alex's C++ CUDA code under the covers, and I have done lots of reading of Theano's code, looking at implementation details to help develop my own. I'm just not a big fan of Python (or most scripting languages, actually); perhaps I'm just too old school and have written C, C++, C# and Java for too long. If it doesn't smell or feel like C, I feel like Scotty in Star Trek 4 when he was making the transparent aluminum on the Mac.