Hacker News
by Lordarminius 3553 days ago
> if you’re thinking about developing a real, production-ready data science project in Ruby - don’t.

Why is this? Are there inherent limitations in the language that have Ruby taking a backseat to Python in ML and maths/statistical applications? Or is it just due to neglect by the community?

(I started learning Ruby about ten months ago and have only just started to gain some proficiency, so I do not know much about the workings of the ecosystem, or the innards of Ruby for that matter.)

1 comments

I'm not sure if there are inherent language limitations (I'm not that much of an expert), but I do know there's more momentum around Python for ML/data science work, mainly as a result of a few good resources built specifically for it, which has encouraged more libraries and developer support to focus on it.

So perhaps it's less about neglect from the Ruby community and more about proactiveness from the Python one.

I think the key factor here has been NumPy, the scientific library for Python. Academics used Python because of it, and they are the ones who wrote the neural network tooling.

We can probably expect to see implementations in all languages at some point. Floating point errors are not even that big a deal, since we're dealing with statistics anyway.

That being said, neural networks are very resource/computation heavy. I wrote one in Go and cut my execution time in half just by encoding my matrices as flat arrays instead of two-dimensional arrays. If Ruby is to be used to build neural networks, it will need to do the heavy lifting in a native binding, like TensorFlow does with its C++ layer.
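For anyone wondering what the flat-array encoding looks like, here is a minimal sketch in Go. It is not the code from the comment above: the names Nested, Flat, MulNested and MulFlat are made up for illustration, and no timing is claimed here. The idea is that a [][]float64 holds one separately allocated slice per row, while the flat version keeps the whole matrix in one contiguous slice and computes the index as i*cols + j.

```go
package main

import "fmt"

// Nested is a matrix stored as a slice of rows.
// Each row is its own allocation, so rows may be scattered in memory.
type Nested [][]float64

// Flat is a rows x cols matrix stored row-major in one contiguous slice.
// (Hypothetical type, for illustration only.)
type Flat struct {
	Rows, Cols int
	Data       []float64
}

// NewFlat allocates a zeroed rows x cols matrix in a single slice.
func NewFlat(rows, cols int) *Flat {
	return &Flat{Rows: rows, Cols: cols, Data: make([]float64, rows*cols)}
}

// At returns the element at row i, column j.
func (m *Flat) At(i, j int) float64 { return m.Data[i*m.Cols+j] }

// MulNested multiplies two nested matrices the straightforward way.
func MulNested(a, b Nested) Nested {
	out := make(Nested, len(a))
	for i := range a {
		out[i] = make([]float64, len(b[0]))
		for k := range b {
			for j := range b[k] {
				out[i][j] += a[i][k] * b[k][j]
			}
		}
	}
	return out
}

// MulFlat multiplies two flat matrices. The i-k-j loop order walks
// both b and out sequentially in memory.
func MulFlat(a, b *Flat) *Flat {
	if a.Cols != b.Rows {
		panic("dimension mismatch")
	}
	out := NewFlat(a.Rows, b.Cols)
	for i := 0; i < a.Rows; i++ {
		for k := 0; k < a.Cols; k++ {
			aik := a.Data[i*a.Cols+k]
			for j := 0; j < b.Cols; j++ {
				out.Data[i*out.Cols+j] += aik * b.Data[k*b.Cols+j]
			}
		}
	}
	return out
}

func main() {
	a := &Flat{Rows: 2, Cols: 2, Data: []float64{1, 2, 3, 4}}
	b := &Flat{Rows: 2, Cols: 2, Data: []float64{5, 6, 7, 8}}
	fmt.Println(MulFlat(a, b).Data) // [19 22 43 50]

	n := MulNested(Nested{{1, 2}, {3, 4}}, Nested{{5, 6}, {7, 8}})
	fmt.Println(n) // [[19 22] [43 50]]
}
```

Both functions compute the same product. Whether the flat layout actually halves the run time will depend on matrix size and hardware, so benchmark it (for example with Go's testing package) before relying on it.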

I doubt that anybody is thinking of actually running the networks in Ruby, or Python, or even C on the CPU. They're all run on the GPU anyway.