by law 4932 days ago
I use Python for nearly all of my ETL processes that involve text processing. Even in production systems, I'd be hard-pressed to name any significant performance issues. Python facilitates implementing algorithms in a functional style, which I tend to prefer over the imperative style (e.g., Java). With C++11 and Boost, I'm able to translate my Python code to C++ while preserving the functional style, which has immensely simplified prototyping and deploying NLP/ML algorithms while also yielding enormous performance gains. I see Python as an extremely viable alternative to Java.
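For readers wondering what functional-style text ETL in Python might look like, here is a minimal sketch using only the standard library. The function names (`normalize`, `tokenize`, `compose`, `word_counts`) and the sample data are invented for illustration; this is not the commenter's actual code, just one plausible shape for such a pipeline.

```python
from collections import Counter
from functools import reduce
import re

# Hypothetical token pattern: lowercase letters, digits, and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def normalize(line):
    """Trim surrounding whitespace and lowercase the line."""
    return line.strip().lower()


def tokenize(line):
    """Split an already-normalized line into a list of tokens."""
    return TOKEN_RE.findall(line)


def compose(*funcs):
    """Compose functions left to right: compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)


def word_counts(lines):
    """Count tokens across lines using map/filter/reduce, with no mutable loop state."""
    to_tokens = compose(normalize, tokenize)
    # filter(None, ...) drops lines that produced no tokens.
    non_empty = filter(None, map(to_tokens, lines))
    return reduce(lambda total, toks: total + Counter(toks), non_empty, Counter())


# Made-up sample input for illustration.
sample = ["Python is fine for ETL.", "", "ETL in Python: fine!"]
counts = word_counts(sample)
```

Each stage is a small pure function, so a pipeline like this should map fairly directly onto C++11 lambdas and standard algorithms if a port is ever needed.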

You got me a bit confused here. If I understand correctly what you're saying, you're still using Python for prototyping the core algorithms and C++ in actual production systems. I'm not saying Python is not good for production systems in general; I'm wondering whether it is good enough for real-world implementations of machine learning algorithms.

Also, I believe most people would consider Java an alternative to C++, hence all the Java-based Apache projects, such as Mahout, Solr, etc.

I use Python in production for text pre-processing and other ETL-related processes, which are part of a larger reinforcement learning approach. Additionally, I use Python to prototype the core ML algorithms, which I sometimes re-implement in C++. However, for many of those algorithms, NumPy performs about as well as calling BLAS from C++, since its dense linear algebra routines typically delegate to BLAS anyway.
I get it now, thanks. It's very interesting, maybe I will give Python for ML a chance!
Have you tried Scala? It might let you write in a functional style and then not have to translate it to something else. Please don't interpret this as a troll; I'm genuinely curious what the pros/cons of these approaches are.
I've never tried Scala, but I suppose I should give it a chance. I'm a fan of Lisp, and the two languages seem to have a lot in common. Scala's expressive type system seems like it has the potential to be both a blessing and a curse, but admittedly, I know next to nothing about the language.
I may be missing something here, but if you're a fan of Lisp and want easy interaction with libraries on the JVM, please tell me you've heard of Clojure. It's a modern Lisp that strongly favors functional programming and has great concurrency support. Plus, there is already a data analysis / statistical platform built on top of it called Incanter.