Hacker News new | ask | show | jobs
by pcmoritz 3164 days ago
For Ray, the main use case at the moment is parallel/distributing machine learning algorithms, people have been using it for parallelizing MC(MC) style applications, doing hyperparameter search, (pre-)process data, we are using it for reinforcement learning (and have a library for that, see http://ray.readthedocs.io/en/latest/rllib.html)

More broadly it is useful for many parallel/distributed Python applications where low latency (~1ms) and high throughput of tasks are a requirement.

1 comments

Python is almost synonymous with shitty performance in my mind; am I just wrong about typical python performance, or are you doing something special to make this of a non-issue (e.g. the way numpy essentially shoves all the work into C), or is the flexibility Ray affords worth the performance penalty for your users?
There are two considerations here.

(1) Python single threaded performance: Here, most of the libraries we are using are implemented in C++ (like numpy, TensorFlow, Cython to speed up the code, etc.). Ray is orthogonal to that.

(2) Python parallel performance: Here Python is mostly problematic because of its lack of support for threading (the GIL is one problem here); we handle this problem by using multiple processes and shared memory throughout. Efficient serialization makes this feasible.

The core of Ray is implemented in C++, so performance is not an issue for that; also all of the serialization is implemented in C++.