by extr
1092 days ago
Very interesting. I just finished implementing a baby version of this kind of system at work. Similar to this project, our basic use case was letting researchers quickly and easily execute their arbitrary R&D code on cloud resources. It's difficult to know in advance what they might be doing, and we wanted to avoid a situation where they are pushing a Docker container or submitting a file every time they change something. So we made it possible for them to "just" ship a single class/function without leaving their local interactive environment. I see from looking at the source here that run.house is using the same approach of cloudpickling the function. That works, but one struggle we're having is that it's quite brittle. It's all gravy assuming everyone is operating in perfectly fresh environments that mirror the cluster, but this is rarely the case. Even subtle differences between the local execution environment and the server can produce segfaults when the code runs on the server. Very hard to debug. The code here looks a lot more mature, so I'm assuming it's more robust than what we have. But I'd be curious whether the developers have run into similar challenges.
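One cheap guard against this kind of mismatch is to compare an environment fingerprint from the client with one from the server before unpickling anything. The following is only a minimal sketch using the standard library. `env_fingerprint` and `find_mismatches` are hypothetical helper names, not part of cloudpickle or run.house, and which packages are worth checking is an assumption you would have to make for your own stack.

```python
import sys
from importlib import metadata


def env_fingerprint(packages):
    """Collect the interpreter version and the installed versions of `packages`."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "packages": versions,
    }


def find_mismatches(client, server):
    """Return (name, client_version, server_version) for every difference."""
    problems = []
    if client["python"] != server["python"]:
        problems.append(("python", client["python"], server["python"]))
    for name, client_version in client["packages"].items():
        server_version = server["packages"].get(name)
        if client_version != server_version:
            problems.append((name, client_version, server_version))
    return problems
```

The client would send its fingerprint along with the pickled payload, and the server would refuse to unpickle (or at least log loudly) when `find_mismatches` returns a non-empty list. It will not catch every cause of a segfault, but it turns some of the hard-to-debug cases into a readable error.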
We totally agree, and in fact we are not cloudpickling the function, precisely because of the package minor-version issues. We sync the code over to the destination environment and the server imports it fresh, which is much more robust. The one piece of code that does cloudpickle functions is a trap door for certain weird situations, but frankly we haven't had to use it in months.
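To make the "sync the code and import it fresh" idea concrete, here is a rough sketch using only the standard library. It is not the project's actual implementation: `sync_source` and `import_and_call` are hypothetical names, and a local temporary directory stands in for the remote environment (a real system would copy files over the network, e.g. with rsync).

```python
import importlib
import sys
import tempfile
from pathlib import Path


def sync_source(module_name, source, dest_dir):
    """Stand-in for syncing code to the server: write the module into dest_dir."""
    path = Path(dest_dir) / f"{module_name}.py"
    path.write_text(source)
    return path


def import_and_call(dest_dir, module_name, function_name, *args, **kwargs):
    """Server side: import the synced module fresh and call the function by name."""
    if dest_dir not in sys.path:
        sys.path.insert(0, dest_dir)
    importlib.invalidate_caches()       # make sure the new file is discovered
    sys.modules.pop(module_name, None)  # drop any stale copy of the module
    module = importlib.import_module(module_name)
    return getattr(module, function_name)(*args, **kwargs)


# Usage: only source text and names cross the wire, never a pickled function.
source = "def add(a, b):\n    return a + b\n"
with tempfile.TemporaryDirectory() as workdir:
    sync_source("research_job", source, workdir)
    result = import_and_call(workdir, "research_job", "add", 2, 3)
```

Because the server compiles the source with its own interpreter and resolves imports against its own installed packages, nothing serialized on the client (bytecode, closures, references to client-side package internals) has to match the server exactly. The arguments and return value still have to be transferred somehow, which this sketch leaves out.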