by nemothekid
2620 days ago
Same. I recently moved an ML pipeline from PySpark to pure Python because of the debuggability issue. The data science team, which managed the project, was expert in Python but relatively weak in Scala/Java. There were many cases where an improper data type would blow up pickling on the Java side and return absolutely cryptic errors. It was also difficult to do any sort of integration testing or profiling on the code; the move off Spark originally started as a way to do integration testing and profiling.