| HN Mirror



	by nemothekid 2620 days ago
	Same, I recently moved an ML pipeline from PySpark to pure Python because of the debug-ability issue. The data science team, who managed the project, were experts in Python but relatively weak in Scala/Java. There were many issues where an improper data type would blow up pickling on the Java side and return absolutely cryptic errors. It was also difficult to do any sort of integration testing or profiling on the code; the move off of Spark originally started as a way to do integration testing and profiling.

1 comment

spydum 2619 days ago

Indeed, the real sad part is that you can’t lead teams there early (premature optimization). Everybody seems to make the same rough transition on their own.