by maslam 2730 days ago
Karthik, I'm no Spark expert, but almost all the advice I've read says to avoid UDFs if at all possible. Examples below:

- https://medium.com/teads-engineering/spark-performance-tunin...
- https://www.inovex.de/blog/efficient-udafs-with-pyspark/

1 comment

Thank you for those pointers.

There are definitely some differences between the kind of UDFs that Spark supports and the kind that Froid handles. For one, Spark UDFs cannot invoke a Spark SQL query in their definition, AFAIK, whereas T-SQL functions can. Still, some of the techniques might be applicable. Definitely worth digging further!