by Epa095 1270 days ago
There are also pandas UDFs, which use Arrow as the exchange format. I assume the data still has to be copied (?), but it makes the (de)serialization fast and allows for vectorized operations.
https://spark.apache.org/docs/3.0.0/sql-pyspark-pandas-with-...
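For illustration, a minimal sketch of what such a pandas UDF might look like, assuming Spark 3.0+ with pyarrow installed. The names `fahrenheit_to_celsius`, `demo_with_spark`, and the `temp_f` column are made up for this example. The function receives a whole batch as a pandas Series rather than one row at a time, which is where the vectorization comes from.

```python
import pandas as pd


def fahrenheit_to_celsius(temps: pd.Series) -> pd.Series:
    # Operates on a whole batch (a pandas Series) at once, not row by row.
    return (temps - 32.0) * 5.0 / 9.0


def demo_with_spark() -> None:
    # Hypothetical demo; imports live here so the function above
    # stays usable on plain pandas without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("pandas-udf-sketch")
        .getOrCreate()
    )
    df = spark.createDataFrame([(32.0,), (212.0,)], ["temp_f"])

    # Wrap the pandas function as a Series-to-Series pandas UDF.
    # Spark passes column batches to the Python worker in Arrow format.
    to_celsius = pandas_udf(fahrenheit_to_celsius, returnType="double")
    df.withColumn("temp_c", to_celsius(df["temp_f"])).show()

    spark.stop()
```

Calling `demo_with_spark()` needs a working local Spark install; the plain function can be tried on a pandas Series directly.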