by EdwardDiego 4112 days ago

> It turns out that a huge fraction of Spark workloads fall into this model, especially since we support complex types and nested structures.

The first step of all my Spark tasks is either "turn this RDD[String] into an RDD of parsed JSON" or "turn this CSV into case classes".
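To make the CSV half of that concrete, here is a minimal sketch in plain Scala. It rests on assumptions: the `Reading` case class, its fields, and the two-column `sensor,value` layout are hypothetical, and a `Seq[String]` stands in for the `RDD[String]` so the example runs without Spark.

```scala
import scala.util.Try

// Hypothetical record type; the field names are illustrative only.
final case class Reading(sensor: String, value: Double)

object CsvToCaseClass {
  // Parse one "sensor,value" line. Lines with the wrong number of
  // fields, an empty sensor name, or a non-numeric value become None.
  def parseLine(line: String): Option[Reading] =
    line.split(",", -1).map(_.trim) match {
      case Array(sensor, value) if sensor.nonEmpty =>
        Try(value.toDouble).toOption.map(Reading(sensor, _))
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // A plain Seq[String] stands in for the RDD[String] here.
    val lines = Seq("a,1.5", "b,oops", "c,2")
    // flatMap drops the lines that failed to parse.
    val readings = lines.flatMap(parseLine)
    readings.foreach(println)
  }
}
```

Returning `Option` keeps malformed lines from throwing in the middle of a job. The same `parseLine` function could presumably be passed to `flatMap` on a real RDD, though that is not shown or tested here.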

What JSON parser will DataFrames be using? I presume Jackson?