by shazzdeeds 1237 days ago
A few quick tips to help speed up slow Spark tests.

1. Make sure you're sharing one session across the test suite, so that spin-up only has to occur once per suite and not once per test. This has the drawback of not allowing dataframe name reuse across tests, but who cares.

2. If you have any kind of magic number N in the SparkSQL or dataframe calls (e.g. coalesce(N), repartition(N)), parameterize N and set it to 1 for the tests.

3. Make sure the master isn't set to anything higher than 'local[2]', or set it lower depending on your workstation.
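
For what it's worth, here's a rough PySpark sketch of the three tips together. It assumes Python's built-in unittest and a local pyspark install. The names SparkTestCase, num_partitions, dedupe and the SPARK_TEST_PARTITIONS environment variable are made up for illustration; only the SparkSession and DataFrame calls are real Spark API.

```python
import os
import unittest

# Tip 3: keep the local master small (hypothetical constant name).
TEST_MASTER = "local[2]"


def num_partitions(default: int) -> int:
    """Tip 2: read N from the environment instead of hard-coding it.

    SPARK_TEST_PARTITIONS is an assumed variable name, not a Spark setting.
    """
    return int(os.environ.get("SPARK_TEST_PARTITIONS", default))


def dedupe(df, n: int):
    """Example job code: the partition count is an argument, not a magic number."""
    return df.dropDuplicates().coalesce(n)


class SparkTestCase(unittest.TestCase):
    """Tip 1: one session per test class, created in setUpClass."""

    @classmethod
    def setUpClass(cls):
        # Imported here so the helpers above can be used without Spark installed.
        from pyspark.sql import SparkSession

        cls.spark = (
            SparkSession.builder
            .master(TEST_MASTER)
            .appName("unit-tests")
            .getOrCreate()
        )

    @classmethod
    def tearDownClass(cls):
        cls.spark.stop()
```

A test class would subclass SparkTestCase, build its dataframes from self.spark, and call something like dedupe(df, num_partitions(200)) with SPARK_TEST_PARTITIONS=1 exported in the test environment. Since getOrCreate() returns an already running session, several test classes in the same process can end up reusing one session as long as nothing stops it in between.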