Spark has improved our ETL jobs by orders of magnitude, both in performance and in how easily our workforce (mostly Python programmers) can work on them.
Previous tools that improved our workflow: Docker, nginx.
Feature engineering. Transforms about 3.5 billion records into features that feed a variety of models. Previously a Hadoop Streaming job (~40 hours); now takes about 6 hours.
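As an illustration only, a feature-engineering job of this kind might look roughly like the PySpark sketch below. It is not our actual pipeline: the tab-separated record layout (user id, event type, amount), the three features, and all function names are hypothetical, chosen to show how per-record Python logic of the sort used in a streaming mapper can be reused with `aggregateByKey`.

```python
# Minimal sketch of per-key feature aggregation in PySpark.
# Assumption: each input line is "user_id<TAB>event_type<TAB>amount".
# The schema and feature names are hypothetical.

ZERO = (0, 0.0, 0)  # (event count, total amount, purchase count)


def parse_record(line):
    """Turn one raw text line into (key, value) for aggregation."""
    user_id, event_type, amount = line.rstrip("\n").split("\t")
    return user_id, (event_type, float(amount))


def seq_op(acc, value):
    """Fold one record into the running totals within a partition."""
    event_type, amount = value
    count, total, purchases = acc
    return (count + 1, total + amount, purchases + int(event_type == "purchase"))


def comb_op(a, b):
    """Merge partial totals computed on different partitions."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def to_features(acc):
    """Convert the final totals for one key into a feature dict."""
    count, total, purchases = acc
    return {
        "n_events": count,
        "mean_amount": total / count,
        "purchase_rate": purchases / count,
    }


def build_features(lines_rdd):
    """Apply the functions above to an RDD of raw text lines."""
    return (
        lines_rdd.map(parse_record)
        .aggregateByKey(ZERO, seq_op, comb_op)
        .mapValues(to_features)
    )


def run_local_demo():
    """Run the sketch on three in-memory records (requires pyspark)."""
    from pyspark import SparkContext

    sc = SparkContext(appName="feature-engineering-sketch")
    try:
        lines = sc.parallelize(
            ["u1\tview\t0.0", "u1\tpurchase\t20.0", "u2\tview\t0.0"]
        )
        return sorted(build_features(lines).collect())
    finally:
        sc.stop()
```

Calling `run_local_demo()` on a machine with PySpark installed runs the aggregation on a tiny in-memory sample; in a real job the input would come from something like `sc.textFile(...)`. Keeping the per-record logic in plain functions means it can be unit-tested without a cluster.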