Hacker News
by almosthere 494 days ago
Having used Spark for the past 8 years or so, I'd say it's definitely a solid basis for data engineering. I use it most for generating reports, but sometimes we have large projects to get data into different staging databases. I use it a lot with Elasticsearch or Parquet files. Basically, it helps you write large joins and flatten the result into a store that can aggregate over that flattened result more quickly, such as Elasticsearch or a columnar database.
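
As a rough illustration of the join-and-flatten pattern described above, here is a minimal plain-Python sketch. It is not the commenter's actual job: the `customers` and `orders` tables, their fields, and both function names are made up for the example. In Spark the same shape would typically be a DataFrame join whose output is written to Parquet or Elasticsearch; plain dictionaries are used here only so the idea can run without a cluster.

```python
from collections import defaultdict

# Hypothetical normalized inputs; table and field names are illustrative only.
customers = [
    {"customer_id": 1, "name": "Ada", "region": "EU"},
    {"customer_id": 2, "name": "Bo", "region": "US"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 30.0},
    {"order_id": 11, "customer_id": 1, "amount": 12.5},
    {"order_id": 12, "customer_id": 2, "amount": 7.0},
]


def join_and_flatten(orders, customers):
    """Inner-join orders to customers and emit one flat record per order."""
    by_id = {c["customer_id"]: c for c in customers}
    flat = []
    for o in orders:
        c = by_id.get(o["customer_id"])
        if c is None:
            continue  # inner join: drop orders with no matching customer
        flat.append({
            "order_id": o["order_id"],
            "amount": o["amount"],
            "customer_name": c["name"],
            "customer_region": c["region"],
        })
    return flat


def total_by_region(flat_rows):
    """Aggregate over the flattened rows; no join is needed at query time."""
    totals = defaultdict(float)
    for row in flat_rows:
        totals[row["customer_region"]] += row["amount"]
    return dict(totals)


# The expensive join happens once, up front.
flat_rows = join_and_flatten(orders, customers)
# Later aggregations only scan the already-flattened rows.
totals = total_by_region(flat_rows)
```

The point of the sketch is the split of work: the join is paid for once when the flat rows are built, and every later aggregation is a simple scan over them, which is the kind of query a search index or columnar store tends to handle well.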