by dalailambda 2451 days ago
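SQL has definitely become the de facto tool for a lot of data processing. This model of working is generally referred to as ELT (extract, load, transform) as opposed to ETL: the raw data is loaded into the warehouse first, and the transformations are then done there in SQL.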
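
To make the ELT idea concrete, here's a rough sketch rather than anything production-grade. It uses Python's built-in sqlite3 purely as a stand-in for a real warehouse like Snowflake/BigQuery. The table names (raw_orders, customer_revenue), the sample rows, and the run_elt helper are all made up for illustration. The point is only the ordering: load the raw rows untouched, then do the transformation inside the database with SQL.

```python
import sqlite3

# Hypothetical raw rows, roughly as a load tool might deliver them:
# everything is text, nothing has been cleaned or aggregated yet.
RAW_ORDERS = [
    ("o1", "alice", "12.50", "2020-01-01"),
    ("o2", "bob", "7.25", "2020-01-01"),
    ("o3", "alice", "30.00", "2020-01-02"),
]


def run_elt(conn):
    # Load: copy the source rows into the database as-is.
    conn.execute(
        "CREATE TABLE raw_orders "
        "(id TEXT, customer TEXT, amount TEXT, ordered_at TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", RAW_ORDERS)

    # Transform: build the analytics model in SQL, inside the database.
    conn.execute(
        """
        CREATE TABLE customer_revenue AS
        SELECT customer,
               SUM(CAST(amount AS REAL)) AS revenue,
               COUNT(*) AS orders
        FROM raw_orders
        GROUP BY customer
        """
    )
    return conn.execute(
        "SELECT customer, revenue, orders "
        "FROM customer_revenue ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    for row in run_elt(conn):
        print(row)
```

In an ETL setup the casting and aggregation would happen in application code before anything reached the database. Here the raw table stays around, so the model can be rebuilt or changed later with nothing but SQL.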

For small/medium scale environments, Fivetran/Stitch with Snowflake/BigQuery, using dbt (getdbt.com) for modelling, is an insanely productive way to build an analytics stack. I consider this the default way of building a new data stack unless there's a very good reason not to.

At larger scales, Facebook has Presto and Google has Dremel/Procella/others, and a lot of data processing there is done in SQL as opposed to writing code.

The only real downside is that it tends to be fairly focussed on batch pipelines (which are fine for 95% of workloads). But even that is becoming less of an issue with Beam/Spark, where you can use SQL for both batch and streaming.

Source: Solution Architect at an analytics consultancy.