|
|
|
|
|
by runT1ME
1990 days ago
|
|
I've been approached about various data engineering jobs over the last couple years and the job descriptions have varied wildly. It has been everything from: 1. SQL/analytics wizard, capable of building out dashboards and quickly finding insights in structured data. Oracle/MSSQL/PostGres etc. Maybe even capable of FE development. 2. Pipeline expert, capable of building out data pipelines for transforming data, Flink, Spark, Beam on top of Kafka/Kinesis/Pubsub run from an orchestration engine like Airflow. Even this could span from using mostly pre-built tools wiring together things with a bit of python to move data from A to B, to the other exteme of full fledge Scala engineer writing complex applications that run on these pipelines. 3. Writing infrastructure software for big data pipelines, customizing Spark/Beam/Flink/Kafka and/or writing custom big data tools when out of the box solutions don't work or scale. Some overlap with 2, but really distinguished by it being a full fledged software engineer specializing in the big data ecosystem. So, are all three of these appropriate to call Data Engineer? Is it mainly #1 and people are getting confused? I would certainly fall into the #3, so I'm always surprised when people approach me about 'SQL transform' type jobs. |
|