by sammyd56
1270 days ago
What is your goal? For short-term career growth, $YOUR_COMPANY's current preferred ETL tool will have the biggest ROI. Focus on design patterns: while APIs will come and go, the concepts, as you rightly say, are transferable. If you're looking to land a new role: the market says dbt, Databricks and Snowflake are pretty strong bets. If it's personal interest, or a high-risk, high-reward long-term play, take your pick from any of the new hotness!
I'd add that dbt, Databricks and Snowflake are still pretty strong bets, but you have to acknowledge that they're becoming mainstream at an ever-accelerating pace as the companies behind them churn out upskilling courses, run meetups and acquire an ever larger share of the market.
If you like being a specialist, going deep into any of those still holds career value.
If you're taking a more generalist view of where things are headed, the best prediction I've heard for how Data Engineers can set themselves apart is to optimize for operationalizing data: focusing much more on reverse ETL and becoming knowledgeable in building data web apps. The no-code or low-code movement around data apps will make the barrier to entry for setting something up practically nonexistent, and I can see how that will drive demand.
Pairing (big) data query/frontend performance with web apps is another beast, though.
For all my initial scepticism, I see the Data Mesh concept picking up pace in the years to come. It's vendor-independent and couples well with Team Topologies and effective, decoupled, functional SWE teams. There will still be a big need for standards and conventions set by a small enabling core DE team. As of now, the knowledge gap between the baseline DE and your average SWE or Product Owner is just way too big in my experience.
Last but not least, I'd throw data lakes out there. Apache Iceberg is getting a lot of attention, and rightfully so. The TCO of a query engine on top of files is so much better than that of any DWH, and any org able to optimize compute on its data for its current needs will be able to save massively while the "convenience" gap steadily closes. Again, pretty generic, but there's much to learn around Athena, Trino and the like.
I'm personally not a fan of learning a new language, except maybe for Rust. There is an ever-increasing stack of standard "low-code" tools for the typical ETL shtick, and Python won't go anywhere. Again, the potential to differentiate will be low, and ever lower, in many contexts outside of proper big data. This is just me, though, and this view is highly context-dependent, so YMMV of course.