Looking over the resources, it doesn't seem worthwhile.
The difficult part is not learning how to code, or work with SQL. The hard part is learning the platform and tooling you need to operate at scale. The ecosystem is full of tools that are great for certain workloads, but terrible for others.
Your best bet is to start by getting an overview of the tools available for your team. If you're using AWS, GCP, or Azure, they each have data engineer-oriented certifications. So take a look at what tools those certification courses cover and start there.
If you are not in a cloud environment, take a look at Apache Airflow, Beam, Storm, or Hadoop. Most of the tooling provided by the big cloud providers is either a rip-off of one of these products, or is merely a hosted version (e.g., GCP Cloud Composer is managed Airflow).
This is a great reply! The greatest difficulty I face is convincing my data engineers to stop reinventing the wheel and leverage existing (and appropriate!) tools. Writing code to directly manage events and batch jobs should be a thing of the past by now. Pick a tool. Configure your jobs, retry policy, etc. and be done with it.
I really got a lot of use out of taking the GCP data engineer course on Coursera (the one by Google aimed at the cloud cert) and then later taking the actual certification.
With that being said, it was very focused on BigQuery, and my impression is that all their certs are now basically different variations of a Kubernetes certification.
I used the Coursera course as well and yea, it's pretty good, so long as you do the labs.
To be fair though, BQ is the Swiss Army knife of GCP data engineering. I was a GCP DE consultant for many years, and so many of the pipelines I put together amounted to: shove data into BQ as early as possible, then leverage SQL for transformations. Plus, most Google products have native BQ support (Ads, GA, YouTube, etc.), which makes it a must-have tool for a lot of companies.
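For anyone who hasn't seen the "load early, transform in SQL" pattern, here is a rough sketch of the shape. It uses Python's built-in sqlite3 as a stand-in for the warehouse, so none of this is BigQuery's API; the table names, sample rows, and the `run_pipeline` helper are all made up for illustration.

```python
import sqlite3

# Hypothetical sample data. sqlite3 stands in for the warehouse here;
# with BigQuery the idea is the same, but the client and SQL dialect differ.
RAW_EVENTS = [
    ("2024-01-01", "ads", "12.50"),
    ("2024-01-01", "ads", "7.50"),
    ("2024-01-02", "youtube", "3.00"),
    ("2024-01-02", "ads", "not-a-number"),  # dirty row, loaded anyway
]


def load_raw(conn, rows):
    """Step 1: land the data untouched, every column as text."""
    conn.execute("CREATE TABLE raw_events (day TEXT, source TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", rows)


def transform(conn):
    """Step 2: clean and aggregate with SQL, inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE daily_spend AS
        SELECT day, source, SUM(CAST(amount AS REAL)) AS total
        FROM raw_events
        WHERE amount GLOB '[0-9]*'   -- keep only rows whose amount starts with a digit
        GROUP BY day, source
        """
    )
    return conn.execute(
        "SELECT day, source, total FROM daily_spend ORDER BY day, source"
    ).fetchall()


def run_pipeline(rows=RAW_EVENTS):
    """Load first, transform second, and return the cleaned result."""
    conn = sqlite3.connect(":memory:")
    try:
        load_raw(conn, rows)
        return transform(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_pipeline():
        print(row)
```

The point is that the loading step stays dumb and all the cleanup logic lives in SQL, where it is easy to change and rerun against the raw table.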
We teach the difficult part at the Academy in a cloud-agnostic fashion, so you don't necessarily have to suck up their marketing-driven certs.
We don't see any new projects starting out with Airflow, Beam, Storm, or Hadoop, so that's not a good choice for anyone, unless your plan is to keep horrible legacy stacks alive as a freelancer.
My title says Senior Data Engineer* and from what I can tell it looks like a collection of interesting things but nothing so revolutionary. You're better off identifying what aspects your team is weak in and seeking out experts or training in that specifically.