|
I took a similar route, albeit much less intentionally and over almost a decade: Software QA -> Business Intelligence Analyst -> Data Scientist -> Data Engineer. Here's what I'd recommend today: 1. Get very comfortable with Python. Scripting isn't enough; you'll need good OO principles and an understanding of how to manage projects, libraries, dependencies, etc. This will take the longest, so start it first. 2. Read and re-read Designing Data-Intensive Applications by Kleppmann. This is the bible of data engineering and far outclasses anything else currently available. 3. Get your hands dirty with modern tools and the whole data lifecycle. dbt, Airflow, Snowflake, and Postgres should be obvious (feel free to substitute Prefect, ClickHouse, etc. if desired). You'll also want familiarity with a cloud stack and how to manage it (Terraform, Pulumi, or CDK). A public portfolio project would be great, but being able to talk confidently about the how and why of these things is probably enough. The hard part is getting that next job. Look for junior roles at big companies, and mid-level roles at startups that don't understand the data ecosystem yet (almost any startup whose product is not ML or ELT). The former will give more mentorship; the latter will be easier to get if you can talk the talk in an interview. |
Could you please give more details on why this is important? I have good experience working with data and data science, and a little bit of data engineering too, but I never saw the necessity for OO. I'm also very interested in data engineering and was wondering why you mentioned OO and why it is important for data engineering.
Thank you.