| HN Mirror

i appreciate you sharing all that, but it seems like we might be on similar levels of knowledge/experience. i've been a dev who does a lot data engineering for 5 years. i'm looking more for best practices and theory about designing the pipeline, how to arrange the order of operations, how to separate each step, logging practices, how to make it reproducible, how to restart when it fails halfway in without going back to the beginning, how many retries, what to do if a step gets stuck in failed state, how to flag that bad data, etc. so. many. questions. while i build these pipelines.

i have figured out these questions by seeing how more experienced devs do it and on my own, but i want to learn from a book or video series because you can only figure out so much yourself, eventually you need to seek out experts and sometimes the experts around you also figured it out themselves and you need to find an expert outside of your circle. unfortunately a lot of the "ETL experts" teaching stuff online are trying to sell me on prefect or airflow or snowflake etc