by mrharrison 3568 days ago
So I guess you are a data engineer? What makes it fun for you? How do you work with your customers to give them what they need in a timely manner? I would be interested to know what stack you use to go from dirty data to customer consumption.

Closer to an aspiring data engineer, though I've done my fair share of ETL, cleaning, database building / rebuilding, admin. Prior jobs have been database engineer, probably closer to DBA.

I just enjoy working with raw data and raw code more than I enjoy writing something that launches a graphic. I enjoy writing a script that finds a bad piece of data, or a script that fixes up everything, or taking something that once could not run at all and converting it into something that runs in 500ms. Perhaps it is that journey of constant discovery, and seeing that every situation is a unique little puzzle. It is seeing the world as it is with no one reinterpreting what the data means for me. I can explore it and discover what it really means. It is hollow truth, a mess of ideas converted to sets of ideas layered on sets of ideas, and when it is finally drawn down, converted, and passing all tests, it is self-evident and self-reflecting, and true. Hard to explain, but I suppose I like all the things people hate about it.

The tools matter about as much as it matters what CSS framework you are using. You have the ability to logic through UI and UX, whereas I do not. I have zero hope of ever doing well at what you do, since I simply don't have the foundation, but if it matters, I know most jobs I've applied to and worked at tend to be more ad hoc, using PL, Python, Ruby, etc.

I'm not comparing frontend to backend. I also think data is fun and I don't mean to belittle the job, but in a real-world scenario it's detail intensive, underappreciated, full of edge cases, and extremely complex if you plan to make it scalable and fast. So if you are an aspiring data engineer, be aware of these pitfalls, because the first couple of times you do it you will think it's fun to try something new and create some fun, useful analytics, but customers will often complain about how long it takes and want more. It starts to wear away at one's drive and passion for data. It's not the data aspect, it's the job/deadline aspect.

You're getting very close to the root cause - customers and even colleagues don't really care about the work that goes into the data. They care about the end deliverable, because that's what creates value for them, and fairly so. That gets at why data engineering as a discipline isn't (IMHO) very well respected.

I know this isn't reddit, so I'll point you to reddit. Check out /r/datascience where those folks talk about what it takes to be a data scientist. Some folks are honest about data engineering, but most handwave past it, or talk about it like it's beneath them rather than a complementary and equally important discipline, even though their role would not be possible without solid data engineering. Good luck doing "data science" or "analytics" or "machine learning" or every other buzzword without clean data, and for us data engineers, good luck ever demonstrating value without the analytics folks working with us.

There's nothing aspiring about what you wrote. I think you're fine calling yourself a data engineer if those are the types of challenges you've been solving.

Don't sell yourself short or select yourself out of an opportunity (within reason). That's someone else's job!