Hacker News
by achompas 3572 days ago
My official title is "Data Scientist" although I'm closer to the "ML Engineer" someone else mentions in a child comment.

Frankly speaking, if your company doesn't need a data engineer, it won't hire one or move you into that role. And it likely doesn't need one if you're experiencing this pushback -- data engineers often develop ETL pipelines or data warehouses, both of which are very useful if your company has a data team and close to useless if it does not.

That said, you may want to move closer to my role. There's actually a shortage of data-savvy people who can also write production software, and you would nicely complement a more research-inclined data scientist or analyst -- someone with far more experience with research/analysis than development.

3 comments

> There's actually a shortage of data-savvy people who can also write production software, and you would nicely complement a more research-inclined data scientist or analyst -- someone with far more experience with research/analysis than development.

I experience the same problem with shortage-at-price-X in the field you describe. I'm a machine learning engineer with experience in MCMC methods, but I also have a lot of low-level Python and Cython experience, some intermediate experience with database internals, and lots of experience writing well-crafted code for production systems.

There are basically zero companies willing to pay what I'm seeking (which is a salary based on my previous job and a few offers I got around the time I took that job). In fact, in some of the more expensive cities, the real wage offered is far lower than in other markets.

I've seen reputable, multi-billion dollar companies offering in the $140k range for this type of role in New York. That's wildly below anything reasonable for this sort of thing in New York. I've seen companies in Minneapolis offering $130k for the same kind of job -- and even that is still too low for Minneapolis! The same has been true in San Francisco as well.

Because these companies value you more for simply looking good on paper and looking good as a piece of office ornamentation when investors stroll through, and they view you as an arbitrary work receptacle closer to a software janitor than a statistical specialist, their whole mindset is about how to drive wages down.

Frankly, given the stresses of the job and the risk of burnout, I think it's actually a terrible time to be in the machine learning / computational stats employment field, despite all of the interesting new work and advances being made. The intellectual side is good, but the quality of jobs is through the floor.

"I've seen reputable, multi-billion dollar companies offering in the $140k range for this type of role in New York. That's wildly below anything reasonable for this sort of thing [in NY/SF]"

Man, do I ever agree. This is where the "shortage" argument falls apart.

This is why I'm so uninterested in the abstract arguments happening elsewhere on this topic about whether markets are failing and basic laws of supply and demand no longer apply at theoretical salary levels (10 million was offered as an example).

Why are we bothering with this debate, when it's so far from reality? Suppose you're trying to hire a very highly skilled and critical tech worker in SF, and you just can't find one no matter how hard you try, and then I find out that you're only offering 140k a year?

In San Francisco and New York (and anywhere else in the US, really), that's nowhere close to the kind of pay where we should start scratching our heads about a shortage and start wondering why the usual laws of supply and demand aren't working anymore.

Yeah, I strongly believe companies haven't figured out (or aren't willing to figure out) the IC track problem for data people in the way they've figured it out for engineers. Part of me wonders if it even makes sense for them to figure it out, if they're not an Uber/Netflix/Amazon with a strong need for advanced ML abilities.

It sounds like you're a principal/lead/post-senior ML engineer; at that level, you can easily command more than $140k but you have fewer options to apply those skills at companies that really need them (because few companies actually need them).

I don't know. It's tough. I agree that it might be a terrible time to work in ML/computational stats because of stuff like this.

I suspect the reason is those companies offering $140k frankly don't need that level of expertise. With that kind of background it would be fairly easy to get 200-300k as an infrastructure engineer at a quant shop.

Oh, also: if you're in NYC I'd be happy to meet over a coffee/beer to swap stories. Feel free to use the contact info in my profile.

I think the company does need data engineers but wants someone with a graduate degree from Stanford or CMU in that position, even though the actual work is in building up infrastructure for those people. And I understand. I've only really got software engineering skills to contribute at this point and I'm picking up the ML from kaggles on the side; I am looking for a position that can increase my overlap between those, because learning at home while working on unrelated stuff is making me move slowly and painfully. Your experience sounds exactly like what I'm looking for - data-savvy writing production code, complementing a research-heavy team I can learn from. How did you get started in that?

I honestly fell into it by luck. I moved to NYC, studied machine learning in grad school, networked my ass off, and landed an internship.

From there I went full time as something of an ML engineer at a company with a strong tech culture, and learned as much as I could in both tech and ML/statistics. The rest is history (although I'm by no means a rockstar or whatever).

My path is hard to reproduce -- it starts with being in NYC or SF at a specific point in time, before the labor market became saturated with data science bootcamps and PhDs furiously learning Python while working on their dissertations.

Your best bet at this point is to produce a few data-related projects (maybe work on open source like scikit-learn and pandas?) and network like crazy. Someone somewhere will have a need for someone like you.

Thanks! I guess it's somewhat reassuring that it's hard to break into for everyone and I'm not just dumb :) I'll keep kaggin'

> There's actually a shortage of data-savvy people who can also write production software

Well no kidding, that's one person doing two jobs. That's easily a 5-10 year training time depending on how high a quality you demand from their production software.