by montereynack 2167 days ago
I’m biased as my work is primarily focused on large-scale distributed physics simulations, and incorporating machine learning into these. As a result, I treat ML very much as a means to an end.

Of course, with the caveat that your situation is unique to you so I can’t give any definitive answers, I would think long and hard before jumping on the ML hype train. In my experience, it doesn’t pay to follow the trend; you’ve either gotta be first or you gotta be unique. Now that’s not to say that ML work will be restricted to a select few which you aren’t a part of, but a few others and I are wary that the ML hype train (at least as far as deep learning is concerned) might be passing. The days of the AI labs paying million-dollar bonuses are nearly gone, unless (and someone can correct me if I’m wrong) you’ve got an alternate skill set they’re looking for. Of course, that doesn’t mean there aren’t plenty of people and businesses who need CRUD-type ML setups; with your experience in databases I imagine that could be a unique angle to attack it from. Whether it’s a good idea to try and pivot into a career using ML really depends on your specific situation and the opportunities therein; to get more solid advice I would ask a trusted colleague or mentor rather than consulting people online, even if they are from HN.

For my PERSONAL opinion: I can’t speak to what is normally done in other parts of distributed systems, since scientific tools are usually bespoke and don’t use the same set of approaches as commercial products. However, just thinking about it from an outsider’s POV, it seems to me like focusing more on distributed systems would be a winning combination. I don’t think computers will advance enough in the next 30 years that the need for distributed data and compute management skills will go away; hell, with IoT you might be looking at a boom down that career path. From my perspective it’s only upside if you focus on expanding your skillset in these areas; if ML continues to thrive there’ll most definitely be a need for distributed systems to run these models on. And if an AI winter hits, you’ll have a solid set of core skills to fall back on which I don’t imagine will go out of favor anytime soon. Those are just my two cents though; of course, YMMV.

1 comments

Thank you for taking time to answer this!

I never knew that distributed physics simulations could be a career field. I always thought that such problems would be handled by scaling up vertically or just throwing a supercomputer at it.

If you don't mind, can you please elaborate a bit on the type of work that you do and the scope of problems that you solve every day?

Thanks again!

I can elaborate a bit. Most of the large-scale problems are actually as straightforward as “just throw a supercomputer at it”. However, just like when mathematicians say “that’s an implementation detail”, it turns out that actually throwing a supercomputer at the problem is much more difficult to do in practice than merely setting up some shell scripts to run, especially where scientific computing is concerned. For one, there’s usually no concept of “microservices” or “containerized” applications, at least not in my experience. Most of the modern distributed computing practices are actually thrown right out the window when it comes to scientific computing, since the scientists are going to be directly programming distribution schemes via MPI and the like. The reason is that academic projects don’t have lots of money and need to efficiently use every dollar, and that most of the time off-the-shelf distribution schemes really aren’t suitable for the science. You might have one layer where node interactions occur according to some mathematical and physical criteria instead of “load” or some other abstract flag, for example; that’s a bit harder to code for, and it’s much better to have a domain scientist who knows the physics deciding how to decompose the problem, instead of a computer scientist who has no idea of the physics adopting a scheme which ignores the problem entirely. Hence why I said most scientific tools are “bespoke”.
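
To make the "physical criteria instead of load" point a bit more concrete, here is a toy sketch in plain Python. It is not from any real code base and does not call MPI at all; "ranks" are just integer labels, and every function name, the 1D box, and the cutoff radius are assumptions made up for illustration. It assigns particles to ranks two ways (by where they sit in space, and by simple round-robin count balancing) and counts how many short-range interacting pairs end up split across ranks, which is a rough stand-in for how much communication each scheme would need.

```python
import random


def spatial_owner(x, box_length, n_ranks):
    """Rank that owns position x under a 1D slab (spatial) decomposition."""
    return min(int(x / box_length * n_ranks), n_ranks - 1)


def round_robin_owner(index, n_ranks):
    """Rank that owns particle `index` when balancing purely on particle count."""
    return index % n_ranks


def cross_rank_pairs(positions, owners, cutoff):
    """Count pairs closer than `cutoff` whose two particles live on different ranks."""
    count = 0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            close = abs(positions[i] - positions[j]) < cutoff
            if close and owners[i] != owners[j]:
                count += 1
    return count


def compare_schemes(n_particles=400, n_ranks=4, box_length=100.0, cutoff=1.0, seed=0):
    """Return the cross-rank pair count for each (hypothetical) assignment scheme."""
    rng = random.Random(seed)
    positions = [rng.uniform(0.0, box_length) for _ in range(n_particles)]
    spatial = [spatial_owner(x, box_length, n_ranks) for x in positions]
    balanced = [round_robin_owner(i, n_ranks) for i in range(n_particles)]
    return {
        "spatial": cross_rank_pairs(positions, spatial, cutoff),
        "round_robin": cross_rank_pairs(positions, balanced, cutoff),
    }


if __name__ == "__main__":
    print(compare_schemes())
```

With a short cutoff, the spatial scheme should only split pairs that straddle a slab boundary, while the round-robin scheme splits most of them even though its particle counts are perfectly balanced. Real problems are messier (3D, non-uniform density, multiple physics layers), which is part of why the decomposition usually ends up being decided by someone who knows the physics.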

The result is that most of the distributed systems people are moved to a supporting role, where their job is to develop tooling and libraries to allow for better communication between nodes, for example. I’ve also heard of some compsci people being directly integrated into these scientific teams to develop specialized APIs and such in-house, but that’s a bit more rare imo. These are just some examples of how science and compsci intersect; here’s one group I know of: https://www.ornl.gov/group/dcs