Ask HN: How do I choose between software engineering and data science?
45 points by khannate 2863 days ago
I'm applying for internships for summer 2019, and as the title suggests, am unsure whether to apply for SWE or data science positions. My background and skill set are mostly in statistics and machine learning, so I'm inclined to apply for data science, but many of the postings I've looked at list a PhD as a minimum requirement, and I am an undergrad. On the flip side, there are lots of SWE postings looking for undergraduates, but they expect knowledge of data structures/algorithms and programming languages that I don't have (I'm not a CS major).

I think I'm a reasonably strong applicant, but am unsure how to navigate between the Scylla of having the wrong degree and Charybdis of having the wrong background. Any advice would be much appreciated!

22 comments

Apply, apply, apply, apply! Don't filter yourself out of jobs (Or anything in life, honestly!). Hiring is so hard that people barely know what they need or who will help. You show up, be honest, let them be honest in response, keep moving.

You're too young (I don't know your age, but young in the process) to be worried about which is perfect. Apply to both, and whichever you get, do it super well. Can't lose! Good luck!

Agreed.

Also be sure to explain in a customized cover letter why you would be a good match for each position you apply for.

Data science. In that field, your background in statistics and math will be used much more frequently. In SWE, quite honestly, it won't. In SWE, you'll be placed into some area such as front end/back end/web programming, infrastructure, DevOps, etc. Plus, currently there is less good competition in the field of DataScience. There is a long career runway, and a few data leaders looking to grow the next batch of careers. The competition is much higher in SWE, as the field is well developed and often well trodden. SWE is now in the "era of efficiency" where well-developed best practices, processes, etc. are developing or already in place. Data science has much less of that in place, and has exciting areas around the future of data, privacy, volume of data, etc.
> there is less good competition in the field of DataScience

In my experience, it is probably easier to differentiate yourself and prove your worth by producing great products in software engineering.

Data science is overrun at the moment by everyone chasing the hype. So it is kind of hard to prove your worth by producing great data science; you just will not be heard among all the shouting of people, sub-orgs and consultants trying to sell their latest deep learning model for a data set that would fit on a floppy disk (okay, that was exaggerated, a zip drive).

Rather than thinking of Data Science and SWE as two different fields, think of Data Science as a spectrum, with "Advanced Data Analyst" at one end and "SWE/Machine Learning Algorithm Engineer" at the other.

Data Science is a weird field. A lot of the job descriptions have similar keywords, but there is just a huge amount of variance in what the job requires. There are definitely a large number of Data Science roles where solving a business problem requires you to write a good amount of code for integrating with other systems, data ET(maybe L), building UIs, etc., all of which is really about making the core algorithm consumable by business owners.

When you interview, ask what the day-to-day of somebody in that role looks like. You'll be able to figure out fairly quickly where they fall on this spectrum. Find the one that fits what you want.

IME, at smaller companies they don't have enough people to have 4 people (a Data Scientist, a Data Engineer, a SWE, and a Business Expert) just to get a data science project from conception to production. That's all done by one person with help from a business expert.

Besides your level of interest in each field, it’s important to note that at the top tech companies, software engineers typically receive twice the RSUs that data scientists / data engineers do at a given level. This is non-negligible and can range from an extra $30k to $100k per year depending on your level (base and bonus are typically the same though). Add in the opportunity cost of the time spent getting a PhD, and a top software engineer can save up considerably more money by age 30 than a data scientist.

Money certainly isn’t everything, but I’m considering switching to software engineering (from data science) because I would like to reach financial independence more quickly than my current trajectory allows.

Maybe it's dependent on country, but where I am in Europe this is drastically untrue, and data scientists are paid (including stock) far more than software engineers.

It's also worth noting that although every position I've applied for has asked for a PhD, my one-year master's has sufficed in every case.

Really? That's very interesting. I'm at one of F/N/G in the Bay Area. What sort of compensation do these companies in Europe offer?
depends, but 6 figures seems pretty normal if you're good but not in charge of anything in particular? dev salaries are lower here, maybe? but I started in data on the same salary that my been-devving-6-years girlfriend is on now after 3 raises.
My suggestion is: Don't do HR's job for them. Basically, if there is some overlap between your skills and some of the requirements, you should apply anyway. The HR people will decide whether or not you're suitable for the job.
As someone that did my internship in Data Science, if I were to go back and do it again, I would choose Software Engineering instead.

Don't get me wrong, I really liked my position and my team, and loved what I did every day. Career-wise though, I would consider Software Engineering better, unless you plan on doing a Master's/PhD right after undergrad.

Data Science is a much younger field than Software Engineering. While there is a ton of room to grow, it also means there aren't good hiring practices in place. Companies are way more conservative about hiring Data Scientists than Software Engineers. There usually aren't the same kinds of "coding challenges" as for engineers. While that sounds like a good thing, it means that companies have to filter out candidates some other way. In most cases, (good) companies filter out candidates by looking only at applicants with a graduate degree or with >3 years of experience. This makes it a very tough field to break into without already having experience.

I am actually planning on doing a PhD right after my undergrad. My issue is with what to do before then.
If you’re set on the (data science, I presume) PhD, I would suggest a SE internship. The rationale is that you’re going to be deep in DS for a few years so this is a good opportunity to explore another field. And what you learn during the internship (how to write clean code and document it, unit tests, version control, seeing production software) will a) put you in good stead for your PhD which presumably will be code-intensive and b) set you apart from the rest of the data science pack once you graduate. I work with (junior / intern, to be fair) data scientists and OMG, bashing together a Jupyter notebook != knowing how to program. How to get from a trained model to production code is in my opinion a vastly underdeveloped topic in data science. Having SE experience will definitely help you see how the other half lives.
At this point, get whatever experience you can to determine what you would like, and frankly, the contacts you make will be as helpful as the experience. It's an internship, you have time, and at a small enough future employer, you may have to cover multiple roles.

Yes, software pays more now, and data science (let alone data engineering) is still maturing and figuring itself out, especially at junior levels. Your current background will help with data science, but doing software for an internship won't hurt you in the future if you want to do data science.

Having a programming background helps with data work, some of which is programming directly and some of which is talking to data engineers and software engineers. And vice versa: a math and data background is super helpful in attacking software problems in many areas.

If you feel like picking the wrong thing now at a young age will scar you forever, you're doing it wrong. In this whole industry, things change constantly, and you will have to reinvent yourself and learn with it. If you don't like it, you can always switch specialties, or even generalize a little more broadly.

Reference: I've been in IT for the better part of 20 years, much of that as a web development generalist, and now I'm doing data engineering. ~75% of the skills overlap.

I wouldn't worry at this stage. History isn't destiny and if you try something you don't enjoy, you've learned from the experience.
At many DS jobs you will be working closely with developers.

Where I work we prefer DS candidates that have some SE background and could comfortably deploy a model (even if it's just to Heroku).

We pass on a lot of really smart DS applicants who haven't had the SE xp, simply because many of them would take a lot longer to get up to speed.

This is just one data point, and some DS jobs probably won't require the SE xp, but hope it helps & best of luck!

I personally think that even if you want to be good at data science, you should do at least 2 to 5 years of software engineering first. So many data science people are basically throttled to death in what they can achieve because of their subpar software skills. Software is a multiplier of your math knowledge.

Source: I have a math master's (statistics, ML).

I'm going to take this in a different direction and say: That's not the important question.

The important question is whether you're interested in what the company and specific team are doing. Example: I once interned at Google, on the Chrome team. I mean, it's GOOGLE, free food and wonderful smart people and again it is GOOGLE and I'm an UNDERGRAD! What I learned that summer was that I don't actually care about web browsers at all. And so my internship was kind of a bust just because I didn't care much about what we were doing. I had no drive to stay at the office late to keep grinding away at the problem.

What motivates you? What really interests you? Whether you're doing data science or software engineering, that will be far more important to you having a successful internship.

The answer to that question is the same as the answer to this one: which do you like spending much more time on, machine learning and statistics or software engineering? Good career trajectories often start with answering that question. Talk to the hiring manager about your skills and why you want that data science position. It will be difficult, but you will find a data science team that will not make its decision based on whether you have a PhD or not. A PhD requirement is just an extreme oversimplification of data science skills, and there is hardly any other degree a posting could name instead, since the CS undergrad is already taken for SWE positions, as you noted. Don't let these oversimplifications of role requirements bog you down; they are made for HR people, not you.
Build a model to predict which positions you have more chances to pass. Build an algorithm to automatically apply for the jobs.

After it fails miserably, if you blame the model, then you should become a software engineer. If you blame the algorithm, then you should become a data scientist.

I started out as a Software Engineer, then switched to Data Scientist. However, I have now decided to switch back to Software Engineer, simply because I enjoy the work more. So I would advise trying to get into the nitty-gritty work of both, preferably through internships, and then deciding which route you prefer.

I would also suggest applying for a lot of them anyway. There aren't enough skilled and experienced people to fill all positions right now, so some companies will recruit more junior people than they were looking for at first.

Real-world programming does not involve many data structures or algorithms, much of the time. So at lots of SWE internships you'll do just fine as long as you know how to code.
Software Engineering.

Data Science is a useful skillset for everyone to have, but the majority of the work in any practical data science role is in getting the data into the right place and the right format to do the data science. As a result, most small companies can't actually support a full-time data scientist who can't also write code.

You have a good background in stats and ML - use that with practical experience in SWE to make your skillset more useful and broadly applicable.

> but they expect knowledge of data structures/algorithms and programming languages that I don't have (I'm not a CS major)

What about not being a CS major prevents you from picking up Sedgewick or a programming language reference?

Nothing. What does prevent me is the timeline on which I'm applying and other things I have to do in that same stretch. I'm hoping to do exactly this sometime in the next two or three years, but I don't think it'll be an option for this application cycle.
Try for both. Only when you get offers for both do you really have to choose.
Sounds like data science for you. But I'm curious: What is your academic background exactly? Where are you learning machine learning without programming?
Sorry for the confusion - it's not that I'm learning machine learning without programming, but rather that most of the programming I know I learned in the context of machine learning. In particular, I'm only familiar with the small portion of the standard undergrad CS curriculum that's relevant to those things.
Okay then, data science is probably a better fit over generic software engineering.

As you surmised, the latter is more focused on algorithms and data structures as the basis for solving problems. Your gut response is good. Go with your gut.

> Where are you learning machine learning without programming?

A math degree. You're confusing data engineering and data science. There are plenty of people who work on theory and do little to no programming.

Well, given that I made a statement about machine learning and not data science, the point still stands.

Machine learning is a CS field. It emerged out of CS. Any claims to the contrary are hokey revisionism. As to what "data science" entails, that's become a super loaded buzzword, so I'm not even sure where to begin. And "data engineering," please don't even. Just fancy terms for statistics and discrete math.

I define data engineering as something like "implementing ML algorithms on servers for real-world use cases", in which case they're mostly just gluing together function calls that other people figured out. "Data science" on the other hand is the stuff that actually requires using statistics and math to figure out what operations are necessary on a data set.

Plus, I could say the same thing about ML. It's just graph theory, linear algebra and calculus with some statistics mixed in. Where's the absolutely necessary programming? There are plenty of opportunities to do ML theory with little programming, if any. There absolutely needs to be a distinction between theorists and engineers, because they aren't the same thing. Most of the programming is the grunt work you pass off to the engineers.

You're an undergrad, and an intern, and now is the time for learning. Do whatever you know less about and learn it. This is no time to specialize.
You don't need a PhD to do data science.

It sounds like data science is more in line with your experience and interests.

Data science