by dwpdwpdwpdwpdwp 3393 days ago
I'm surprised people spend so much energy looking for someone who is both a statistician and a computer scientist, knowing how rare they are. There are many more statisticians who can at least communicate and work effectively with developers, and vice versa. Why not just compose a team? Just as other professionals have assistants, statisticians should have them too, and those assistants would focus on the computer science and deployment side of the applied statistics.

This is a classic problem that shows up equally in lots of related areas: numerical work, statistics, ML, signal processing, etc.

"Just compose a team" sounds easy, doesn't it? Unfortunately there are lots of failure modes: different parts of the team don't really understand what the others are trying to do, let alone what they are actually doing, and subtle errors get past people who don't know what to look for. So you can find such teams, and some of them work well, but a lot of them don't.

So an alternative is to try to find or create domain experts who mix all the appropriate skills, but this is hard and in the extreme case involves chasing down unicorns.

Companies and industries flop back and forth between preferring different approaches - right now a lot of people are talking about "data scientists" as one of the latter, but it will likely change over time as it always does.

It's a hard problem, and it shows.

As an engineer, I usually know better than to use the phrase I hate to hear: "why don't you just..."
Surely "Why don't you just... ?" is an exceptionally good phrase to use. In practice people mean "Just do ... !", which is very different. The why question, however, gets to the heart of an issue: it's shorthand for "The obvious solution appears to be that ... but I imagine you tried that and have a reason not to do things that way; what are those reasons?". It's a direct, learning-centred enquiry that ekes out the kernel of complexity in a situation by relying on the wisdom of the person it's aimed at.

So why don't you just use the phrase "Why don't you just ...?"?

I couldn't agree more. I was just hired to 'productionize' a proof of concept developed by data scientists in Jupyter notebooks. The first thing I did was hire a Python developer (no data science experience) to start cleaning up the code and a DevOps engineer to put the infrastructure in place. Second step: I sat down with the data science department and taught them how to program properly: test-driven development, version control, code reviews (git pull requests) and continuous integration. They all have PhDs, so it's not as if they would have any trouble learning something new. They thought it was great. Result: all their new code now goes directly (via code review and CI unit/end-to-end testing) into pre-production and, after sign-off from the product owners, into production (quite often the same day). I just don't understand why, instead of trying to find the perfect person for the job, people don't just hire someone to teach them how to do the programming part of their job properly. Good teams are cross-functional, diverse and have a strong focus on transferring knowledge.
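
To make the test-driven development step a bit more concrete, here is a minimal sketch of the kind of thing it can look like. Everything in it is an assumption for illustration: `standardize` is a hypothetical helper standing in for logic lifted out of a notebook cell, and the tests are plain functions of the sort a runner such as pytest could pick up in CI. It is not the actual code from that project.

```python
import math
import statistics


def standardize(values):
    """Return z-scores for a list of numbers.

    Hypothetical helper, standing in for logic moved out of a notebook cell
    so that it can be imported, reviewed and tested.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to standardize")
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    if stdev == 0:
        raise ValueError("values have zero variance")
    return [(v - mean) / stdev for v in values]


def test_standardize_has_zero_mean_and_unit_stdev():
    # After standardizing, the sample mean should be 0 and the sample stdev 1.
    z = standardize([2.0, 4.0, 4.0, 6.0])
    assert math.isclose(statistics.fmean(z), 0.0, abs_tol=1e-9)
    assert math.isclose(statistics.stdev(z), 1.0, rel_tol=1e-9)


def test_standardize_rejects_constant_input():
    # Constant input has zero variance, so z-scores are undefined.
    try:
        standardize([3.0, 3.0, 3.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for constant input")
```

The point is less the function itself than the habit: once the logic lives in an importable function with tests next to it, code review and CI have something to check before anything reaches pre-production.
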
This is a good question. I've tried both approaches, and currently favor going after the rare multi-skilled hire. In general, I have seen many cases where one person with a small-to-medium amount of experience/ability in both areas is a lot more productive than two specialists.

> There are so many more statisticians who can at least communicate and work effectively with developers and vice versa.

Not in my experience. You need to design your data infrastructure to promote easy analysis, and you need to design your models to scale well with the amount of data you're working with. There are also many cases where a project will require mostly engineering work for a while, and then mostly analysis/statistics work. There are ways to handle this with specialists, of course, but there's generally a significant switching cost.

Also, people with a combination of statistics & programming skills aren't that rare. IMO it's more that employers tend to search for both degrees, when they should instead be trying to evaluate the skills directly.