by bluescarni 1749 days ago
As a scientist myself, I strongly disagree. If you spend a non-negligible amount of your time telling a computer what to do, you are, indeed, a programmer. And as such you should be expected to become a decently proficient programmer.

Physicists are not mathematicians, and yet they are required to acquire a relatively high degree of proficiency in maths because maths is a fundamental tool in their job, and nobody would argue otherwise.

The attitude of considering programming a mundane craft to be picked up as-you-go is the main reason why the scientific software landscape is such a shitshow.

/rant

2 comments

As a fellow scientist I wholeheartedly agree! I find that programming skills (and the willingness to improve them) are strongly correlated with success as an (experimental) PhD student, much more so than mathematical ability. The ability to automate your experiment or run a quick simulation is a huge productivity boost.

However, the programming education in science degrees is absolutely appalling. They just show students how to program a Newton-Raphson method in MATLAB (without any consideration for performance) and expect them to know how to program.
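For readers who haven't met it, the classroom exercise mentioned above looks roughly like this. It is a minimal sketch in Python rather than MATLAB; the function name, tolerance and iteration cap are my own assumptions, not anything from a particular course.

```python
def newton_raphson(f, df, x0, tol=1e-12, max_iter=50):
    """Find a root of f near x0 using Newton-Raphson iteration.

    f and df are the function and its derivative; tol and max_iter
    are illustrative defaults, not recommended values.
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("derivative vanished; try another starting point")
        # Newton step: follow the tangent line down to the x-axis.
        x -= fx / dfx
    raise RuntimeError("did not converge within max_iter iterations")


# Example: the positive root of x^2 - 2, starting from x0 = 1.
root = newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
```

Getting this far is easy; the point is that it says nothing about testing, packaging or performance.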

> If you spend a non-negligible amount of your time telling a computer what to do, you are, indeed, a programmer. And as such you should be expected to become a decently proficient programmer.

Just like if you are standing on your two legs most of the day and sprint once in a while, you can be considered a runner. Sure, you can play with semantics, but most people cannot run a marathon.

> The attitude of considering programming a mundane craft to be picked up as-you-go is the main reason why the scientific software landscape is such a shitshow.

Err... That's kinda my point?

> And as such you should be expected to become a decently proficient programmer.

It's very, very hard to be good at two different fields. Most people won't have the ability or the context to do so. Even if they did, the time and energy spent on it would be taken from their main activity, which is why we employ them in the first place.

It's not reasonable to ask a data scientist, geographer, biologist or physicist to keep up with the right practices for deploying the latest sci-stack on a Linux server, understand the trade-offs between GIL-locked Python threads, asyncio and multiprocessing, or spell out what WSGI stands for.

Hell, I know a lot of professional programmers who don't know those things.

> Physicists are not mathematicians, and yet they are required to acquire a relatively high degree of proficiency in maths

The quantity of information required to be learned is one or two orders of magnitude smaller, because the field of maths required to perform physics is quite stable, and well understood.

IT is a very young field, in constant flux. The scientific stack is a moving target, not to mention the web one. Nobody can expect them to understand Python, NumPy, pandas, then a web framework, then CSS, then JS, and HTML, probably some frameworks for them, a builder or two, how to deploy all that stuff in dev and in prod, and the architectural concerns of linking it all together.

That's crazy talk.

I think you are seriously overestimating the "programming proficiency" of many scientists. I don't think the OP meant that scientists should be experts in the web stack or even in the intricate details of the scientific stack. However, I do expect them to know how to write reasonably maintainable code, i.e. use functions and modules, not just copy-paste code around between cells, etc. (this is seriously the state of much of the scientific programming world).

> The quantity of information required to be learned is one or two orders of magnitude smaller, because the field of maths required to perform physics is quite stable, and well understood.

Apart from the fact that some areas of physics are really at the forefront of maths, this also ignores the fact that reaching the level of maths proficiency required for graduate work in physics is significantly more involved than learning some best practices in programming.

If you think of functions and modules as an example of what makes code maintainable, then I'm afraid we won't be able to agree.

I've seen data scientists handling big code bases. The problem was not that they couldn't use the language features. The problem was that they would always be lacking essential information for their mission, because there is not enough time in a day for a regular human being.

They would put an MD5-hashed password in their DB, create an XML format meant to be reusable only to realize they'd need to hard-code some value later, or have a gunicorn slowing to a crawl because they didn't know how to calibrate the number of workers.
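To make the password example concrete: the usual alternative to a bare MD5 digest is a salted, deliberately slow key-derivation function. Below is a minimal sketch using only the Python standard library (`hashlib.pbkdf2_hmac`). The helper names and the iteration count are assumptions for illustration, not a vetted security recommendation.

```python
import hashlib
import hmac
import os
from typing import Tuple


def hash_password(password: str, iterations: int = 600_000) -> Tuple[bytes, bytes, int]:
    """Return (salt, digest, iterations) for storage; hypothetical helper."""
    # A fresh random salt per password defeats precomputed lookup tables.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest, iterations


def verify_password(password: str, salt: bytes, digest: bytes, iterations: int) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

Usage would be something like `salt, digest, n = hash_password("hunter2")` at sign-up and `verify_password(attempt, salt, digest, n)` at login. None of it is hard, but you have to know it exists, which is exactly the problem.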

It's just too many things to know. Once they mastered that, other things would come to bite them.

> If you think of functions and modules as an example of what makes code maintainable,

It’s definitely a part of it. This isn’t an all or nothing thing, one can learn good practices without encumbering their scientific work.

Shouldn't we just reduce the "data science profession" to what it clearly is then: shuffling around numbers and statistics in Excel and Python, in the hope of generating a useful insight or two and the occasional whitepaper as you go along?

If you don't understand the tools you're using, or the environment you're in - you're not any more of a "data scientist" than pretty much everybody else. My carpenter is a data scientist going by this logic.

But that is just an assumption. I have worked as a data scientist for 5+ years, and from a practical point of view it is not just data wrangling. It is worth mentioning that, by that logic, we assume programmers fully understand how to develop a model for production and how to handle it in edge cases, which is not true.