Hacker News
by _yosefk 4424 days ago
I agree of course, I just think a scientist taking a more thoughtful approach > a scientist taking a sloppy approach > a "software engineer" taking an overly thoughtful approach. Because the latter could have written ~200K LOC spread in 5 directories and you'd need a debugger to tell which piece of code calls which.
5 comments

I think you're comparing apples to oranges, both here and repeatedly in your original article.

For one thing, you describe many "sins" that "software engineers" commit, but in reality code that was flawed in most of those ways would not even have passed review and made it into the VCS at a lot of software shops, nor would any serious undergrad CS or SE course advocate using those practices as indiscriminately as you seem to be suggesting.

For another thing, how many "scientists taking a sloppy approach" do you actually know who can successfully build the equivalent of a ~200K LOC project at all, even if those 200K lines were over-engineered, over-abstract code that could have been done in 50K or 100K lines by better developers? It's one thing to say a scientist writing a one-page script to run some data through an analysis library and chart the output can get by without much programming skill, but something else to suggest that the guy building the analysis library itself could.

It's not that a single scientist writes it, but rather that someone publishes a paper on something, with ugly code used to prove it, and then becomes a professor. Subsequent generations of graduate students are tasked with extending / improving this existing codebase until it is basically Cthulhu in C form. ;)

I recall reading a propulsion simulation's code developed in this way. It was "written" in C++, initially by automated translation of the original Fortran code. Successive generations of graduate students had grafted on bits of stuff, but the core was basically translated Fortran, with a generous helping of cut-and-paste rather than methods for many things. (I don't mean this as an insult to Fortran: I've tremendous respect for its capabilities, and have read well-written code in it as well.)

The net result was that fixing bugs in the system was very challenging, as it was a very brittle black box. It was not Daily-WTF-worthy, but still very frightening. I'm very grateful I was not the one maintaining it. ;)

You must not have been in science or you'd have encountered the 200K LOC program, written in five programming languages (two of them obscure), which can only be compiled on the author's computer. Oh, and add 50K of C code from ancient versions of other projects (which could've been used as libraries) for undocumented reasons.

Though I have also had colleagues who were brilliant programmers.

This describes almost every published application I have ever tried to get running. It ends up being impossible to get the application working on anything other than the author's workstation.

I would alter your list to say that a competent software engineer working together with a scientist > a scientist taking a thoughtful approach > a sloppy scientist > someone who is neither a competent software engineer nor a thoughtful scientist.

From the article and your comment above, it sounds to me like you have had to work with a terrible programmer who ranted about best practices to cover for his incompetence. We've all worked with someone like that, even in software shops. Don't tar us all with that brush.

I think it's a pretty shoddy software engineer who writes more LOC than the scientist. Good code is concise, readable without comments, etc. A bad software engineer writing bad code is no different from a bad scientist reasoning that the sun is cold because the temperature in January is below freezing.

What's really interesting here is comparing the two lists of problems the author gives.

On one hand, the problems are either product defects (crashes, missing files, etc.) or maintainability defects (globals, bad names, obscure clever libraries, etc.).

On the other hand, the problems the author mentions are basically things anathema to snowflake programmers (files spread all over, deep hierarchies, "grep-defeating techniques", etc.).

The academic's code scales vertically, because you can always (hah!) find some really bright researcher who is smart enough to grok the code and spend all the time in valgrind and whatnot to make it work. However, God help you if you can't find (or, more appropriately given the current academic culture, force) somebody to waste many hours of their lives fixing mudball code.

The other extreme scales horizontally, right? You have these many files, and deep hierarchies, and dynamic loading, but that's how a lot of people are used to doing it and that's what the tooling is designed to support. The big accomplishment of Java and C# isn't that they let you get a 100x return from a 50x programmer, but that they let you scale to having 50-100 programmers on a project in a semi-reasonable way.

In an ideal world, you have a small number of academics and engineers that communicate tightly and write good, compact, and clean code; in the real world, you want to pick tools that help you deal with the fact that it is hard to scale vertically.

EDIT:

On second read-through, I think the author just needs to use better tools. A good IDE makes code discovery much easier than mere grep, and helps solve a lot of other problems.

I do not understand the insistence of academics on using unfriendly tools.

> I do not understand the insistence of academics on using unfriendly tools.

My stepfather teaches doctoral business students. Until VERY recently he was running Corel WordPerfect simply because it was the first word processor he had installed. Never underestimate the potential stubbornness of smart people :)