I don't really agree with the reasons given, even though my conclusions are the same. The main reason research code becomes a tangled mess is the intrinsic nature of research. It is highly iterative work where assumptions keep being broken and reformed depending on what you are testing and working on at any given time. Moreover, you have no idea in advance where your experiments are going to take you, so there is no opportunity to structure the code up front so that it is easy to change. To give a concrete example, imagine writing an application where the requirements change unpredictably every day, and where the scope of those changes is unbounded. The closest to "orderly" I think research code can become would be akin to Enterprise-style coding, where literally everything is an interface and all implementation details can be changed in all possible ways. We already know how those codebases tend to end.
If the problem were only unpredictability, then projects with a clear and defined end goal (e.g., a website to host results) would be of substantially higher quality. But they’re not. Well-defined projects tend to end up basically just as crappy as exploratory projects.
The problem is evaluation and incentives. There’s literally no evaluation of software or software development capability in the industry. I know of a researcher who held a multimillion-dollar informatics grant for 3 years. In those 3 years he literally did nothing except collect money. Usually there are grant update mechanisms and reports, but he BSed his way through those, knowing there’s a 0.0000000% chance that any granting agency is going to look through his code. The fraud was only found because he got fired for unrelated activities.
I once looked up older web projects on a grant. 4 of the 6 were completely offline less than 2 years after their grants completed. For 2 of those 4, it’s unclear whether the site was ever completed in the first place.