by devereaux 2578 days ago
In spreadsheets (Excel) and notebooks (IPython/RStudio), having the data bundled with the code is a feature. It provides reproducibility.

> a destroyed formula can go undetected for a while...

That is a problem in methodology, not with the tools. There are many solutions that do not require abandoning spreadsheets.

For the destroyed formula example, you need tests. A simple way to do that is a checklist: if you are doing a SUM(), require that the employee ticks a box saying "all the numbers were highlighted when clicking on the SUM formula".
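
The same check can also be automated. Here is a minimal sketch, assuming the column has been exported as a plain Python list and the SUM range is known as a pair of row indices; the function name and the data layout are hypothetical, not anything Excel provides:

```python
def sum_range_covers_all(cells, first, last):
    """Return True if every numeric cell in `cells` lies inside rows first..last (inclusive).

    `cells` is a hypothetical export of one spreadsheet column; text and
    empty cells (None) are ignored, as are booleans.
    """
    numeric_rows = [
        row for row, value in enumerate(cells)
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    ]
    if not numeric_rows:
        return False  # nothing to sum, so treat the check as failed
    return first <= numeric_rows[0] and numeric_rows[-1] <= last


column = ["Revenue", 100, 250, 75, None]
print(sum_range_covers_all(column, 1, 3))  # range covers rows 1 to 3
print(sum_range_covers_all(column, 1, 2))  # range misses the last number
```

It only answers the question the checkbox asks: does the summed range include every number in the column?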

For the out-of-sync version, you need a central repository and another box "I retrieved the latest version from the xx repository, and this version was: ... "
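
One way to back that box with something checkable is to compare a content hash of the local copy with the copy in the repository. This is a rough sketch, assuming both files can be read as bytes; the function names are made up for illustration:

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def is_latest(local: bytes, reference: bytes) -> bool:
    """True if the local copy is byte-for-byte identical to the reference copy."""
    return digest(local) == digest(reference)


# Hypothetical contents standing in for two workbook files.
reference_copy = b"revenue,2019\n100\n250\n"
stale_copy = b"revenue,2019\n100\n"

print(is_latest(reference_copy, reference_copy))
print(is_latest(stale_copy, reference_copy))
```

With real files, the bytes could come from `pathlib.Path(...).read_bytes()`, and the digest is something the employee can write down in the "this version was" field.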

Then require that to be printed and signed (accountability), and you'll see mistakes disappear.

1 comments

Data bundled with the code is good, but code mixed with the data is a whole different thing. Notebooks are better than Excel in that regard, I think.

As for expecting people to religiously and accurately observe procedures, I work with human beings, you seem luckier...

Is it any different with code? Developers don’t always stick to the procedures either.

I have ‘unit tests’ in my Excel files on the last tab. E.g., the sum of all lines in the Data tab must be the same as the sum of the annual revenues in the dashboard tab.

With a bit of conditional formatting it’s easy to see if a test is failing as well.
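
For anyone who wants the same reconciliation outside the workbook, a minimal sketch follows, assuming the two tabs have been exported as plain lists of numbers. The names and the one-cent tolerance are assumptions, not part of the spreadsheet described above:

```python
import math


def totals_match(data_lines, annual_revenues, tolerance=0.01):
    """Check that the detail lines and the dashboard totals add up to the same amount.

    `tolerance` is an assumed absolute tolerance to absorb rounding differences.
    """
    return math.isclose(
        math.fsum(data_lines), math.fsum(annual_revenues), abs_tol=tolerance
    )


# Hypothetical figures: four detail lines rolled up into two annual totals.
data_tab = [1200.0, 800.0, 500.0, 1500.0]
dashboard_ok = [2000.0, 2000.0]
dashboard_broken = [2000.0, 1500.0]  # e.g. a formula overwritten by a constant

print("PASS" if totals_match(data_tab, dashboard_ok) else "FAIL")
print("PASS" if totals_match(data_tab, dashboard_broken) else "FAIL")
```

The PASS/FAIL output plays the role of the conditional formatting: a failing check is visible at a glance.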

I like programming for lots of things, but ask me to do a business case or some one-off analysis and I’d use Excel in most cases.

Once they have received training and learned that this step is important, not just a formality, and why it is done, I found that people do observe procedures, especially when they have to sign the checklist they filled in themselves.

Human beings observe procedures when they understand it's part of the job and they are held accountable.