Software engineers who have had to deal with spreadsheets being used inappropriately. Spreadsheets are an extremely powerful tool, but they have limits. Their accessibility is part of what makes them so ubiquitous, but it also means people get comfortable and start using them beyond their limits.
Spreadsheets (Excel ones in particular) are great at presenting tabular data, and they're mediocre at most other things, including crunching numbers, taking notes, storing computational data, providing user interfaces, and much more. It's often okay to use a sub-optimal tool, but you have to draw the line somewhere.
A recent HN post[0] highlighted an example where Excel was being used to keep track of large-scale COVID contact tracing data in the UK. Excel is a mediocre database. It has hard limits on cell sizes and row/column counts, among other things. They ran into one of those limits and lost track of 16K positive cases because of it.
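To make the "hard limits" point concrete, here is a minimal sketch of a guard you could run before exporting a table to Excel. The row and column counts are the worksheet limits Microsoft documents for the two common formats; `check_fits` and `EXCEL_LIMITS` are hypothetical names I made up for illustration, and I'm not claiming this is what was missing in the UK pipeline.

```python
# Worksheet size limits Microsoft documents for the two common Excel formats.
EXCEL_LIMITS = {
    "xls": (65_536, 256),         # legacy binary workbook
    "xlsx": (1_048_576, 16_384),  # current XML-based workbook
}


def check_fits(n_rows, n_cols, fmt="xlsx"):
    """Raise ValueError if a table of this shape cannot fit on one worksheet.

    check_fits is a hypothetical helper, not part of any library.
    """
    max_rows, max_cols = EXCEL_LIMITS[fmt]
    if n_rows > max_rows or n_cols > max_cols:
        raise ValueError(
            f"{n_rows} x {n_cols} exceeds the .{fmt} limit of "
            f"{max_rows} x {max_cols}; refusing to truncate silently"
        )


check_fits(50_000, 20, fmt="xls")      # fits, so nothing happens
try:
    check_fits(70_000, 20, fmt="xls")  # too many rows for .xls
except ValueError as err:
    print(err)
```

The point of failing loudly is that a database would reject the insert, whereas a spreadsheet export can just drop the rows that don't fit.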
From my own experience, I've had to deal with repositories containing tens of thousands of Excel spreadsheets. They were used to capture verification data. Excel files are large and difficult to parse with scripts, which was bad enough. But the worst part was that Excel doesn't really have a syntax or schema, so users editing the spreadsheets would frequently change the table layouts, and every change had to be accounted for as an edge case in the scripts. I'd be lucky to even recover data from half the files using automation. I even encountered one Excel workbook with over 300 tabs!
Every tool has limits and it can be frustrating working with popular tools when users fail to recognize those limits. I love working with Python, but I'd never try to write a kernel with it. Likewise, there is a time and place for spreadsheets.
I do – but I'm an academic, not a software engineer. Spreadsheets are abused and don't make a formal distinction between analysis and data. I've seen many horrible, horrible things happen because people used Excel when they really, really shouldn't have. Intelligent biologists reinventing numerical integration (badly) in Excel, for example (with huge floating point errors, often comparable to the size of the change they are looking for biologically).
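For anyone who hasn't been bitten by this: below is a small sketch of how rounding error creeps into a hand-rolled sum, plus a trapezoidal rule that uses Python's `math.fsum` to keep the accumulation error down. This is only an illustration of the general problem. I'm not claiming it reproduces what those spreadsheets did, and `trapezoid` is a hypothetical helper written for this comment.

```python
import math

# Adding 0.1 ten times with plain floating-point addition does not give 1.0.
naive = sum([0.1] * 10)
careful = math.fsum([0.1] * 10)  # fsum tracks the lost low-order bits
print(naive == 1.0, careful == 1.0)


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule on n equal sub-intervals of [a, b].

    The interior points are summed with math.fsum so that rounding error
    does not grow with n the way it can with a running total.
    """
    h = (b - a) / n
    interior = math.fsum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


# The integral of sin(x) from 0 to pi is exactly 2.
print(trapezoid(math.sin, 0.0, math.pi, 1000))
```

The remaining error in the last line is the method's own discretisation error, which shrinks as `n` grows. The trouble with ad hoc spreadsheet versions is that nobody checks either source of error against the size of the effect being measured.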
The post is probably something of a straw man. I think the "hate" is mostly about spreadsheets being used as what is effectively hard-to-audit spaghetti code for tasks that would be better coded in something like Python or R.
I've probably never been a real spreadsheet power user (though I've had some pretty big ones). But they're hard to beat for any sort of semi-structured tracking.
I sometimes wonder if spreadsheets as we know them were sort of an inevitable outcome of personal computers. There were some alternative takes early on but they never took off. It's also sort of interesting to me that some other tools in the same general space like databases on PCs sort of withered away.
It is a straw man. Software engineers don't hate spreadsheets when they're used as spreadsheets, like this article describes. They hate the "spreadsheet as a database," particularly when they're asked to load those spreadsheets into a proper database or other system periodically. Spreadsheets are so easy to use for tabular data, and since technical and non-technical people alike can easily use them, they're a tempting data transmission protocol. But most non-engineering types aren't disciplined about the layout and don't understand the intricacies of interpreting the data (e.g. they put labels into numeric columns or add accidental spaces to the beginning or end of labels), leaving the engineers to constantly rewrite sometimes complex scripts to load new data in this month's "flavor" of spreadsheet.
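Here's roughly what the defensive loading code ends up looking like, as a minimal sketch that assumes the spreadsheet has been exported to CSV first. `load_rows`, the column names, and the sample data are all made up for illustration; real loaders grow many more special cases than this.

```python
import csv
import io


def load_rows(text, numeric_columns):
    """Parse CSV text, trimming stray whitespace and flagging bad numbers.

    Returns (clean_rows, problems), where problems is a list of
    (line_number, column, raw_value) tuples for values that should have
    been numeric but were not. Hypothetical helper, not a library API.
    """
    clean, problems = [], []
    reader = csv.DictReader(io.StringIO(text))
    # The header is line 1, so the first data row is line 2.
    for line_no, raw in enumerate(reader, start=2):
        # Strip accidental spaces from headers and values; ignore the
        # None key DictReader uses for unexpected extra cells.
        row = {k.strip(): (v or "").strip()
               for k, v in raw.items() if k is not None}
        ok = True
        for col in numeric_columns:
            try:
                row[col] = float(row.get(col, ""))
            except ValueError:
                problems.append((line_no, col, row.get(col, "")))
                ok = False
        if ok:
            clean.append(row)
    return clean, problems


sample = "name, count\n alice ,3\nbob,n/a\n"
rows, bad = load_rows(sample, numeric_columns=["count"])
print(rows)
print(bad)
```

Collecting the rejected cells instead of raising on the first one is a deliberate choice: it gives you something to send back to whoever maintains the sheet.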
I do not think the point is that software engineers hate spreadsheets. Software engineers would approach their use of spreadsheets very differently than a financial analyst would. I think the point is that software engineers hate inheriting spreadsheets developed by others who do not take a more disciplined, developer-style approach.
[0] https://timharford.com/2021/07/the-tyranny-of-spreadsheets/