Hacker News new | ask | show | jobs
by adekok 3132 days ago
Excel shouldn't be used for much other than trivial things:

http://www.sciencemag.org/news/2016/08/one-five-genetics-pap...

https://www.economist.com/blogs/graphicdetail/2016/09/daily-...

he authors found that Microsoft Excel would often interpret “SEPT2”, which corresponds to the gene Septin 2, as “September 2nd”. The programme also tended to mistake identification codes like “2310009E13” for numbers in scientific notation—in this particular instance, the code would be read as 2.310009 times 101

Excel is fine for the home user. But the implicit conversions of input data can play havoc with any complex analysis.

Plus, it's limited precision can cause errors such as subtractive cancellation, etc. In order to correctly calculate complex formulae, the calculations must be done with an understanding of the limitations of the computers.

e.g. you don't calculate (a^2-b^2) for large 'a' and 'b'. Instead, you calculate (a+b)*(a-b). That has the same mathematical result, but is not affected by subtractive cancellation.

3 comments

    > Excel shouldn't be used for much other than trivial things
Good luck with that, but it ain't gonna happen anytime soon.

Many "non-trivial" applications got their start at someone's desk in the form of a shitty excel spreadsheet with horrific macros. Is it optimal for "complex analysis"? No, but it doesn't matter because most things are computationally simple. The hard part is the intricate business logic and the timeliness of getting results and these things are not "trivial" at all.

Instead of complaining about how terrible excel is, the community here should be providing alternatives. And no, engaging a software team/consultant for a million dollars to develop bespoke applications or interfaces to enterprise systems for every little project isn't a viable solution for people who need to get stuff done pronto.

A real alternative to Excel is no small feat.

I agree with that. Modern software development has version control, unit tests, package managers, etc. Ideally we get support for some of those concepts in something that feels like a spreadsheet to the end user. (I'm not sure if it could ever be retrofitted to Excel)
As others have said, you can get version control with Excel by saving the file the SharePoint.

Some huge financial spreadsheets have sanity checks computations. With conditional coloring (turn this cell red if the sanity check fails) these can feel a lot like unit tests. However, these are mostly for business logic - so it's hard to see how a vendor could provide them.

In other words, it's not that excel doesn't have version control and unit tests - it's that the people who wrote excel spreadsheet sheets don't understand version control or unit tests.

> Instead of complaining about how terrible excel is, the community here should be providing alternatives.

Yeah... like I'm going to compete with MS by writing my own Excel replacement. That just isn't realistic.

What would be realistic is for MS to acknowledge that a large proportion of their customers use Excel for a particular purpose. And then tailor their software to the needs of their customer base.

But why would MS care? The researchers already bought Excel. So why "fix" it?

On top of that, no one is aware the the published papers are crap.

The real solution would be for journals to deny publication of papers based on shitty Excel analysis.

This is what screwed me over as a herbarium curator (a herbarium is a type of natural history museum for plants):

Excel completely corrupts dates before the year 1900. It just boggles the mind -- if your spreadsheet has dates for 10,000 years from now that works fine, but if there are any dates from the 19th century or earlier, it can cause big problems.

>Excel is fine for the home user. But the implicit conversions of input data can play havoc with any complex analysis.

Can you not turn all of that off (and only need to turn it off once per sheet/file)?

Not really. With care, you can usually get the format right temporarily, but it often changes for non-obvious reasons and there's no way to see that the format is wrong until it mangles your data.

In general, one of the worst things about Excel is there's no obvious way to lock anything to keep it from getting changed by accident.

Tell that to tens of thousands of researchers.

Heck, even integration has been re-discovered and published:

https://fliptomato.wordpress.com/2007/03/19/medical-research...

If the tool doesn't work correctly by default. Then it's likely the wrong tool for the job.

The default conversions are fine for 99% of use-cases. But no one should mistake Excel for a robust data analysis tool.

Preferences have become a pretty natural part of using a computer, I'm not sure why we can't rely on users trying that when they run into problems.

I would be worried about endless forking of defaults as a feature of newly built tools, leading to a bunch of new tools that will be harder to maintain.

If they can't manage implicit data casting preferences (if those exist, they may not), surely handing them something more complicated like R, MATLAB, or SQL is even worse.

Interesting little article. I just read about Jeff Bezos initiative to push for all departments at Amazon to be accessible as service interfaces. Could there be real interdisciplinary growth by having academic fields concepts and data accessible in the same way? Maybe starting with papers. But then if every model developed was some kind of living running service in the cloud or hosted on .edu you could query or interact with