Hacker News new | ask | show | jobs
by thiele 4749 days ago
There is low hanging fruit in the problems to solve though. For example, finding and highlighting outliers in a column. It's a relatively simple feature but yields big benefits to users.
2 comments

Except it isn't that simple. Your definition of an outlier may be completely different from some one else's (95th percentile, median/regression method, or simple max/min, some of these are already possible in excel). Then, how is the information displayed? Highlighting cells? Well if you have +10k rows of data, than that won't be too useful. Creating a summary in another column/sheet will add to the already cluttered 'results' spreadsheet, etc...

This is why I recommend using programming based tools for data analysis. Any non-programming tool has to find a balance between the number of features offered and the complexity/ease-of-use of the tool. With programming tools, you merely have to find the right package (or build your own) which essentially results in getting the exact set of features that you need to solve your problem.

I know people bag on VBA, but I'm a big advocate of effectively utilizing Excels tools along side using VBA to customize where necessary. This solves the problem in your first comment (users can define their own statistical limits and methods), while keeping the simplicity of excel and its tools for other users.

The problem is that not everyone is a programmer. A "programming" tool might be great for you, but when you show it to co-workers for them to work with, they'll inevitably ask "Can I get this in Excel?". Excel+VBA allows for customization when the "complexity/ease-of-use" balance is out of sync with your needs, but to everyone else it's still just Excel.

With this, those that can program have the option to solve the problem their way, while allowing those that don't program the means to solve the problem the way their used to.

I have used VBA here and there, and try not to bag on it too much. There are some cases where using it can make sense, but in general, it seems to be a compromise.

I get a little confused about the programmer/non-programmer dichotomy. If you are capable of implementing a complex model in excel, you are probably more than capable of learning a programming language. If you are just tallying up a few numbers to throw into a report or presentation, then yeah, no need to switch. As I mentioned in another post, it probably has to do with exposure and motivation.

Excel is a tool, and it's a really good tool for what it does, which is for applications that require: - very fast iteration cycles - a particular data model: grid of numbers which is very common in business world - support for everything under the sun: persistence (just hit save :), dialog UI, math formula, stats models, string functions, date/time functions (better than even Java's Joda), internationalization, localization, utf 8, and plenty more
Excel already does this to a degree. If a cell has a formula that is inconsistent with its neighbors, it will be flagged. If a formula omits cells adjacent to the range it is working with, it will be flagged.

Check out Options: Formulas: Error checking rules. It's almost as if Microsoft has been continually developing this software for years and years and they have seen many common mistakes. ;)