Hacker News new | ask | show | jobs
by zweep 2306 days ago
90% of the world’s software developers don’t realize that they are software developers and their language is Excel formulas.
1 comments

While that's true, a signification fraction of the remaining 10% is software devs cleaning up after the damage done by people who have over-estimated their ability and created a bunch of business solutions in VBA...
Those same business solutions have generated more value than costs.
You only say that because the value you mention is usually visible whereas the costs are usually hidden or not obvious.

For example, in my experience Excel is really good at providing job security to those who develop complex spreadsheets. But the cost to the organization tends to be enormous as well, because spreadsheets are impossible to scale, which in turn makes the processes that rely on them impossible to scale.

Any business that runs into that sort of scaling problem is already doing quite well for itself. How many of them would have gotten to that point if they needed to hire a software engineer to make every prototype that eventually led to the final working system?
I currently work for a business that has done quite well replacing Excel solutions with proper databases. We've literally worked through the collapse of one company (and its rebirth after it was merged with something else) partially due to the sheer impossibility of managing Excel solutions that had grown out of control. Yes, they mainly grew out of control due to management problems, but the way they did so was remarkable in its rapidity and absurdity. Users were being bought machines with 16gb RAM so they could actually track their sales.
16GB of RAM costs about as much as employing a programmer for a day.
Do you have a workflow or methodology that you use or do you analyse and develop from scratch?

What are you replacing the spreadsheets with, desktop to web apps?

The people developing complex spreadsheets have job security because they are domain experts. They can't be replaced with a generic developer, even if the spreadsheet is converted into custom code.
Yep. I've seen this pattern of "We need to convert this Excel-based process to an application". Then it takes the domain export + business analyst + developer to make any changes to the process. The final product is more polished, but way more expensive and slower to iterate.
plus, when a generic developer writes the custom code it eventually evolves into an endless cycle of implementing Excel features the domain expert is use to having a click or two away.
Plus, in my observation, there is often a dismissive attitude towards Excel among the generic developers which blinds them to the risk of this “re-implementing Excel’s features” cycle.

I worked on a team that helped business users who had "outgrown excel", i.e., they had hung themselves with the rope Excel provides. Almost always their scaling problems were solved simply by better Excel practices: better management of the calculation mode; setting the RTD throttle interval to 1-2 seconds; replacing Bloomberg's streaming data function (BDP), with the native alternative {=RTD("BLOOMBERG.RTD...)}; optimizing the division of labor between what is done on the sheet with formulas versus with vba\xll code; meta programming, i.e., creating all or part of your calc sheets and formulas with code so that you get the understandability and observability of Excel formulas while avoiding the things that are hard to do with formulas, such as grouping, joining, filtering, looping; making 3rd-party add-ins workbook-specific such that they're not always on (many of these add-ins listen to application-level events like selection_change and on_calculation which can diminish the performance of all open Excel models, even the ones that don't use that add-in).

I am continually telling my engineering manager that just because he can do somethign in Excel it in no way means that he should.

He is a smart guy, but refuses to learn even basic Python or R, despite doing some very significant statistical work in an area related to preventing machines harming humans.

I just wonder, even if initially perfect, how many spreadsheets have been unknowingly perverted by someone sitting on the mouse, or a pet cat treading the keyboard.

Indeed, Excel is usually not the right tool. It is very powerful, but organization of the code/logic, correctness of the calculations, and readability/maintainability are all left on the developer. Mistakes are hard to spot. The levels of discipline and meticulousness needed to use the tool well are high.

I have at times remarked that typical Microsoft Office installations should exclude Excel! :-)

And I am saying all this about Excel even after being a power user myself and the primary author of the OP. :-)

Mathcad, Mathematica, etc. could be good alternatives if not Python or R.

> because spreadsheets are impossible to scale, which in turn makes the processes that rely on them impossible to scale.

Spreadsheets are entirely possible to scale, that just tends to be a skillset in and of itself, and domain experts organically creating a complex spreadsheet likely don't have the background in process engineering and software design principles to do so themselves.

Generally speaking, you can usually refactor an unmaintainable and complex spreadsheet to mimic software engineering best practices, all without dropping down to VBA. Leveraging named ranges[1], locked cells[2] and formulas[3], data validation[4], and error handling[5]. Combined with some defensive validation checks, you can generally sort out the complexity issue nicely.

Additional enhancements (based on Excel 2010+) can be made using tables and structured references[6], factoring out "data" worksheets into their own workbooks[7] and linking to them from the "user" workbook, adding relational integrity via a data model[8], or scaling data size via PowerQuery[9] (which stores data within the Excel file in a highly compressed, columnar format that transparently gets processed by a local instance of the same VertiPaq engine[10] that powers SQL Server Analysis Services)

You can also drop down to VBA or Javascript[11] if you truly want/need to jump out of the rails of the built in options above. Or in more common cases (which leads to the hell-to-maintain spreadsheets that are more common), if you want to bypass all of the nifty built-in functionality above and do something quick-n-dirty. But if you leverage the above capabilities, you can mature a spreadsheet-based solution quite well and have a battle-tested, stable PoC that can be handed off to a software developer for migrating into a more permanent application.

[1] https://trumpexcel.com/named-ranges-in-excel/

[2] https://support.office.com/en-us/article/lock-cells-to-prote...

[3] https://support.office.com/en-us/article/display-or-hide-for...

[4] https://support.office.com/en-us/article/more-on-data-valida...

[5] https://www.exceltactics.com/definitive-guide-excel-error-ty...

[6] https://support.office.com/en-us/article/overview-of-excel-t...

[7] https://www.microsoftpressstore.com/articles/article.aspx?p=...

[8] https://support.office.com/en-us/article/create-a-data-model...

[9] https://en.wikipedia.org/wiki/Power_Pivot

[10] https://www.microsoftpressstore.com/articles/article.aspx?p=...

[11] https://docs.microsoft.com/en-us/office/dev/add-ins/excel/ex...

Good point. I've seen miracles occur simply by refactoring an Excel workbook such that the inputs, outputs, and calculations to map the inputs to the outputs were all delineated into their own clearly-formatted bounded context. And also, when you do drop into vba code, simply organizing the code so that interactions with the sheet are easier to understand, and isolated -- and focusing on "seriality" (no feedback loops, minimize code triggered by events) can transmogrify vba code from the Bogeyman into a friendly neighbor.
For sure with the transmogrification! If the rest of your workbook is properly designed, dropping into VBA for some array operations and direct references[1] can drastically improve performance as opposed to per-cell calculations and looping. And can cleanly compartmentalize functionality in a way that can be refactored into other languages during post-spreadsheet migration as needed.

Although if it fits your usage, PowerQuery and M[2] can be even more performant, if for no other reason than the data being in a more efficient/compressed format. With the nice side effect of creating logic that's transferable as-is from Excel to PowerBI or SQL Server Analysis Services (making for a clean migration path as your solution matures).

[1] https://www.microsoft.com/en-us/microsoft-365/blog/2009/03/1...

[2] https://docs.microsoft.com/en-us/powerquery-m/

You haven't seen anything until you have seen large process plant importing and exporting data from Excel via OPC to allow the non programming process engineers to tweak performance.

Really not the intended use or appropriate criticality management of Excel as a tool, let alone the O/S it runs on, yet it is done.