by ececconi 3912 days ago
I've been working exclusively on implementing BI solutions for the past five years. The thing that depresses me the most is not that BI solutions take forever to implement and cost a lot, but that clients often just don't understand the data they are trying to report on. That often leads to over-engineered BI solutions built for a single report that a client says is mission critical, but that is never used.

I'm sure more technology focused companies don't have any issues using these self-service models, but you wouldn't believe the innumeracy that some people have in industry.


> one report that a client says is mission critical, but is never used.

My dad started developing software in the late 60s. As a kid (let's say circa 1982, definitely in the minicomputer era), I remember him talking about a problem at work: to do all their daily processing, they needed about 28 hours. A lot of the workload was reporting, so he asked managers what reports were no longer useful. Naturally, he was assured that every report was absolutely vital to proper functioning.

His solution was just to start dropping reports. If anybody complained, he'd put them back in the job list. A significant number of reports went unlamented, and soon the computer was able to complete its daily workload handily.

The lesson I took from this is that expressed desire is often very different than actual need, so separating the two can pay big dividends. I've never used that trick, but the lesson runs all through my methods.

That's an excellent example of loss aversion. https://en.wikipedia.org/wiki/Loss_aversion
Well, the only issue I can see with that nowadays is if one such report is actually needed only for compliance purposes. Then, five years down the line, an audit happens and 1825 copies of that required daily report happen to be missing...
Well at least they didn't worry about the report on leap days.
Then someone should be verifying that they're being created. Same idea as with backups, where you're supposed to check and verify them even if you're not "using" them.
We do the same thing with physical servers and network ports. If no one knows what it does, unplug it and see what breaks or who complains. :)
There is an interesting phenomenon that I've noticed about BI: By the time data is appropriately gathered, cleaned, aggregated, and presented, the data is so closely aligned to the decision process that the decisions themselves might as well be automated and optimized.

But that never happens. We build reports so that executives can look at them and feel important while they make slower and less optimal decisions than computers could make.

This is the heart and soul of quantitative investment management.
Interesting. Mind going more in-depth?
Whilst I agree with the feeling, I think a lot of BI devs have a tendency to throw the baby out with the bathwater.

Decision making is a complex process. The graphs and data fed to the executive via BI are just inputs to the deep net of his brain, which has been trained on the "data" absorbed over decades of experience. The objectives themselves are not simple to model - management is a delicate balancing act between competing stakeholders. The job of the BI professional is not to just produce what he is told, but to figure out what problem the person making the request is trying to solve, and then solve it in the simplest way possible. Occasionally this requires teaching them some things.

Taking an example: imagine you have an engine vibrating normally. You want to set up an alarm that rings if the engine vibrates abnormally - specifically, if a new frequency is added to the existing signal (maybe it indicates a screw is coming off or something). You can feed the signal as is to your algorithm, or you can put it through an FFT, in which case the "signal" is just a bunch of peaks at each frequency, and your algorithm is literally just a switch (if the peak at frequency f reaches amplitude A, trigger the alarm). The switch is orders of magnitude simpler, cognitively, than the algorithm that is fed the raw signal; it's also likely to be more accurate. Feature engineering is almost the most important part of statistical learning.
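To make the "switch" concrete, here is a minimal sketch. It is not a full FFT: since the alarm only cares about one frequency, it evaluates a single DFT bin with the standard library. The sample rate, the 50 Hz "normal" vibration, the 120 Hz fault frequency, and the 0.1 threshold are all made-up numbers for illustration.

```python
import cmath
import math


def amplitude_at(signal, sample_rate, freq):
    """Amplitude of the sinusoidal component at `freq` (Hz), from one DFT bin."""
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq * k / sample_rate)
        for k, x in enumerate(signal)
    )
    # For a real sinusoid the energy is split between +freq and -freq,
    # hence the factor of 2.
    return 2 * abs(acc) / n


def alarm(signal, sample_rate, fault_freq, threshold):
    """The 'switch': trigger if the peak at fault_freq reaches the threshold."""
    return amplitude_at(signal, sample_rate, fault_freq) >= threshold


def make_signal(sample_rate, seconds, components):
    """Sum of sine waves; components is a list of (frequency_hz, amplitude)."""
    n = int(sample_rate * seconds)
    return [
        sum(a * math.sin(2 * math.pi * f * k / sample_rate) for f, a in components)
        for k in range(n)
    ]


RATE = 1000  # samples per second (assumed)
normal = make_signal(RATE, 1.0, [(50, 1.0)])               # healthy engine
faulty = make_signal(RATE, 1.0, [(50, 1.0), (120, 0.3)])   # new 120 Hz component

print(alarm(normal, RATE, fault_freq=120, threshold=0.1))
print(alarm(faulty, RATE, fault_freq=120, threshold=0.1))
```

All the work is in the transform; the decision itself is a single comparison, which is the point of the analogy.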

The executive is like the alarm - pre-processing the signal is your job. They are used to simple tools, usually univariate and linear; at a stretch, some can deal with simple polynomials. The better you pre-process the signal, the easier it becomes for the executive to make a correct decision by associating the new data with whatever decades of experience he trained his brain on.

A concrete example: let's say your CMO has asked you to give him vouchers and new customers for the last 6 months. He's clearly trying to establish the relationship between his voucher campaigns and new customers. You can give him the vouchers and the new customers, daily/weekly/whatever, and put it on a nice Tableau graph and give it to him and forget about it... or you can confirm that the problem he is looking to solve is indeed the relationship between his campaigns and new customers.

At which point you ask why new customers? And you find that he has a theory that gaining new customers is the best way to increase revenue, and the objective of the company is to increase revenue (due to incoming fundraising round whose valuation is based on revenue and revenue growth), but it is short term cash flow constrained (hence looking at vouchers instead of, say, marketing spend).

Since he probably doesn't know that a model can have more than one variable, you explain that to him and brainstorm what other variables might impact revenue growth. Assuming you're trying to predict new customers per income statement dollar and new customers per cash flow dollar, you might find that online marketing spend and season are two significant variables, and that there is a significant interaction term between vouchers and marketing spend of certain types. It's now your job to explain that "formula" - standard error of coefficients included - to the executive, and brainstorm what output is required to make him be able to quickly check how this input has changed over time (which might just be.. an alarm). You'll also have to explain the measure of fit you are using (R-squared almost always wins by virtue of being very intuitive).
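As a rough sketch of what such a "formula" looks like, the snippet below fits new customers against vouchers, marketing spend, and their interaction by ordinary least squares, then reports R-squared. Everything here is an assumption for illustration: the data is synthetic and noise-free, the variable names are hypothetical, season is left out, and standard errors are omitted to keep it short. In practice you would use a statistics package rather than hand-rolled normal equations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def r_squared(rows, y, coef):
    """Share of the variance in y explained by the fitted model."""
    pred = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def design_row(vouchers, spend):
    """Intercept, two main effects, and the vouchers x spend interaction."""
    return [1.0, vouchers, spend, vouchers * spend]


# Synthetic observations: (vouchers issued, marketing spend) per period.
data = [(v, s) for v in (0, 10, 20, 30) for s in (0, 5, 10)]
rows = [design_row(v, s) for v, s in data]
# Made-up "true" relationship used to generate new-customer counts.
y = [100 + 2.0 * v + 3.0 * s + 0.5 * v * s for v, s in data]

coef = fit_ols(rows, y)
r2 = r_squared(rows, y, coef)
print([round(c, 3) for c in coef], round(r2, 3))
```

The interaction coefficient is the part a one-variable chart can never show: the effect of a voucher depends on how much marketing spend is running alongside it.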

And of course, this is what I identify as the gap that none of these BI products can ever hope to fill. You need someone technical with a full grasp of all the data sources of the company - AND the implicit and explicit data model of the business - who also happens to be continuously involved with management discussions and at least moderately aware of the business. Most companies have a setup whereby the BI team is some sort of self-service restaurant where the executive swoops in, gets his request processed, and swoops back out. Many prefer hiring young, inexperienced BI staff because BI is seen as a cost centre, and because the way they "scale" requests is by adding headcount. One offshoot of this is that the executive starts wanting a Tableau, something with all the data neatly prepared that can be dragged and dropped into the 1-dimensional models that he uses to pre-process his company data.

The upshot is that it's not that executives are stupid, but more along the lines of GIGO. Without the tools required to make sense of the signals they receive, executives cannot make the right decisions even if they have the right experience and thinking. I suspect a large part of why more experienced executives are smarter is that they learn to spot trends over decades of experience based on very simple signals; for example, an experienced hedge fund manager will sniff out a fraudulent company much faster than someone who has just started, just by looking at the financial reports.

I agree and am trying to solve that problem. Interested in collaborating? Send me an email (address in user profile).
Forget the data. Many companies don't even understand their own processes. I've gone to so many HR departments and asked them how their hiring process works and met blank stares, because there wasn't a single person who knew the entire process from start to finish. Each individual knew bits and pieces, and made assumptions on how the rest of the process worked. Even though my main role is software implementation (enterprise), I spend a ton of time interviewing people and putting those pieces together. It's probably the least favorite part of my job.
> Each individual knew bits and pieces, and made assumptions on how the rest of the process worked.

There are at least two ways that can come about:

1. laziness/apathy/incompetence

2. Someone in a more senior position culling anyone who knows too much or works too hard

I've seen a lot of both.

Yep. Any exec or manager looking to buy this service as a solution to their reporting needs should sit down and have a serious talk with their analysts or teams who build their reports first.

Innumeracy is why the mantra of asking "Why?" instead of "How?" also applies to BI reporting. A flashy new tool or report isn't going to help as much as having a builder with industry knowledge or experience.

They're not mutually exclusive options.
Technology focused companies can have the same issues. Building a clear understanding requires not just having the data available (the RIGHT data obviously, with clear and extensive documentation on specific definitions and business meanings), but also educating people within the organization on how to look at, evaluate, and understand the different pieces of a business.

The good thing though is that there's a tipping point, where once ~1/3 or so of people fully grok how to evaluate and understand the metrics they're looking at, those who do understand start helping (or calling out) those that don't, lifting competence throughout the organization.

I know marketing directors of large retailers who can't figure out what 10% off a figure is, and get confused as to which is gross and which is net. End up explaining VAT at least once a week.

It's cool to be innumerate, though, and if you can't do this stuff yourself there's always a nerd to blame somewhere nearby.
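For anyone who wants the arithmetic from the comment above spelled out, here is a small sketch. The 20% VAT rate and the 120.00 shelf price are assumptions picked for round numbers; the function names are made up for illustration.

```python
from decimal import Decimal

VAT_RATE = Decimal("0.20")  # assumed rate; substitute the one that applies


def gross_from_net(net):
    """Gross (VAT-inclusive) price from a net (VAT-exclusive) price."""
    return net * (1 + VAT_RATE)


def net_from_gross(gross):
    """Net price from a gross price: divide by (1 + rate), don't subtract rate * gross."""
    return gross / (1 + VAT_RATE)


def apply_discount(price, percent):
    """Price after taking `percent` off."""
    return price * (1 - Decimal(percent) / 100)


shelf_price = Decimal("120.00")                # gross, as the customer sees it
discounted = apply_discount(shelf_price, 10)   # 10% off the gross figure
net_after_discount = net_from_gross(discounted)

print(discounted, net_after_discount)
```

The usual mistake is computing net as gross minus 20% of gross, which gives 96 instead of 100 on a 120 gross price.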