by coverup 2193 days ago
Yep, you and the user you're replying to are both right in different ways. One thing's for sure - machines don't generate "insights" on their own.

Let's define an "insight" as "new meaningful knowledge", just for fun. We could talk about what comprises "new" and "meaningful" but it would be beside the point I'm making.

In a supervised learning problem, the range of possible outputs is already known, meaning the model output will never be categorically different from what was in the training data. The knowledge obtained is meaningful as long as the training labels are meaningful, but it can never be new.
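
To make that concrete, here is a minimal sketch using a toy nearest-centroid classifier in plain Python. The sensor readings, the label names, and the helper functions `fit_centroids` and `predict` are all made up for illustration. Whatever reading goes in, what comes out is one of the labels that was in the training set.

```python
from collections import defaultdict

def fit_centroids(samples, labels):
    """Average the feature vectors of each label seen in training."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in grouped.items()
    }

def predict(centroids, x):
    """Return the training label whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

# Hypothetical labeled sensor readings: (temperature, vibration)
train_x = [(60, 0.1), (62, 0.2), (95, 0.9), (98, 1.1)]
train_y = ["ok", "ok", "bearing_failure", "bearing_failure"]
centroids = fit_centroids(train_x, train_y)

# A reading unlike anything in training still gets one of the known labels.
novel = (20, 5.0)
prediction = predict(centroids, novel)
print(prediction)
```

The model has no way to say "this is a third kind of thing"; it can only choose among the categories it was given.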

Unsupervised learning doesn't have labeled training data, but that means an unsupervised model's output requires additional interpretation in order to be meaningful. It is possible to uncover new structures and identify anomalies in new ways, but this knowledge isn't meaningful until someone comes in and interprets it.
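
A rough sketch of the unsupervised side, assuming a simple z-score anomaly detector over a single vibration signal. The numbers, the threshold of 3.0, and the function name `anomaly_scores` are assumptions for illustration only. The detector can say a reading is unusual, but nothing in its output says what the unusual reading means.

```python
from statistics import mean, pstdev

def anomaly_scores(history, readings):
    """Score each reading by its distance from the historical mean,
    measured in historical standard deviations (a z-score)."""
    mu = mean(history)
    sigma = pstdev(history)
    return [abs(r - mu) / sigma for r in readings]

# Hypothetical unlabeled vibration history and two new readings
history = [0.10, 0.12, 0.11, 0.09, 0.10, 0.13, 0.11]
new_readings = [0.11, 0.45]

scores = anomaly_scores(history, new_readings)

# Flag anything more than 3 standard deviations out (arbitrary threshold).
flagged = [r for r, s in zip(new_readings, scores) if s > 3.0]
print(flagged)
```

The second reading gets flagged, but whether it points to an imminent failure, a recalibrated sensor, or a harmless change in load is still a question for someone who knows the machine.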

Applied to the specific example where sensor data is used to try to generate insights about machine functionality: Either you can only predict the types of failures you've already seen, or you can identify states you've never seen but you wouldn't know whether they mean the system is likely to fail soon or not.

It's the Roth vs. traditional 401(k) tradeoff. For model output to be useful, someone must pay an interpretation tax. The only choice is whether it is paid upon insight deposit or withdrawal.

3 comments

> Either you can only predict the types of failures you've already seen, or you can identify states you've never seen but you wouldn't know whether they mean the system is likely to fail soon or not.

Yup, this is something I've seen from both sides. The first you mention is basically the standard, while the second is part of the deep learning voodoo black magic that executives and sales love.

I've had people approach me with proposals like "What if we just churn [ALL OF] our data through this or that model, and let's see if it comes up with some patterns we've never seen or thought about"

And that's not just for industrial applications. It's everywhere.

What is concerning to me is that this mentality will surely induce more unrealistic expectations. Before you know it, business execs are starting to ask why we need business analysts at all, because surely those fancy deep neural networks can extract all kinds of features - "only need data scientists to figure out those things".

So yeah, that's my fear. That businesses will blindly start to discard domain knowledge, and just feed black-box models their data, and let the data scientists wrestle with the results.

It’s not only outputs/labels that provide “insights”.

Knowing how the outputs relate to the inputs is where most new insights could come from.

For example, which feature (input) is driving the “failure” of the machine (output/prediction)?

This is where ML explainability comes in.
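
One common explainability technique is permutation importance: shuffle one input column and see how much the model's accuracy drops. Below is a minimal plain-Python sketch of that idea. The `model` function is a hand-written stand-in for a trained model, and the data, feature names, and helper names are hypothetical.

```python
import random

def model(row):
    """Stand-in for a trained model: flags failure when vibration is high."""
    temperature, vibration = row
    return 1 if vibration > 0.5 else 0

def accuracy(rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(rows, labels, column, repeats=20, seed=0):
    """Mean accuracy drop when one input column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced.
        permuted = [
            tuple(v if i == column else r[i] for i in range(len(r)))
            for r, v in zip(rows, shuffled)
        ]
        drops.append(baseline - accuracy(permuted, labels))
    return sum(drops) / repeats

# Hypothetical readings: (temperature, vibration), label 1 = failure
rows = [(60, 0.1), (95, 0.2), (62, 0.9), (98, 1.1), (61, 0.15), (97, 1.0)]
labels = [0, 0, 1, 1, 0, 1]

importances = {
    name: permutation_importance(rows, labels, i)
    for i, name in enumerate(["temperature", "vibration"])
}
print(importances)
```

Shuffling temperature costs nothing because the stand-in model never looks at it, while shuffling vibration hurts accuracy. That ranking is the kind of input-to-output relationship this comment is pointing at, though deciding what it means for the machine still takes someone with domain knowledge.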

This is demonstrably false; AlphaGo made significant new discoveries, for example.

Yeah this is where it would have helped if I had discussed what I meant by "new".

AlphaGo is trained against a predefined objective (supervised learning on human games, then reinforcement learning through self-play) and outputs strong Go moves given opposing play. It yields new discoveries in the same sense that a model designed to predict mechanical failures from labeled sensor data would: I didn't know what the model was going to predict until it predicted it, and now I know.

But what the factory owners want is a machine that can take raw, unlabeled sensor data and predict mechanical failures from that. They want insights. "Why not just feed all our data into the model and just see what comes out?" they ask. "I don't see why we need to hire at all if we have this neural net."

The reason you need a human somewhere in the system if you want insights is that someone needed to program AlphaGo specifically to try to win at Go. At the factory, someone needs to tell the machine what a mechanical failure is, in terms of the data, before it can successfully predict them.

So both "winning at Go" and "mechanical failure" are states that the system has already been programmed to recognize. That's what I mean when I say a supervised learner cannot generate "new" output.