by ethbr1 961 days ago
I'd put it this way: characterizing the reliability of out-of-sample performance a priori is impossible, but that doesn't mean out-of-sample performance automatically fails.

There may be a subtle correlation between properties needed to answer a specific out-of-sample request and in-sample features.

Unfortunately, prior to training/testing and without recognizing that correlation in the data set, I believe it's impossible to guarantee the model will include it. (Corrections welcome)
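A toy sketch of what I mean, under loud assumptions: `linear_world` and `saturating_world` are hypothetical ground truths I made up for illustration, and the "model" is a plain least-squares line. Both worlds produce identical training data on [0, 1], so nothing in-sample can tell you which one you're in, yet the same fitted model extrapolates well in one and badly in the other.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def mse(a, b, xs, world):
    """Mean squared error of the fitted line against a ground-truth function."""
    return sum((a * x + b - world(x)) ** 2 for x in xs) / len(xs)


# Hypothetical ground truths: they agree on [0, 1) and diverge beyond it.
def linear_world(x):
    return x


def saturating_world(x):
    return min(x, 1.0)


rng = random.Random(0)
train_x = [rng.uniform(0.0, 1.0) for _ in range(200)]  # in-sample range
out_x = [rng.uniform(2.0, 3.0) for _ in range(200)]    # out-of-sample range

results = {}
for world in (linear_world, saturating_world):
    a, b = fit_line(train_x, [world(x) for x in train_x])
    results[world.__name__] = (mse(a, b, train_x, world), mse(a, b, out_x, world))

for name, (in_err, out_err) in results.items():
    print(f"{name}: in-sample MSE={in_err:.3g}, out-of-sample MSE={out_err:.3g}")
```

The in-sample error is essentially zero in both cases; only testing on the out-of-sample range separates them.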

In essence: “You can't know in advance how far the model can approximate semantic patterns”

So claiming that out-of-sample performance is a mirage would be a bridge too far?

Maybe "a mirage that might actually be true"? Which is a terrible thing to rely on! Unless it's usually true?
That measurement is the core of my current tasks. If you don’t know the error rate, then what are you doing?
Delivering what some executive promised when they told investors 'the company is using AI.' /s
A virtual beer/poison of choice to you and mjburgess in this thread.