by littlestymaar
259 days ago
> The real issue is they tested on data in their training set.

Hm, no. They trained on a part of their synthetic set and tested on another part of the set. Or at least that's what they said they did:

> from which 1,000 were held out as a benchmark test set.

Emphasis mine.

The difference is subtle but important. If we expect the model to truly outperform a general model, it should generalize to a completely independent set.
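
To make the distinction concrete, here is a minimal Python sketch. It is not the authors' code: the `holdout_split` helper, the `synthetic-N` IDs, and the 10,000-example pool size are made up for illustration, and only the 1,000 held-out figure comes from the quote above. It shows that a held-out split shares no examples with the training split, yet both splits still come from the same source.

```python
import random


def holdout_split(examples, n_test, seed=0):
    """Shuffle a copy of `examples` and reserve the last `n_test` items for testing."""
    if not 0 < n_test < len(examples):
        raise ValueError("n_test must be between 1 and len(examples) - 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:-n_test], shuffled[-n_test:]


# Hypothetical stand-in for the synthetic set (pool size is an assumption).
synthetic = [f"synthetic-{i}" for i in range(10_000)]
train, test = holdout_split(synthetic, n_test=1_000)

# No test example was seen during training, so this is not "testing on the training set"...
overlap = set(train) & set(test)

# ...but every example in both splits came out of the same generator,
# so a good score here says little about a completely independent set.
same_source = all(x.startswith("synthetic-") for x in train + test)
```

So the held-out score rules out memorization of specific examples, but it does not test whether the model generalizes beyond the generator's distribution; that would need a benchmark built independently of the synthetic pipeline.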