by malshe 3021 days ago
Just below what you reproduced, they write:

When we examined the confidence intervals for the remaining 8 tests, we found that for half, the mean of accuracies for features written over synthesized data was higher than for those written on the control dataset.

In other words, in 4 of the remaining 8 cases, the models on synthetic data performed better.

1 comment

Yes, I did leave that out, as I think it's still an issue. A model on synthetic data performing better is a little dubious, since the modeled distribution has less information than the original one. Overall, the discrepancy seems more important to notice than the actual performance.