by sgt101 4084 days ago
So classifiers with no theory behind them (random classifiers), aggregated to optimize on a non-representative hold-out set, end up forming a theory of that set? I think this is expected. If you instead create classifiers that express some domain theory on the training set in step 1, and use the information in the hold-out set differently, you'll do a lot better (I believe; well, I think I saw that result when I did my Ph.D. 17 years ago).
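
To make the first point concrete, here is a rough sketch, not anything from the paper or the article. It assumes the "random classifiers" are pure coin-flippers and the labels carry no signal at all; the function names and the sizes (500 classifiers, keep 25, 50 hold-out points) are made up for illustration. Picking the classifiers that happen to score best on the hold-out set and majority-voting them should look good on that set and like chance on fresh data.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def majority_vote(pred_rows):
    """Combine per-classifier 0/1 predictions by majority vote per example."""
    n = len(pred_rows)
    return [int(2 * sum(col) > n) for col in zip(*pred_rows)]


def run(seed=0, n_classifiers=500, n_keep=25, n_holdout=50, n_fresh=2000):
    # All sizes are illustrative assumptions; n_keep is odd to avoid ties.
    rng = random.Random(seed)

    def coin(n):
        return [rng.randint(0, 1) for _ in range(n)]

    # Labels are random, so there is nothing real to learn.
    holdout_y, fresh_y = coin(n_holdout), coin(n_fresh)

    # Each "classifier" is a coin-flipper: random guesses on both sets.
    classifiers = [(coin(n_holdout), coin(n_fresh)) for _ in range(n_classifiers)]

    # Keep the classifiers that happen to score best on the hold-out set.
    classifiers.sort(key=lambda c: accuracy(c[0], holdout_y), reverse=True)
    kept = classifiers[:n_keep]

    holdout_acc = accuracy(majority_vote([c[0] for c in kept]), holdout_y)
    fresh_acc = accuracy(majority_vote([c[1] for c in kept]), fresh_y)
    return holdout_acc, fresh_acc


if __name__ == "__main__":
    holdout_acc, fresh_acc = run()
    print(f"hold-out accuracy: {holdout_acc:.2f}")
    print(f"fresh-data accuracy: {fresh_acc:.2f}")
```

The gap between the two numbers is the "theory on that set": the selection step has fitted the hold-out set and nothing else.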

Here is a very, very bad and very, very old AAAI workshop paper that sums up the idea (the journal paper is behind a paywall).

http://aaaipress.org/Papers/Workshops/1999/WS-99-06/WS99-06-...