by aheifets 4122 days ago
Thank you! Personally, I find it very exciting to be working on these problems.

As for boosting, we have more investigation to do, of course. The tricky part of the biological domain is that the underlying data is incredibly noisy: we know that many of the raw data points are unreliable. The challenge is extracting maximum predictive performance without overfitting to that noise, so any algorithm we use has to handle this scenario deftly.
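To make the noise concern concrete, here is a minimal sketch of how a standard boosting weight update treats unreliable points. It assumes plain AdaBoost with decision stumps on a toy 1-D dataset with two deliberately flipped labels; the thread doesn't say which boosting variant or base learner is in use, and the names `best_stump` and `noisy_weight_share` are made up for illustration.

```python
import math

def best_stump(xs, ys, w):
    """Return (weighted_error, threshold, polarity) of the best decision stump."""
    best = None
    for t in sorted(set(xs)):
        for polarity in (1, -1):
            # Stump predicts `polarity` when x >= t, and `-polarity` otherwise.
            preds = [polarity if x >= t else -polarity for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best

def noisy_weight_share(xs, ys, noisy_idx, rounds):
    """Run AdaBoost and record the total weight on the noisy points after each round."""
    n = len(xs)
    w = [1.0 / n] * n  # start from uniform weights
    shares = []
    for _ in range(rounds):
        err, t, polarity = best_stump(xs, ys, w)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        preds = [polarity if x >= t else -polarity for x in xs]
        # Exponential update: misclassified points gain weight, correct ones lose it.
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
        shares.append(sum(w[i] for i in noisy_idx))
    return shares

# Toy data (an assumption for illustration): the true label is +1 when x >= 10.
xs = list(range(20))
ys = [1 if x >= 10 else -1 for x in xs]
noisy_idx = [3, 15]  # simulate unreliable measurements by flipping two labels
for i in noisy_idx:
    ys[i] = -ys[i]

shares = noisy_weight_share(xs, ys, noisy_idx, rounds=5)
print(shares)
```

The two flipped points start with 10% of the total weight. After the first round the best stump gets only those two wrong, so the update hands them half of the weight, which is the mechanism by which boosting can end up chasing unreliable labels.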

1 comment

Absolutely. There has been some work specifically on boosting in the presence of noise (see for instance [1], and Sec. 12.3.3 of Schapire's book) using branching programs/BDDs as base learners. It's definitely worth taking a look.

[1] http://research.microsoft.com/en-us/um/people/adum/publicati...