by thegrossman 4683 days ago
Also: validation is tricky. We can't just compare the output to ground station observations, because we incorporate ground station data into the model. Eventually I want to generate alternative versions that randomly exclude specific stations, so we can use the excluded stations for comparison.

I think RTMA already includes ground station measurements, so analyzing performance using a leave-N-out strategy wouldn't be a good verification:

http://nomads.ncep.noaa.gov/txt_descriptions/RTMA_doc.shtml
http://eamcweb4.usfs.msu.edu/mm5-case/RAWS/RTMA%20papers/pon...

Instead, I think you'd need to find temperature measurements that are completely independent and use them for verification. Along those lines, I'm not sure how refitting the data to ground stations would produce a better match anywhere except at those ground stations, which would be overfitting. Or are you using ground stations that are truly independent?

When we compare it to RTMA, we leave RTMA out of the list of data sources. Likewise, eventually I'd like to do the same with a subset of the ground stations we use.

(The problem with finding completely independent measurements is that we'd want to use them as an input!)
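
To make the hold-out idea concrete, here is a minimal sketch, not the actual pipeline discussed above. It assumes stations are plain `(x, y, temperature)` tuples and uses inverse-distance weighting as a stand-in for the real model; `split_stations`, `idw_estimate`, and `holdout_mae` are hypothetical names made up for illustration.

```python
import math
import random


def split_stations(stations, holdout_fraction=0.2, seed=0):
    """Randomly withhold a fraction of stations for verification."""
    rng = random.Random(seed)
    shuffled = stations[:]
    rng.shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    # Returns (stations used as input, stations withheld for scoring).
    return shuffled[n_holdout:], shuffled[:n_holdout]


def idw_estimate(train, x, y, power=2.0):
    """Inverse-distance-weighted temperature at (x, y).

    Stand-in for the real model (an assumption, not the method in the thread).
    """
    num = 0.0
    den = 0.0
    for sx, sy, temp in train:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return temp  # exactly on a training station
        w = 1.0 / d ** power
        num += w * temp
        den += w
    return num / den


def holdout_mae(stations, holdout_fraction=0.2, seed=0):
    """Mean absolute error at withheld stations, fitted without them."""
    train, held_out = split_stations(stations, holdout_fraction, seed)
    errors = [abs(idw_estimate(train, x, y) - t) for x, y, t in held_out]
    return sum(errors) / len(errors)
```

Repeating this over several seeds gives a spread of errors rather than a single number. As noted in the thread, the withheld stations are only independent of the inputs you control: if an upstream source such as RTMA already incorporates those same stations, that source would have to be left out as well for the comparison to mean much.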