Hacker News
by oberstein 3869 days ago
The data should speak for themselves. Unfortunately people who don't know statistics think they can't, because as we found out from emails, the data were "manipulated" via statistical methods: https://en.wikipedia.org/wiki/Climatic_Research_Unit_email_c... Honestly I'd rather we press scientists of all sorts for all their source code and modeling tool configurations including the raw data that hasn't been post-processed to hell in an "acceptable" way (recalling that the statistics involved in generating p-values were "acceptable" for psych journals until recently).
2 comments

For any science based on simulation, any member of the public should be able to download the code, compile it, run it on their PC and get exactly the same results. It's the only way to be completely transparent. C and FORTRAN compilers are free.
It's important to recognise that a lot of the model runs are done on super-computers, and the runs themselves aren't always easy to replicate from a practical point of view, because of things like resource limitations and configuration issues.

I certainly agree that things should be open sourced, probably allowing for some kind of embargo system for publication, but we shouldn't allow a rhetoric to develop suggesting that people should be making things easily replicable, because that essentially means making very portable and resource-flexible code, which is not trivial, even in less complex domains, as I'm sure many people on this forum recognise.

As an example, many models make use of hundreds of cores via MPI (along with preposterous amounts of RAM), and whilst I am no expert, I get the impression that porting such things to more commodity hardware is essentially impossible.

As an aside, I have seen more bad-faith practice (i.e. manipulation of results) in industry than in science, and have spent approximately equal periods of time in both.

Today's supercomputer is a GPU in 5 years' time.
Maybe in terms of various metrics (though I'm not so certain things are progressing that quickly), but not in terms of source compatibility, which was the point I was trying to make.
You can find NOAA's current climate model here, with source code, compile scripts, and sample data:

http://www.gfdl.noaa.gov/cm2-5-and-flor

However, as mrow84 points out, you will need a supercomputer, unless you have a lot of time on your hands.

No evidence of wrongdoing was found at the Climatic Research Unit. The only result was that it provided a source of comments to be taken out of context by climate sceptics, which is exactly what Lamar Smith is hoping to achieve here.