by bioweek 5987 days ago
Speaking of reproducibility, my friend getting a PhD in finance told me he was writing a paper using the data from some brokerage. I asked if he would publish the data and he told me it's confidential.

I talked myself blue in the face trying to explain how science doesn't work if you don't give people enough information to reproduce your research! I couldn't get him to understand though. Arggh so frustrating.

3 comments

It depends on what type of science you're doing, and what stage the science is in. When we are at the data-gathering, hypothesis-building stage, observations and case studies are important, and they are often based on confidential data.

High-resolution tick data is usually under NDA, but anything slower than a few ticks per minute is free to publish if you capture it yourself.

If data feeds are expensive, you can run trading software on your own machines and capture the ticks.
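
For what it's worth, capturing the ticks yourself could look roughly like the sketch below. It is only an illustration: `capture_ticks` and the `(symbol, price)` feed shape are made up for the example, and whatever trading software you run will have its own way of handing you quotes.

```python
import csv
from datetime import datetime, timezone
from typing import Iterable, Tuple


def capture_ticks(feed: Iterable[Tuple[str, float]], path: str) -> int:
    """Append each (symbol, price) tick from `feed` to a CSV file.

    `feed` is a hypothetical stand-in for whatever your trading software
    exposes; here it is just any iterable of (symbol, price) pairs.
    Each row gets a UTC capture timestamp so others can see when the
    data was collected. Returns the number of ticks written.
    """
    count = 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for symbol, price in feed:
            captured_at = datetime.now(timezone.utc).isoformat()
            writer.writerow([captured_at, symbol, price])
            count += 1
    return count
```

Calling something like `capture_ticks(my_feed, "ticks.csv")` leaves you with a file you collected yourself, along with a record of when each tick was captured.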

Assuming the data was some sort of financial time series, that means lots of people already have the data (or something close enough to it). So even though he can't publish the data, he can say what the data is and when and how it was collected, and that should be enough for most people in the field to reproduce the results.