HN Mirror


by phoenixstrike 1102 days ago

This is good research.

>One thing that troubles me in the paper is that the researchers appear to have gone looking for precursor patterns in an ad hoc way, with no physical theory in mind, just trying different binning techniques and delays until they got a signal.

There is nothing wrong with this. In fact this is how most science is done. This is pure experiment - try things and see what comes up.

You're conflating this step with step three of the general way things have traditionally been done in physics:

1. An experiment shows a previously unexplained phenomenon.

2. A theory is made to explain the results and predict the results of a future experiment.

3. A future experiment is undertaken with this theory in mind, to see if it has predictive power. If the predictions are correct, it is a good theory.

Your comment is referring to step three. The experiment in the paper is step one.

4 comments

dekhn 1102 days ago

While I don't disagree with your description, there is an awful lot of scientific output which is really just fishing for significance (i.e., running lots of tests without corrections), publishing, and claiming credit for discovering something, and it never actually gets followed up on to see if the claim generalizes.

https://en.wikipedia.org/wiki/Data_dredging

I think probably the most important thing is to get scientists to spend more time identifying and teasing out correlated variables to identify plausible mechanisms.


ordu 1102 days ago

This is mostly a problem of the social sciences. In physics it is important to have a theoretical explanation; if there is no explanation, physicists become excited and start to dig really hard. They value theoretical explanations, not correlations. In contrast, the social sciences lack a good theory, so they substitute a quantity of theories for the quality of a theory. As a result, even important correlations are often impossible to explain without resorting to ad hoc theories.

You can see in this article that the authors already suggest a theoretical explanation, and I do not doubt that we'll see follow-up studies trying to clarify the situation.


dekhn 1102 days ago

It's quite common in medical research, even highly quantitative work. And I've seen it in every field I've worked in, which spans biology, physics, and chemistry, typically with a quantitative bent.

I once had an advisor edit my draft overnight and submit it as a paper with a bunch of juiced-up numbers that weren't true, but made sense to the advisor even if the underlying scripts I ran didn't support them. I complained to them and the paper was withdrawn before publication, and I immediately left their group. This was in quantitative biology: hard-core bioinformatics with very sophisticated modelling.

But yeah, real experimental physics is hard to fake since reproduction is usually more straightforward than in other fields.


busyant 1102 days ago

> I once had an advisor edit my draft overnight and submit it as a paper with a bunch of juiced-up numbers that weren't true,

I'm stating the obvious here, but that is not a good advisor in any sense. It must have been difficult to leave, but it would be the only reasonable response.


dekhn 1102 days ago

Well, if I'd wanted a career in science and didn't have ethics, then they would have been a good advisor because they knew exactly how to ride their wave of falsehood to a professorship at Berkeley.

It wasn't hard to leave; I just contacted another professor at Berkeley and joined their lab the next day. The new advisor, while fairly dull, was methodical and pedantic, and the idea of faking or juicing results would probably never have occurred to him.

In short, in science if you're not a super-genius, it can be hard to compete with the super-geniuses and the cheaters. I found it easier to move to computer engineering than stay in science.


busyant 1101 days ago

So, you're basically telling me that there is (or was) a bioinformatics prof @ Berkeley who was fucking with the data.

Yeeeesh.

I guess my science career was relatively clean. I knew a few fellow students who got screwed over by their advisors in the sense that the advisors demanded an excessive amount of publishable work to graduate.

And I saw plenty of personality conflicts, many of which could be laid squarely in the lap of the advisor.

But I never saw or heard of outright fraud, which makes me happy.

I'm not naïve. I know fraud is everywhere. And I know there's a lot of pressure to produce interesting results. I probably just got lucky.

Edit: for anyone taking the plunge into grad school, I made my choice of advisor largely based on his reputation for looking out for his students ... and on his research as a secondary consideration. That may have helped me.


ordu 1102 days ago

> This is pure experiment

Pedantically speaking, it is not an experiment, it is an observation. An experiment is a kind of study where you control the independent variable. In this case neither cosmic radiation nor seismic activity was manipulated by the scientists. That is the reason why they speak about correlation but not causation.

https://en.wikipedia.org/wiki/Experiment#Observational_studi...


idlewords 1101 days ago

Randomly sifting through data in search of patterns is not an experiment in the usual sense. With a big enough data set, you're guaranteed to find one-in-a-billion, one-in-a-trillion events by random chance.
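
To put rough numbers on that: if each look at the data has an independent chance p of turning up a given fluke, the chance of seeing at least one in n looks is 1 - (1 - p)^n. A minimal Python sketch, assuming independent looks (real analyses overlap, so this is only a ballpark); the function name is purely illustrative:

```python
import math

def prob_at_least_one(p_single: float, n_tries: int) -> float:
    """Chance that at least one of n independent tries hits an event of probability p."""
    # 1 - (1 - p)^n, written with log1p/expm1 so it stays accurate for tiny p.
    return -math.expm1(n_tries * math.log1p(-p_single))

# How likely is at least one "one-in-a-billion" fluke as the number of looks grows?
for n_looks in (10**6, 10**9, 10**12):
    print(f"{n_looks:>16,d} looks -> {prob_at_least_one(1e-9, n_looks):.6f}")
```

Under that independence assumption, a one-in-a-billion fluke is already more likely than not (about 63%) after a billion looks, and effectively certain after a trillion.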


s1artibartfast 1101 days ago

Yes, but you can test those signals against future data and see if they are accurate.
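
A sketch of what that check looks like, using made-up noise rather than anything from the paper: pick the best of many candidate signals on the first half of a record, then score that same signal on the second half. All series, sizes, and names below are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(1)  # fixed seed so the run is repeatable
n_points, n_candidates = 400, 200
half = n_points // 2

# Target series and candidate "precursors" are all independent noise,
# so any correlation found here is a fluke by construction.
target = [rng.gauss(0, 1) for _ in range(n_points)]
candidates = [[rng.gauss(0, 1) for _ in range(n_points)]
              for _ in range(n_candidates)]

# "Discovery": keep whichever candidate correlates best on the first half.
best = max(candidates, key=lambda c: abs(pearson(c[:half], target[:half])))

# "Validation": score the same candidate on data it was not selected on.
in_sample = abs(pearson(best[:half], target[:half]))
out_of_sample = abs(pearson(best[half:], target[half:]))
print(f"in-sample |r| = {in_sample:.3f}, held-out |r| = {out_of_sample:.3f}")
```

The winning candidate tends to look impressive on the half it was selected on and ordinary on the half it never saw; a real precursor should hold up on both.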


admax88qqq 1101 days ago

Finding correlations is not the same as finding one in a trillion events.


kbelder 1101 days ago

When you have a trillion possible correlations, it is.


DangerousPie 1102 days ago

It's fine if they account for the number of tests they have made when they calculate their significance levels. If they just kept on trying different options until they ended up with p < 0.05 it's almost guaranteed to be noise.
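
A toy simulation of the difference, assuming the variants are independent tests on pure noise (so each p-value is uniform on [0, 1)) and using the Bonferroni correction as one simple way to account for the number of tests. The counts are arbitrary and say nothing about what the paper's authors actually did.

```python
import random

def false_alarm_rate(n_variants, alpha, n_studies=5000, seed=0):
    """Fraction of pure-noise studies where at least one variant beats alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Under the null hypothesis a p-value is uniform on [0, 1),
        # so each analysis variant is modelled as one uniform draw.
        if any(rng.random() < alpha for _ in range(n_variants)):
            hits += 1
    return hits / n_studies

n_variants = 50   # hypothetical number of binning/delay options tried
alpha = 0.05

# Keep trying options until one of them gives p < 0.05.
uncorrected = false_alarm_rate(n_variants, alpha)
# Same search, but the threshold is divided by the number of tests (Bonferroni).
bonferroni = false_alarm_rate(n_variants, alpha / n_variants)

print(f"uncorrected: {uncorrected:.3f}, Bonferroni: {bonferroni:.3f}")
```

With 50 variants the uncorrected rate should come out near the analytic 1 - 0.95^50, about 0.92, while dividing the threshold by the number of variants pulls it back to roughly 0.05.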


ordu 1102 days ago

They used p<0.001. This is not the social sciences; the anti-noise filters here are stricter.


elcritch 1101 days ago

Ah, that's not too bad. Though to be fair you also need the data size. That's only, what, a one-in-1k chance (or is it 10k? Too lazy to count it out). If their dataset is small or they automated the testing of cofactors, there's still a decent chance of a false positive.
