by peteradio 1403 days ago
Correlation is nearly always the first step in identifying causality.

Unfortunately, anybody who spent money attaining a low-level bachelor's degree goes around repeating that phrase. It's meant to shut down any discussion, and it signals pedantry.

It's the equivalent of the "X is not a reliable source, therefore I reject all claims by X" line that people use to shut out anything that threatens their perception of reality.

I hate it. It's rampant on HN and reddit.

The assumption that “is associated with” is the same as “causes” is also rampant. It’s good to be reminded that causation could flow either way, or that a third thing could be the causative factor.
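To make the "third thing" case concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: `z` is a hypothetical hidden factor, and `x` and `y` each depend only on `z`, never on each other. The raw correlation between `x` and `y` still comes out clearly positive, and it vanishes once the contribution of `z` is subtracted out.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, standard library only."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=20000, seed=0):
    """Toy model (an assumption, not real data): z drives both x and y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]   # hidden third factor
    x = [zi + rng.gauss(0, 1) for zi in z]    # x depends only on z plus noise
    y = [zi + rng.gauss(0, 1) for zi in z]    # y depends only on z plus noise

    # Subtract the known contribution of z from both variables.
    x_res = [xi - zi for xi, zi in zip(x, z)]
    y_res = [yi - zi for yi, zi in zip(y, z)]
    return pearson(x, y), pearson(x_res, y_res)


raw_r, adjusted_r = simulate()
print(f"raw correlation of x and y: {raw_r:.3f}")
print(f"after removing z:           {adjusted_r:.3f}")
```

In real data you rarely know the confounder, let alone its exact contribution, which is why the association alone can't settle which way the causation runs.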
This is the difference between genetics and molecular biology. In most cases, genetics treats gene function as an abstraction while molecular biology seeks the underlying mechanistic process by which genes and their products function (this is an oversimplification).

There is a long history of discovering genes associated with diseases and then determining the molecular etiology or mechanism of the disease. In the case of autism, we often see gene associations that seem fairly obvious, for example genes that encode the proteins that make neural pathways. But sometimes other genes turn up that wouldn't seem related at all, or that are more "general" and would affect people in many ways, such as a motor protein that carries things from one part of the cell to another. These can be associated with the disease, but it's challenging to build a true causal model.

Having worked in this field for some time, I can say the relationship between a human genotype and the body-level disease phenotype is an extraordinarily complex one, with a huge number of nonlinear terms. Pretty much the only reasonable way to deal with this right now is to build deep models and feed them enough data to build rich representations with predictive ability. Embeddings and transformers have recently been shown to be remarkably successful in this area.
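As a toy illustration of what a "nonlinear term" can look like, here is a small sketch. It is an assumption-laden caricature, not real genetics: two hypothetical binary variants `a` and `b`, and a phenotype that appears only when exactly one of them is present. Each variant on its own shows essentially no correlation with the phenotype, yet the pair together determines it completely, which is the kind of structure a purely additive, one-gene-at-a-time analysis misses.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, standard library only."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_interaction(n=20000, seed=0):
    """Toy model (hypothetical): phenotype = a XOR b, a pure interaction."""
    rng = random.Random(seed)
    a = [rng.randint(0, 1) for _ in range(n)]      # variant A present or absent
    b = [rng.randint(0, 1) for _ in range(n)]      # variant B present or absent
    pheno = [ai ^ bi for ai, bi in zip(a, b)]      # trait needs exactly one variant

    # Marginal view: each variant alone against the phenotype.
    r_a = pearson(a, pheno)
    r_b = pearson(b, pheno)

    # Joint view: a rule that looks at both variants at once.
    hits = sum(1 for ai, bi, p in zip(a, b, pheno) if int(ai != bi) == p)
    pair_accuracy = hits / n
    return r_a, r_b, pair_accuracy


r_a, r_b, pair_accuracy = simulate_interaction()
print(f"variant A alone vs phenotype: r = {r_a:.3f}")
print(f"variant B alone vs phenotype: r = {r_b:.3f}")
print(f"rule using both variants:     accuracy = {pair_accuracy:.3f}")
```

Real genotype-to-phenotype maps involve far more variants, noise, and environment than this, so treat it only as a picture of why interaction terms matter.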