Hacker News
by patall 3517 days ago
These signatures will never be unique or complete. Remember that this is biology, where there is no single TRUTH to be found. At some point you can always subdivide one phenomenon into two cases. These signatures are generated by non-negative matrix factorization, which produces signature vectors that are more or less stable depending on the number of input datasets and the number of signatures you want to obtain. And at some point you are limited by the amount of money you have for these datasets (still more than $1000 per patient) and by how many patients exist in the end (there may be correlations we will never get to know, because the search space is much bigger than the 7 billion people we can draw correlations from).
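For anyone who has not seen NMF before, here is a minimal sketch of the idea in plain Python. It uses the standard Lee-Seung multiplicative updates, which is my choice for illustration and not necessarily what the signature papers run. The toy matrix `V` (mutation categories by samples), the two made-up signatures, and the helper names `nmf`, `matmul`, `transpose` and `frobenius_error` are all assumptions, not real data or a real library API.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Approximate non-negative V (n x m) as W (n x k) times H (k x m).

    Uses Lee-Seung multiplicative updates for the squared error.
    W plays the role of the signatures, H the per-sample exposures.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start, so every update stays non-negative.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


def frobenius_error(V, W, H):
    """Root of the summed squared difference between V and W times H."""
    WH = matmul(W, H)
    return sum((V[i][j] - WH[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0]))) ** 0.5


# Hypothetical toy data: 4 mutation categories, 6 samples, 2 hidden signatures.
true_signatures = [[0.7, 0.1], [0.1, 0.1], [0.1, 0.1], [0.1, 0.7]]
true_exposures = [[10, 8, 1, 0, 5, 2], [0, 2, 9, 10, 5, 8]]
V = matmul(true_signatures, true_exposures)

W2, H2 = nmf(V, k=2)
W1, H1 = nmf(V, k=1)
err2 = frobenius_error(V, W2, H2)
err1 = frobenius_error(V, W1, H1)
print("error with k=2:", err2)
print("error with k=1:", err1)
```

Rerunning with a different `seed`, a different `k`, or fewer samples is a quick way to see the stability issue: the factorization is not unique, so the recovered vectors can shift between runs even when the reconstruction error looks similar.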
1 comment

You leave out the fact that there is a known carcinogenic mechanism for smoking, and that it corresponds to the signature derived by NMF. In vitro studies in controlled systems exposed to smoking carcinogens also reproduce this signature. Beyond the NMF signature, there are other features, such as transcriptional strand bias and dinucleotide substitutions. This is a bit more than an association.