by jashephe 1537 days ago
Global models of gene expression for an entire cell are still a long way off, but there is quite a bit of work on modeling transcriptional activity from sequence. If you're interested in reading more, a relevant technology to search for is the "Massively Parallel Reporter Assay" (MPRA), which couples pools of 10⁴–10⁵+ synthetic DNA sequences with RNA sequencing to measure transcriptional output. Data from MPRA experiments are being used to train models, although these models are nowhere near the point where you could model the gene expression of all regulatory elements in a cell; they usually focus on a specific factor or regulatory sequence.
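To make "train models on MPRA data" a bit more concrete, here is a minimal sketch of the idea, not any particular lab's pipeline. It assumes each sequence's activity is summarized as log2(RNA reads / DNA reads) with a pseudocount, which is a common convention but an assumption here, and fits a plain linear model on one-hot encoded bases. The sequences and counts are invented toy values, and the names `one_hot`, `log2_activity`, `fit_linear` and `predict` are hypothetical.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Flatten a DNA sequence into a 4-per-base 0/1 feature vector."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]

def log2_activity(rna_count, dna_count, pseudocount=1.0):
    """Activity as log2(RNA/DNA); the pseudocount avoids division by zero."""
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))

def fit_linear(X, y, lr=0.05, epochs=2000):
    """Batch gradient descent on mean squared error for a linear model."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, seq):
    """Predicted log2 activity for a sequence of the training length."""
    return sum(wj * xj for wj, xj in zip(w, one_hot(seq))) + b

# Invented toy data: (sequence, RNA reads, DNA reads). NOT real MPRA counts.
toy = [("ACGT", 40, 10), ("ACGA", 35, 10), ("TCGT", 5, 10), ("TCGA", 4, 10)]
X = [one_hot(s) for s, _, _ in toy]
y = [log2_activity(r, d) for _, r, d in toy]
w, b = fit_linear(X, y)
```

In this toy set the sequences starting with A have higher RNA/DNA ratios, so `predict(w, b, "ACGT")` comes out above `predict(w, b, "TCGT")`. Real models in this space are far larger and usually nonlinear; the sketch only shows the sequence-in, activity-out shape of the problem.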
2 comments

The "train models" (ML) portion is what I'm disappointed with, unfortunately. I build ML models to predict things from genetic information fairly regularly, but we are all aware of the enormous issues with that approach. I'm more interested in ab initio methods, as I have seen them be spectacularly useful in other fields, such as the Bethe-Salpeter equation in condensed matter physics.
Notably, DeepMind had a recent paper on using transformers to predict gene expression from sequence while accounting for long-range interactions: https://www.nature.com/articles/s41592-021-01252-x