by mbreese 565 days ago
> from just the sequence and no other data

This is my real question with these models... we already have a ton of other data in genomics, so many of the important regions are already known and studied. And really, the functional importance of any given region/sequence is highly context- and cell-type-specific. Given this, what are the use cases? What kind of hypothesis generation can these models enable that we aren't already doing in genomics?

1 comment

The whole idea of unsupervised learning is to find patterns in the data that people wouldn't easily find by manually looking for categories/labels. So far, most of the categories we've identified and manually clustered (in order to build statistical models that find more of them) have required extensive discovery biology and curation effort.