by carbocation 5354 days ago
For statistical genetics at least, it's common to process much of the data in parallel, so the RAM limitations on one R instance are not the gating factor.

Having seen and heard about what Bioconductor had to do to process genetic data, I'd say memory is a huge issue. It is even more so with next-generation sequencing data.
Yes, I guess I've always operated under the assumption that I need to parallelize heavily. I usually work with next-gen sequencing data from families of ~40 people, and the tools I use generally finish within about an hour.