by vamin 3005 days ago
I'm talking about the raw reads, which are important if you want to try a different alignment or base-calling method. You can debate how important it is to be able to do that, but I'm not arguing that the data should be kept. I was just explaining why the total size of publicly available RNA-seq data (the sum total of which the parent is attempting to organize) runs into the petabytes.
1 comment

So, do you or the original poster actually have a materialized petabyte of RNA data? Otherwise, you're just describing a million files spread over a million file servers, not being used for science or processed in any way.