by Ultimatt 1358 days ago
VCF at least typically uses bgzip, which is essentially gzipped sections concatenated, so the blocks can be decompressed in parallel and read with random access; CRAM is parallelisable in the same way. Maybe you just don't know the formats and tooling so well? I'm not sure anyone opens a FASTQ directly for viewing anymore, but they will want pileups from a BAM. The problem with bio formats isn't that they're text, it's that they're shit text formats too.
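
To make the "gzipped sections concatenated" point concrete, here is a rough Python sketch using only the standard library. It is not the real BGZF format (which, as far as I know, also records each block's size in a gzip extra field and caps blocks at 64 KiB); `compress_blocks`, `read_block`, and the in-memory index are hypothetical names made up for illustration.

```python
import gzip
import zlib


def compress_blocks(data: bytes, block_size: int = 65536):
    """Compress data as independent gzip members, one per block.

    Returns the concatenated bytes plus an index of
    (compressed_offset, uncompressed_offset) pairs, one per block.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), block_size):
        index.append((len(blob), start))
        blob += gzip.compress(data[start:start + block_size])
    return bytes(blob), index


def read_block(blob: bytes, compressed_offset: int) -> bytes:
    """Decompress exactly one gzip member starting at compressed_offset."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    # Decompression stops at the end of this member; the bytes of later
    # members are left in d.unused_data and never inflated.
    return d.decompress(blob[compressed_offset:])


data = b"chr1\t100\tA\tT\n" * 1000
blob, index = compress_blocks(data, block_size=1024)

# The whole thing is still a valid multi-member gzip stream...
whole = gzip.decompress(blob)

# ...but any single block can be read without touching the ones before it.
third = read_block(blob, index[2][0])
```

Because every block is a self-contained gzip member, separate workers can each take a slice of the index and decompress independently, which is where the parallelism comes from.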

CRAM is a great example for some of the other people in the thread who say "just get a better format". There's been slow uptake in the larger community despite the benefits. For anyone looking to Solve Bioinformatics File Formats, it's important to understand why this is the case.