by gabeiscoding
3813 days ago
While I think it's great to have Google putting their weight behind standardization efforts like the Global Alliance for Genomics and Health (GA4GH), I really don't get the need to replace VCF and BAM files with API calls. Ultimately, the "hard part" of genomics is not a big-data problem that requires Spanner and BigTable to get anything done. I actually wrote a blog post about this just this week: http://blog.goldenhelix.com/grudy/genomic-data-is-big-data-b...

Both BAM and VCF files can be hosted on a plain HTTP file server and be meaningfully queried through their BAI/TBI indexes. Visualization tools like our GenomeBrowse or the Broad's IGV can already read S3-hosted genomic files directly, without an API layer, and very efficiently (the files are gzip-compressed blocks of binary data).

So I see translating the exact same data into an API-only storage system, where I can't download the VCF and do quick, iterative analysis on it, as more of a downside than a plus.

Disclaimer: I build variant interpretation software for NGS data at Golden Helix. Our customers are often small clinical labs whose data size and volume are not driving them to the cloud.
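To make the "plain HTTP plus index" point concrete, here is a minimal Python sketch of the mechanics, not how GenomeBrowse or IGV actually implement it. It assumes the index lookup has already produced a BGZF virtual offset, which packs the compressed block's byte position into the upper 48 bits and the position inside the decompressed block into the lower 16 bits. The function names are made up for illustration.

```python
import zlib

def split_virtual_offset(voffset):
    """Split a BGZF virtual offset (as stored in BAI/TBI indexes) into
    (start of the compressed block in the file, offset inside the inflated block)."""
    return voffset >> 16, voffset & 0xFFFF

def range_header(start, end):
    """Build an HTTP Range header for bytes start..end (inclusive)."""
    return {"Range": f"bytes={start}-{end}"}

def inflate_first_block(raw):
    """Inflate the first gzip member found in raw.
    A BGZF block is a regular gzip member, so wbits=31 (gzip framing) works.
    Returns (inflated bytes, bytes left over after that member)."""
    inflater = zlib.decompressobj(wbits=31)
    data = inflater.decompress(raw)
    return data, inflater.unused_data
```

A client would send `range_header(block_start, block_start + 65535)` to any file server that honors Range requests (a BGZF block is at most 64 KiB), pass the response body to `inflate_first_block`, and start reading records at the in-block offset. No server-side API is needed beyond byte-range support.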