by epistasis
1054 days ago
Right, most people who try to really optimize these things do not have access to the parallelism tools that Google has built, and end up doing their own ad-hoc sharding schemes: things that can be built by 1-3 people over the course of a few weeks to solve an immediate scaling problem. And of course BAM itself dates back to before standardized serialization formats were brought out of Google. Even with potential optimizations, initiating a seek on GCS or S3 is far, far slower than on a local SSD, so even if Google exposes fast cross-network seeks on objects inside an internal object store system, that is not readily accessible to plebes like me and the 99.9% of genomicists who use cloud systems or their own hardware.