by jkbonfield 2811 days ago
I am the author of an implementation, although not the author of the file format itself. That said, it is still a fair point if you look at just the one blog post. However, there is a series of them where I clearly explain the process and my involvement, so don't just look at the last.

I agree, though, that the message would be better if it came from a third party. I was hoping this would happen, but it didn't look likely before the GA4GH conference (where both MPEG and I were speaking), so I self-published before that to ensure people were aware and could ask appropriate questions (of both MPEG and me, of course).

As for royalties, CRAM comes under the governance of the Global Alliance for Genomics & Health (https://www.ga4gh.org). They stated explicitly at the recent conference that their standards are royalty free (as far as it is possible to tell) and that they promote collaboration on the formats and interfaces, and competition on the implementations. For the record, we are unaware of any patents covering CRAM and we have filed none of our own, nor do we intend to for CRAM 4.

I admit that I have only read the last post. I had also searched in the past for royalty information about CRAM and never found it. Thanks for your answer.

In my opinion, there is clearly a need to assess both solutions with clear and meaningful data, not only in terms of performance but also in terms of patents. Conferences are a great place to do that, so I completely agree with you.

However, I don't see an MPEG standard as evil (or ugly), and I do think that both types of standards (CRAM and MPEG) can coexist. Every company should decide, based on factual information, what the best solution is for its needs, and if the MPEG standard brings some advantage, a company may use it despite the licensing costs. The same happens with video coding standards, where the patent-heavy HEVC is nowadays used in some scenarios (e.g. iPhone) and the royalty-free AOM AV1 is used in others (e.g. streaming video). It is up to the market to decide. The main problem with MPEG-G is that the licensing information is not known yet, since it has not yet reached the draft international standard stage.

CRAM is a standard that was started at Sanger/EMBL/EBI and developed freely in the open by the genomics community over the past decade. As others have told you, it is royalty free and unencumbered. The work behind the first version of the standard was published in 2010 (https://genome.cshlp.org/content/21/5/734.full). Since then CRAM has undergone many revisions and improvements from a plurality of sources, and is widely adopted among users whose use cases demand genomic sequence compression.

MPEG-G at this point is a three-year-old attempt to patent troll the genomics community and grab money. The people involved are not experts and are not aware of the actual state of the art or the motivating requirements in sequence compression, and are instead trying to dance their way around prior art, as evidenced by the contents of their patents and presentations.

There is no equivalency here to fit your worldview.

Some of the MPEG-G authors are experts in genomics data compression, while others are experts in video compression. It should, in theory, be a good mix.

MPEG are also well aware of the prior art. The authors of various existing state-of-the-art genome compression tools were invited to one of the first MPEG-G conferences, where they presented their work. Do not assume that because they do not compare against the state of the art, they are not aware of its existence or how it performs. It's more likely simply that "10x better than BAM" is a powerful message that sells products, more so than "a little bit better than CRAM". That is a standard advertising technique.