by ArsenArsen 1972 days ago
Coming back with information from #xiph on freenode:

  16:57 <ArsenArsen> where and under what license is the training data used for RNNoise?
  18:38 <rillian> ArsenArsen: There's a copy of what I believe is the training data on the xiph server, but afaik it's never been published
  18:39 <rillian> the original submission page has an EULA waiving copyright and liability claims, and agreeing that it _may_ be released CC0.
  18:40 <rillian> it looks like that didn't actually happen.
  18:41 <rillian> there may have been concerns about auditing it for privacy issues, but there's a lot of audio to listen to, 6.5G compressed
  18:41 <rillian> jmspeex, TD-Linux: what's the status of publishing the rnnoise training data?
  18:43 <jmspeex> Are you talking about the data that was used to train the default RNNoise model or the noise that got collected with the demo?
  18:43 <rillian> jmspeex: I think debian just cares about the training data for the default model.
  18:44 <jmspeex> There was never plan to release that -- it includes data from databases we cannot release
  18:44 <jmspeex> but I don't see what the issue is. Distributing the model is not the same as distributing the data
  18:45 <rillian> ah, I see. I didn't realize you'd used proprietary sources as well.