by Towaway69 1244 days ago
From guide linked above:

> It is remarkable that such large multi-lingual model is openly available for everybody.

Am I the only one thinking that this remark is an insight into societal failure? The model has been trained on globally, freely available content; anyone who has published on the Web has contributed.

Yet the wisdom gained from our collective knowledge is assumed to be withheld from us. As the original remark was one of surprise, the authors' (and our) assumption is that trained models are expected to be kept from us.

2 comments

I think it’s similar to how search engines keep their ranking formulas secret, and you can’t run your own off a copy of their index.

Yet we also all contributed to it by publishing (and feeding it, for instance by following Google's requirements for microdata). But we don’t own any of it.

The main difference with a search engine is that a search engine ultimately links back to you. So a user who is interested in more, or wants to know where something comes from, ends up on your website.

The same is not true for these AI tools. The output could have been contributed by you, someone else, or everyone, or a combination of those, but it'll never be clear who actually contributed and there will be no credit to anyone besides the author(s) of the models.

Didn’t think of it this way, that makes sense. Thank you.

How much money do you think GPT-3 training cost?
How much money do we spend contributing to the training set?

Those insights, comments, articles, code examples, etc. are free to use because we published them on sites that don't own the content but earn from it. If they owned it, they would be responsible for hate speech.

So our cost of producing the training set is negligible.

I recommend reading the first few chapters of "The Conquest of Bread".