by rlt 1196 days ago
I think one caveat is access to training data. If proprietary models can be trained on useful data from private sources, or worse, if there are successful legal challenges against using public but copyrighted data for training, then it will be difficult for open-source models to compete with proprietary models.

1) In a lot of Western countries (the EU, and the UK too I think) the hammer has already come down in favor of allowing training on public but copyrighted data.

2) Wouldn’t that cause open source models to be favored? A big company has lawyers who make sure its internal practices comply with the law. On the other hand, good luck suing some random guy from 4chan who made a model that may or may not incorporate copyrighted data.

Sure, there will be "bootleg" models available on BitTorrent or whatever, but generally "open source" refers to legitimately licensed code that a big company with lawyers would be OK with incorporating into its own business.