Hacker News
by
sheepdestroyer
501 days ago
They could easily list the data used, though. These datasets are mostly known and floating around. When datasets are constructed, instructions for replication could be provided too.
coliveira
501 days ago
They could, but even if they give this list, the detractors will still say it is not open source.
rvnx
501 days ago
Yes, and as a bonus they may get sued, which in the long term makes free / offline models not viable.
It would be so much better if all models were trained with LibGen.