| HN Mirror



	by kelipso 499 days ago
	It’s a common complaint about open-source ML models that they don’t provide or describe their training data. Sometimes the complaint is valid, because it really isn’t clear what kind of data was used, and sometimes it isn’t, because the data is obvious. I think it’s overdone and I usually ignore it. Besides, it looks like there’s an ongoing Hugging Face project trying to replicate the training process for this model anyway.