	by RVuRnvbM2e 1051 days ago
	This kind of research really highlights just how wrong the OSI is for pushing their belief that "open source" in a machine learning context does not require the original data. https://social.opensource.org/@ed/110749300164829505


They really just seem to be arguing in bad faith in this thread. Just publish the training data FFS (medical data excluded).

then it wouldn’t be the training data; it would only be a subset.

if the issue is sensitive data in a training dataset, perhaps that should be addressed rather than accommodated.

The point is that they seem to be pretending that medical data is the reason they want to redefine the meaning of open source.

Just say medical data is not open source and make the rest really open source

even a reference to the data would be sufficient if access were not controlled and denied under the guise of protecting people.

the problem is that the source medical data itself is insufficiently cleansed (if it can be cleansed at all).

ideally the medical data is open source, but only contains what’s necessary, and not what’s sensitive.

that is, obviously, messy…