by microtonal 310 days ago
For me, the value of sharing with other people still outweighs the 'leeching' by proprietary for-profit vendors. Ideally, I'd like to be able to set an option on a repo: no LLM training, training material only for open-weight models, or training material for all models. Sadly, I do not think we could trust all vendors to respect such a flag, so for now it is all or nothing.

Personally, I would select the open-models-only option. Pandora's box is already open, but if we are going to have LLMs, I want them to be available to everyone and not gated by a small number of companies. Since open models generally have less GPU scale behind them, I want them to have the benefit of more and higher-quality training data.