by ko27 780 days ago
You have a weird definition of open source. Open source developers don't release the books they have read or the tools they've used to write code.

This is fully 100% OSI compliant source code with an approved license (Apache 2.0). You are not entitled to anything more than this.


They don't have a weird definition of open source. I recently outlined an LLM chat that I think clearly illustrates this: https://news.ycombinator.com/item?id=40035688
A bunch of code was autocompleted or generated by IDEs. Are open source developers supposed to release the source code of that IDE to be OSI compliant?
Is the IDE a primary input for building the program? Is the IDE a build dependency? Probably not. Certainly not based on the situation you described.

The LLM equivalent here would be programmatically generating synthetic input or cleaning input for training. You don't need the tools used to generate or clean the data in order to train the model, and thus they can be proprietary in the context of an open source model, so long as the source for the model (the training data) is open.

> Is the IDE a primary input for building the program? Is the IDE a build dependency?

No, the same way training data is not a build dependency for the weights source code. You can literally compile and run them without any training data.

Training data is a build dependency for the weights. You cannot realistically get the same weights without the same training data.
A developer's mindset, knowledge, and tooling are also build dependencies for any open source code. You cannot realistically get the same code without them.