Hacker News
by koolala 664 days ago
An open source model would be able to give you the sources of its potato salad recipe inspiration. It would be the best of both worlds. AI Knowledge + Real Open Human Knowledge.
2 comments

Just because you have the dataset doesn't mean you can generate a reference. Let's say I hand you a potato salad recipe and a copy of the entire internet. Say you somehow extract all potato salad recipes from the dataset (non-trivial, btw) and none of them is an exact match for the recipe the model generated. Now what?
> open source model would be able to give you the source of its potato salad recipe

Kagi’s LLM can already do that. I believe so can Perplexity’s. Citing sources isn’t something only open models can do.

I'm pretty sure Kagi is like a normal search engine with AI integration like Google. Not an AI designed to be open source with an open dataset of knowledge it was trained on.
> pretty sure Kagi is like a normal search engine with AI integration like Google

Sure. The point is that it can do the thing you said only an open-source model can do. Plenty of proprietary LLMs can cite sources.

The plain truth is most of the benefits of open models are not on the consumer side. (Or at least, I haven't seen any articulated.) They're on the producers' side. Open models are better for those of us training models. That's partly why the open data debate is academic: very few people are training large foundation models because the compute and electricity costs are prohibitive.

I'm kinda hoping World Governments will use their Public Library infrastructure to train AI. Japan is my #1 hope with how they are opening public science knowledge. Supercomputers have been prohibitively expensive for a long time, but national science institutions could be a great place for open source & open weight AI.
> hoping World Governments will use their Public Library infrastructure to train AI

Genuinely blown away the EU isn't doing this.

In the U.S., the solution may be in carving out a legal safe harbor for companies that release their models per the OSI's draft definition of open source.

I bet Nvidia would quite like that too. Private and public-sector funding, theirs for the taking! Few businesses are ever so lucky.