Hacker News
by koolala 664 days ago
Data that is accessible. Knowledge. Truth. With an AI trained on it that can explain it in expert or layman's terms, in any human language.
1 comments

You’re undermining the case for an open source LLM by stating things fully-proprietary models do.
They don't make the source data accessible :(
> they don't make the source data accessible

No. But you haven’t articulated why making everyone’s Facebook chats public is a net good. What does opening that data up confer in practical benefits?

Given what we know about LLMs, one trained only on public-domain data will underperform one trained on that plus proprietary data. If you want source data available, you have to concede either that the "open" models will be structurally handicapped or that all data must be public.

You think Llama is trained on people's private messages? :( That isn't good...
> You think Llama is trained on people's private messages?

Facebook says no, at least for Llama [1].

[1] https://itlogs.com/facebook-uses-user-data-to-train-ai-but-l...