by Kuraj
201 days ago
I dunno man. When I first joined it was inconceivable that someone could just take everything and build a trivially queryable _conversational_ (that's a big part of it) model around everything I've posted _just like that_. Call me naive but I would consider it some sort of a social contract that you would not do that. I feel the same way about LLMs being trained on Reddit. I suspect with a large enough dataset these models can infer things about you that you wouldn't know about yourself. To give another example, even though my Reddit history is public (or was until recently because I didn't have a choice) I would still feel uneasy if I realized someone deliberately snooped through all of it. And I would be SUUUUPER uncomfortable if someone did that with my Discord history. It's not against the rules or anything, I just think it's rude.
|
Making the content queryable by a database engine is merely a technical optimisation of the efficiency with which that content may be consumed. The same outcome could have been accomplished by capturing a screenshot of every web page on the internet, or by having an imaginary army of Mechanical Turks laboriously copy and paste said content.
A private network may, of course, operate under an entirely different access model and social contract.