Hacker News
by Gigachad 1274 days ago
I fear the day that we get a run on your own machine version of ChatGPT. Within a week I think virtually every public signup forum will be crushed by bots which act and talk more naturally than some humans.

You’ll just expect that most of the users on discord are bots unless you know them irl.

15 comments

> You’ll just expect that most of the users on discord are bots unless you know them irl.

Maybe if all you ever exchanged with someone was a few sentences about inconsequential stuff. ChatGPT isn't even close to being able to consistently act like a single human being with interests, plenty of long term memory, and a life outside a chat room.

Right now the performance of ChatGPT in your average discord chat room would be absolutely awful. Nobody is bothering to write their messages as proper prompts a state-of-the-art AI could understand. People don't even just talk about one thing at a time - having multiple threads of conversation interspersed with each other is quite common and would absolutely confuse any current AI. There's also in-jokes and subtle references to things said days ago. People try to only use minimal information to convey what they mean, and often make figuring that out a small puzzle/joke to keep the conversation interesting.

> Maybe if all you ever exchanged with someone was a few sentences about inconsequential stuff.

As usual on HN, you give humans too much credit. Most people never exchange anything else with anyone but inconsequential stuff. IRL and online. The AI will give itself away by being too eager to write content, too grammatically/semantically correct, and not random and inconsistent enough.

Go to some places (instagram is pretty good for that) where people basically only communicate in emojis and memes and where every sentence longer than 4 words is misunderstood because no-one actually has reading comprehension.

> Right now the performance of ChatGPT in your average discord chat room would be absolutely awful.

Most of the internet, including many discord channels, is already absolutely awful.

So yeah, it is a problem; for me, most people responding outside a few discord channels, subreddits, lobsters and hn are already bots, or humans that could just as well be bots anyway. And it's only a matter of time before that takes over those few places that are still ok.

I will probably move to meetups offline more and more. The opposite of what I did my whole life.

> Most people never exchange anything else with anyone but inconsequential stuff. IRL and online.

If my interaction with them is that shallow, does it matter if they are real or a bot?

No, that’s kind of my point. It doesn’t matter for most people, so there will be some (futile) resistance and that is it.

The problem is when AIs become such high quality that they can influence people’s minds and behaviours through psychological tricks. This is also already happening but mostly humans are creating that content which is upvoted and responded to by their bots.

This is going to be AI soon-ish and that is a problem imho. But not something we will be able to do much about except KYC people as rigorously as banks do (bye bye privacy, unless this is implemented well) or making them pay to post. Neither will kill all bots, but both will shrink their reach, as suddenly it’s a very costly affair to do at scale (buying real people and paying to post).

Spelling mistakes can easily be added, as can reply delays. I think the mainstream will soon realize that Sybil resistance is not just important but essential for any forum or social network in the future. We need ways to ID users as humans, preferably in a way that preserves privacy for those who do not want to have their real identity linked to random internet comments.
You can do KYC with a KYC specialized company and agree on some way of not having the social media company getting hold of that info, just ‘user4639284757483858 is a real human and they are not already on your platform’.
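A rough sketch of how that kind of attestation could work. This is only one possible design, not how any real KYC provider does it: it assumes the KYC company holds a secret key and derives a per-platform pseudonym with an HMAC. `KYC_SECRET`, `pseudonym`, and `Platform` are made-up names for illustration.

```python
import hashlib
import hmac

# Hypothetical secret held only by the KYC company (an assumption of this sketch).
KYC_SECRET = b"kyc-provider-secret-key"


def pseudonym(government_id: str, platform: str) -> str:
    """Derive a stable, platform-specific pseudonym for a verified person."""
    message = f"{platform}:{government_id}".encode()
    return hmac.new(KYC_SECRET, message, hashlib.sha256).hexdigest()


class Platform:
    """Hypothetical social site that only ever sees pseudonyms, never the ID."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.seen: set[str] = set()

    def register(self, token: str) -> bool:
        """Accept a pseudonym once; reject a second signup by the same human."""
        if token in self.seen:
            return False
        self.seen.add(token)
        return True
```

The same person gets a different pseudonym on each platform, so two sites cannot link accounts by comparing tokens. Everything still hinges on the KYC company keeping its key and its ID records safe.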
Until they get hacked and your entire identity is leaked, not just Facebook and HN but all other accounts, linked to your government issued ID or SSN.
> ChatGPT isn't even close to being able to consistently act like a single human being with interests, plenty of long term memory, and a life outside a chat room.

Is it really not close, or just not done yet? They're already using human feedback loops to get it as good as it is generally, is it a huge step to then allow personalised feedback loops so that the local instance remembers its own "personality"? (I'm not an expert so maybe that is a huge, unrelated leap in a way that I can't see?)

It's not complicated to explain. The model can handle 4000 tokens at once. So all you can do is work with the limitations of this window. You can use part of it to quote the previous interactions, and part of it for the response. If your content is too large, you need to summarise it. There are AIs for that too. If the output is too large, you need to split it into multiple rounds. It is pretty hard to work around this limitation, for example to write a whole novel.
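To make the window budgeting concrete, here is a minimal sketch. It assumes a crude whitespace "tokenizer" (real models count tokens differently) and a hypothetical `build_prompt` helper that keeps only the newest messages that fit; the 1000-token reply budget is an arbitrary choice for illustration.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def build_prompt(history: list[str], window: int = 4000, reply_budget: int = 1000) -> list[str]:
    """Keep the most recent messages that fit in the window, leaving room for the reply."""
    budget = window - reply_budget
    kept: list[str] = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break  # everything older would have to be summarised or dropped
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept
```

Anything that falls outside the kept slice is simply invisible to the model, which is why long-running "memory" needs summarisation or some other workaround.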

I think we need LLMs capable of reading a whole book at once, about 100K tokens. I hope they can innovate the memory system of transformers a bit, there are ideas and papers, but they don't seem to get attention.

Is there a "law of tokens" growth for LLMs, à la Moore's Law, but for LLM capabilities based upon token capacity?
Complexity is quadratic in sequence length. For 512 tokens it is 262K, but for 4000 tokens it becomes 16M and goes OOM on a single GPU. We need about 100K-1M tokens to load whole books at once.
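The numbers above are just the sequence length squared. A small sketch of the arithmetic, assuming the full score matrix is materialised and stored as 2-byte (fp16) values; the figure is per attention head per layer, so real usage multiplies by heads, layers, and batch size.

```python
def attention_entries(seq_len: int) -> int:
    """Number of pairwise scores in one full self-attention matrix."""
    return seq_len * seq_len


def attention_bytes(seq_len: int, bytes_per_score: int = 2) -> int:
    """Memory for one such matrix, assuming 2-byte (fp16) scores."""
    return attention_entries(seq_len) * bytes_per_score


for n in (512, 4000, 100_000, 1_000_000):
    print(f"{n:>9} tokens -> {attention_entries(n):>16,} scores, "
          f"{attention_bytes(n) / 2**30:,.2f} GiB per head per layer")
```

At 100K tokens a single matrix already holds 10^10 scores, and at 1M tokens 10^12, which is why whole-book contexts are out of reach for vanilla attention.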

Since 2017 there have been hundreds of attempts to bring O(N^2) to O(N), but none of them replaced the vanilla attention yet in large models. They lose on accuracy. Maybe Flash attention has a shot (https://arxiv.org/abs/2205.14135).

Sure, that is ChatGPT in late 2022.

What about Open.ChatGPT in mid 2024?

You are focusing on the application not on the capability. ChatGPT was trained for a different purpose so if you used it for discord it would be recognizable.

Not unlike a person that wrote newspaper articles all his life and now has to use discord.

If you trained it on discord messages it would probably be indistinguishable from most discord users.

> If you trained it on discord messages it would probably be indistinguishable from most discord users.

You might get an average of all discord users, or all discord users at once. Neither would seem like a real person.

For instance someone who is interested in and capable of talking about any topic will not look real. A normal discord user will not contribute to conversations they don't care about or know nothing about.

After a 10-message-long discussion - probably (I'm not sure but let's say you're right). But you won't have a 10-message-long discussion with anybody if there are 1010 users and 10 of them are human.

AFAIR discord requires a phone number, and that's the main reason spam isn't a problem, so maybe we're already there.

> But you won't have a 10-message-long discussion with anybody if there are 1010 users and 10 of them are human.

In such a place it doesn't really matter whether anyone is real, does it? You're probably there to get a few laughs or something, not build relationships with people.

Though there's more serious places with real people. Off the top of my head there's two large discord servers I'm active in: One is to find people to play a certain video game with, the other is a community of developers who use somewhat niche technology and probe each other's brains for knowledge that often can't be found on the internet. In either community a chat bot would be immediately obvious.

I'm using discord a lot, I'm on around 20 servers. Most of them are in the 10-50 users range and they are for a specific purpose (like playing D&D or computer games, remote working in a small team, talking with a group of friends). These won't be affected because they are invite-only.

Then there are open servers with 1000+ users - usually for developing some open source projects or talking about $RANDOM_HOBBY. These would definitely be affected if not for the requirement of giving your phone number to access discord.

You can imagine the problems that 1000 bots posting procedurally generated bug reports could cause :)

It will probably look like a proper conversation until you start looking at the usernames.
> ChatGPT isn't even close to being able to consistently act like a single human being with interests, plenty of long term memory, and a life outside a chat room.

Wait until GPT4 comes out next year...

Whatever helps you sleep at night.
I think it is not that far away. My predictions:

- the value of real face 2 face interaction for both business and personal life will go up again.

- We will see more of twitter's pay to participate models as an attempt to verify real human beings.

- Online advertising will waste trillions of dollars and be late to realize they are selling to bots.

Not clear how to deal with this either - you can improve authentication but it won’t prevent the properly auth’ed users from running LLMs. You can watermark the output of officially vended LLMs (Scott Aaronson seems to be working on that) but nothing is gonna prevent people from running non-watermarked versions.
It's basically too late: as soon as one rich person decides to train a model and dump it, it's game over. I feel we might actually be experiencing the last months of an internet where you can expect to be talking to a real human.

Every year the cost of training these models drops so they won't be out of reach for long.

Eternal September in the early 90s (1993).

Now we're coming to an Eternal (AI) Winter..

Spooky to think about, but you could be right. :-/

I, for one, welcome our new robot overlords.

Too long already has been the struggle to distinguish the phraseologists from those who think. Finally here is the societal pressure to develop means to distinguish what brings progress from all the semantic sugar that merely brings feel-good.

You can just ask "Are you an LLM? What is <large number> + <large number>?" to reveal it.
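A toy version of that check, as a sketch only. It rests on the comment's assumption that a language model gets long sums wrong; a bot wired to a calculator would pass, and so would a human with one. `make_challenge` and `answer_is_exact` are hypothetical helper names.

```python
import random
from typing import Optional, Tuple


def make_challenge(digits: int = 30, rng: Optional[random.Random] = None) -> Tuple[str, int]:
    """Return a question about two random large numbers and its exact answer."""
    rng = rng or random.Random()
    a = rng.randrange(10 ** (digits - 1), 10 ** digits)
    b = rng.randrange(10 ** (digits - 1), 10 ** digits)
    return f"What is {a} + {b}?", a + b


def answer_is_exact(reply: str, expected: int) -> bool:
    """True if the reply contains the exact sum (commas and spaces ignored)."""
    cleaned = reply.replace(",", "").replace(" ", "")
    return str(expected) in cleaned
```

Python integers are arbitrary precision, so the expected answer is exact no matter how many digits you ask for.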
> I fear the day that we get a run on your own machine version of ChatGPT.

I can't wait. I think we may see what Vernor Vinge envisioned - that we'll use AI-ish tech on our own computers to mediate between the massive flow of information on the internet and ourselves.

On the bright side, we can use this tech to keep spammy sales representatives busy for years.
Time to invest into Facebook where users are clearly identified.

How can HN and its open registration survive?

doesn't help.. you can "turn" existing users..
Seems like the zombie apocalypse is finally coming, but it's not exactly like in the movies.
This is coming. I've been thinking that you should perhaps start to be skeptical of any content from accounts created after 2022.

Thinking of online reviews etc. where you don't have a proper log and can't follow a user properly.

They are really bad as is, but they could be made completely useless in a millisecond. Even proper reviews with images etc could be absolutely trivial to make and for $10 you could automatically generate rave reviews for your product and trash any number of your competitors.

We will probably need to identify users somehow, which will only serve the centralized FAANG version of internet that at least I despise.

This is my worry as well, and I think the governments (out of all entities) could assist here.

I (EU member) already have an ID card and a government issued electronic signature, perhaps a service like reddit could just ask me to sign something to verify that I'm a real and unique human being.

Of course there are all kinds of risks of people being hacked, no throwaways, leaked signatures being traded, government refusing to issue IDs to whistleblowers on the run, and so on.

The API for GPT-3 is pretty cheap (and even cheaper on alternatives such as https://textsynth.com/). Either it's already happening, or people don't want to use resources to destroy the internet, or idk what.
their bullshit pointless essays are out of place even here
This is an early plot for a blade runner scenario.
Requiring a phone # might be a good starting point.
You're not necessarily wrong, but you have to admit there's something funny about solving the problem of cutting-edge artificial intelligence so advanced it can pretend to be human... by requiring a telephone number. Mashup of the newest tech with some of the oldest, I guess.
Eternal January ...