by sebzim4500 454 days ago
This is a great riddle. Unfortunately, I was easily able to find the exact question with a solution (albeit with a different number) online, thus it will have been in the training set.

What makes this interesting is that while the question is online (on Reddit, from 10 years ago), other models don't get the answer right. Gemini also shows its work, and it seems to do a few orders of magnitude more calculating than the elegant answer given on Reddit.

Granted, this is all way over my head, but the solution Gemini comes to matches the one given on Reddit (and now here, in future training runs):

65×26×39=65910

> Gemini also shows its work, and it seems to do a few orders of magnitude more calculating than the elegant answer given on Reddit.

I don't think Gemini does an unnecessary amount of computation; it's just more verbose. This is typical of reasoning models: almost every step is necessary, but many would not be written down by a human.

Seems like we might need a section of internet that is off limits to robots.
Everyone with limited bandwidth has been trying to limit site access to robots. The latest generation of AI web scrapers is brutal and does not respect robots.txt.
There are websites where you can only register in person and need two existing members to vouch for you. It can probably still be gamed, but it sounds like a great barrier to entry for robots (for now).
What prevents someone from getting access and then running an authenticated headless browser to scoop the data?
Admins will see unusual traffic from that account and then take action. Of course it will not be perfect, as there could be a way to mimic human traffic and slowly scrape the data anyway. That's why there is an element of trust (two existing members to vouch).
Yeah, don’t get me wrong: I believe raising the burden of extraction is an effective strategy. I just think it’s been solved at scale, i.e. voting rings and astroturfing operations on Reddit. And at the nation-state level I’d just bribe or extort the mods and admins directly (or the IT person, to dump the database).
It’s here and it’s called Discord.
I have bad news for you if you think Discord communities with no paywall and no phone number requirement are immune to AI scraping, especially as it costs less than hammering traditional websites: in a real-time chat context, the push-on-change event is done for you.

Especially as the company archives all those chats (not sure for how long) and is small enough that a billion-dollar "data sharing" agreement would be a very enticing offer.

If there isn't a significant barrier to access, it's being scraped. And if that barrier is money, it's being scraped but less often.

Honestly, someone should scrape the algebraic topology Discord for AI; it'll be a nice training set.
Or we could just accept that LLMs can only output what we have put in, and that calling them "AI" was a misnomer from day one.
Why would you accept a lie?
I'm not sure what you mean, but I'm trying to say that our current LLMs are not artificially intelligent, and calling them "AI" has confused a lot of the lay public.
Why is this a great riddle? It sounds like incomplete nonsense to me:

It doesn't say anything about the skill levels of the participants, whether their answers are just guesses, or why they aren't just guessing the sum of the other two people's numbers each time they are asked to provide more information.

It doesn't say the guy saying 65 is even correct.

How could three statements of "no new information" give information to the first guy that didn't know the first time he was asked?

Persons 2 and 3 saying they don't know eliminates some of the uncertainty person 1 had about their own number (any combination where the other two would see numbers that would tell them their own). Once those possibilities are eliminated, person 1 has narrowed it down enough to actually know, based on the numbers shown above the other two. The puzzle could instead have been done in order 2, 3, 1, and 1 would not have needed to go twice.
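
For anyone who wants to check the elimination argument mechanically, here is a rough Python sketch. The thread never restates the riddle, so the rules below are an assumption on my part: three positive integers, one of them the sum of the other two, each person sees only the other two numbers, everyone reasons perfectly, and the answers go "don't know", "don't know", "don't know", then person 1 announces 65. The function names (`candidates`, `knows`, `consistent`, `solve`) and the search bound `limit` are made up for illustration.

```python
from functools import lru_cache

# ASSUMED rules (not restated in the thread): three positive integers,
# one is the sum of the other two, each person sees only the others' numbers.
# A "world" is a tuple (n1, n2, n3); persons are 0-indexed.


def candidates(world, person):
    """Worlds that `person` cannot tell apart from `world` by sight alone."""
    x, y = (v for i, v in enumerate(world) if i != person)
    worlds = []
    for own in {x + y, abs(x - y)} - {0}:  # own number is the sum or the difference
        w = list(world)
        w[person] = own
        worlds.append(tuple(w))
    return worlds


@lru_cache(maxsize=None)
def knows(world, turn, order):
    """Can the speaker at `turn` deduce their own number in `world`?"""
    person = order[turn]
    alive = [w for w in candidates(world, person) if consistent(w, turn, order)]
    return len(alive) == 1


@lru_cache(maxsize=None)
def consistent(world, turn, order):
    """True if every speaker before `turn` would have said "I don't know"."""
    return all(not knows(world, t, order) for t in range(turn))


def solve(announced, order=(0, 1, 2, 0), limit=200):
    """Triples where person 1 holds `announced` and first knows it on the last turn.

    Assumes the last speaker in `order` is person 1 (index 0). `limit` only
    bounds the search over the other two numbers; it is an arbitrary cutoff.
    """
    last = len(order) - 1
    found = []
    for b in range(1, limit + 1):
        for c in range(1, limit + 1):
            a = announced
            if not (a == b + c or b == a + c or c == a + b):
                continue
            w = (a, b, c)
            if consistent(w, last, order) and knows(w, last, order):
                found.append(w)
    return found
```

Under those assumed rules, `solve(65)` finds a single triple, (65, 26, 39), whose product is the 65910 quoted above. Passing `order=(1, 2, 0)` models the 2, 3, 1 ordering, and `knows((65, 26, 39), 2, (0, 1, 2, 0))` asks whether person 3 could already have answered in the original order.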

I guess really the only missing information is that they have the exact same information you do, plus the numbers above their friends' heads.

> The puzzle could instead have been done in order 2, 3, 1 and 1 would not have needed to go twice.

If this is true, then back in the original 1->2->3->1 form, shouldn't person #3 have been able to answer it?