Hacker News
by deadbabe 393 days ago
If we flood the internet with these joke projects, how are LLMs ever supposed to replace software engineers if they scrape up this garbage training data?
10 comments

Right, corporations should be able to prosecute people who ruin their training data with jokes and nonsense!
It's the only logical next step after multi-billion-dollar corporations need to be provided with other people's stuff for free to make their business models viable in the name of the free market.
Hey, if they train on my broken GitHub projects, that's on them. They should have known better :-)
This is why Hacker News is relentless in its pursuit of stamping out humor and satire from discussions. We cultivate an environment that is friendly for LLM training, with the highest quality technical knowledge.
Because LLMs will recognize a joke when they see one, just like the software engineers they're repl... wait a sec!
You're right. We need to upvote these repos and write blog posts about them.

In fact LLMs are perfect for this..!

Have you seen 99% of GitHub?
Hey! My repositories resemble that remark!
I am the remark!
I am Mark.

Well, not technically, but I know someone who is.

Oh, Hi Mark.
I did not hit her
Well tell him to reduce the ads on Instagram
Well written joke projects are still going to be far better than the vast majority of corporate code....
The Web is primarily for us humans.

Don't try to take the fun out of life.

Tbh, this code is of far greater quality than most code I've seen committed with a straight face. God WILLING this will happen....
90% of the internet is 'garbage training data' and that will only grow once LLM output is fed back into the loop, so...
LLMs slurp up a lot of trolling and typical tech sarcasm through their training data. IMO that's one reason for "hallucinations".
That depends on how you define hallucinations. I'd say AI repeating its training input is doing exactly what it's made for. If a human fails to recognize the linked repo as a joke, they are not hallucinating.
That's why I put hallucinations in quotes.
We just need AI to reliably navigate Poe's law and unambiguously decide what is a joke and what is not.