by add-sub-mul-div 335 days ago
Helpful, but even more so would be a way to filter out AI content farms. It's getting to the point where I won't go to a site if I don't recognize its domain from the pre-GPT-3 era. Maybe domain age should be the filter. Not perfect, but no solution will be.
2 comments

The amount of medical advice on those LLM content farms is very concerning; someone is going to get hurt eventually. I recently made a user script that tries to highlight LLM generated text, and sometimes when I search for things more than half the results are garbage and entire articles (including the bios of the supposed professionals who wrote them!) get highlighted purple. One way to avoid them in a pinch, when the content isn't time sensitive, is to filter out anything from after November 1st 2021, but it's not ideal. I hope DuckDuckGo finds a way to filter the AI pages out.
> I recently made a user script that tries to highlight LLM generated text

How does that work?

It uses one of Mozilla's models: https://huggingface.co/fakespot-ai/roberta-base-ai-text-dete...

It has a pretty high false positive rate, but it reliably highlights AI generated spam websites and saves me from having to read them.
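
For anyone curious what the highlighting step might look like, here is a rough sketch, not the actual user script. It assumes you already have some `score_fn` that returns a probability in [0, 1] that a paragraph is machine-written (for example a wrapper around the detector model mentioned above). `score_fn`, `flag_paragraphs`, `page_looks_generated`, and all the threshold values are made-up names and numbers for illustration. A high threshold and a minimum paragraph length are one way to keep the false positives down.

```python
from typing import Callable, List, Tuple


def flag_paragraphs(
    paragraphs: List[str],
    score_fn: Callable[[str], float],  # hypothetical: returns P(machine-written)
    threshold: float = 0.9,            # assumed value, tune against false positives
    min_words: int = 20,               # skip short snippets, too little signal
) -> List[Tuple[int, float]]:
    """Return (index, score) for each paragraph that should be highlighted."""
    flagged = []
    for i, text in enumerate(paragraphs):
        if len(text.split()) < min_words:
            continue  # too short to score meaningfully
        score = score_fn(text)
        if score >= threshold:
            flagged.append((i, score))
    return flagged


def page_looks_generated(
    paragraphs: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.9,
    min_words: int = 20,
    min_fraction: float = 0.5,  # assumed: flag the page if half the scored text is flagged
) -> bool:
    """Treat the whole page as spam if enough of its scorable paragraphs are flagged."""
    scorable = [p for p in paragraphs if len(p.split()) >= min_words]
    if not scorable:
        return False
    flagged = flag_paragraphs(paragraphs, score_fn, threshold, min_words)
    return len(flagged) / len(scorable) >= min_fraction
```

Judging the page as a whole rather than single paragraphs is a design choice here: one misfired paragraph on a human-written page does not condemn it, while a content farm where nearly everything lights up still gets caught.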

Random number generator and vibes probably.
Not just that, the text results themselves get rewritten into AI slop: https://news.ycombinator.com/item?id=39808531