by talkingtab 558 days ago
Much of the trash on the internet has nothing to do with AI, but instead is caused by using AdSense-type funding. If you have a site and use revenue from ads as your funding, then the way to increase your revenue is to show more ads.

So add more fluff, move the actual thing people are looking for to the bottom, etc. Oh and add controversy, "The only authentic". Then add sex - a suggestive photo.

The thing is that AI can now generate these sites for you so no need to do anything yourself.

Finally pay Google to feature your ad - I mean recipe - and do other stuff to ensure that real recipes do not steal your traffic. :-)

4 comments

Speaking of recipes, I just tried this on a page with a quiche recipe. The original page was pretty much a novella built around a recipe. OP's tool worked perfectly. Nicely done.
There is a special-purpose tool for this:

https://www.justtherecipe.com/

which was mentioned here a while back:

https://news.ycombinator.com/item?id=42160959

Paprika is a fantastic recipe filing app that distills bloated pages into just the facts.
I’ll check it out.

I’ve just been asking chatgpt for recipes lately and it’s doing a great job. The other night I made béchamel sauce for the first time (cooking for 6 dinner guests!). ChatGPT nailed it.

I’m 2% sad for all the recipe websites it’s ripping content from. But then I remember what utter AdSense cancer they all are. “My mum made this recipe! You’ll never guess step 6!” While being plastered with 8 autoplaying videos on the edges of the screen. I hope those websites suffer a fiery death.

I mean, that's fair to some extent.

But on the other hand you could have just purchased any cookbook that covers the basics, instead of taking all this web-scraped content without attribution or compensation. I mean, look, I totally get it and I'm certainly guilty of this too - but let's not pretend that we're not basically stealing other people's content here. Much of the time those people running those recipe websites are just trying to cover their hosting costs and make a squeak of money on the side.

A friend of mine tried to set up a website that would host open-source recipes for people - he called it The Open Sauce - but in the end there just wasn't enough input from recipe creators.

Also, by the way, the top Google hit for bechamel is this: https://www.allrecipes.com/recipe/139987/basic-bechamel-sauc.... Few ads, and the recipe is at the top of the page. No life story in sight.

Apparently the recipe for bechamel sauce dates back to at least 1733. I think it’s pretty fairly in the public domain at this point. Those poor “content creators” are also just copying the recipe from someone else, just like ChatGPT is. I’m sure I even own multiple cookbooks which cover the recipe - it’s just easier and faster to ask ChatGPT than go hunting in my bookshelf.

I feel a little sorry for the good quality cooking websites out there. I’m just so burned by the bad ones that I’d rather skip the Google search. ChatGPT is also a straight-up better resource because I can ask it follow-up questions - “How much should I make for 6 people?” / “What is rue, anyway?” / “It’s been a few minutes and my milk isn't thickening. Am I doing anything wrong?” - etc. It’s an incredible cooking aid at my level of skill.

There are interesting parallels between LLMs and downloading pirated movies/shows.

In the first case it's a trillion-dollar business based on scraping the entire internet and sharing out a lossy, compressed version of the content with no attribution or financial contributions to the original creator. In the second case it's a shady, technically illegal practice of scraping DVDs or online video streams and sharing a lossy, compressed version without attribution or financial contributions to the creator.

Maybe Napster just needed VC backing to make it seem legit.

> no attribution or financial contributions to the original creator

This is an interesting idea, but I don't think it makes much sense to apply that logic to classic kitchen recipes. Who, exactly, is the original creator here?

The common recipes I'm asking ChatGPT about - crepes, homemade pasta or bechamel sauce - are hundreds of years old. We could extend your metaphor to say that the bechamel sauce recipe has been "pirated" by generations of cookbooks for hundreds of years. ChatGPT is just continuing the well-established tradition of recipe piracy, in order to bring these amazing recipes to the next generation of chefs.

After all, allrecipes.com didn't invent bechamel sauce either. Do they make financial contributions to the original creator of the recipe? I think not.

> Maybe Napster just needed VC backing to make it seem legit.

That's more or less what took Uber from criminal enterprise to mainstream.

I'm using Cookbook, which is similar: just paste the URL and let it import (it works flawlessly >90% of the time for me). Love the layout on tablet, I get ingredients and steps side-by-side which is super useful.
Also Umami - I've had the most success with the widest variety of sites using this extension. Also has an excellent iOS app.
Sure, but that garbage is what AIs have been trained on.
Eventually the internet will become nothing more than grey goo excreted by the end node of the GPT caterpillar.
People love them some Faustian bargain.
Let’s also add the fact that most sites cannot afford AI.
Have you looked into OpenAI's GPT-4o pricing lately? They charge $10 for 1M output tokens, or 500k tokens for the price of a $5/mo VPS.

If we assume a generous 2 tokens per word on average (OpenAI's rule of thumb is closer to 4 tokens per 3 words), that's still 5 full 50k-word novels' worth of text every month for the price of a single DigitalOcean droplet.
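
For anyone who wants to check that arithmetic, here is a rough Python sketch. It uses only the figures quoted in this comment ($10 per 1M output tokens, a $5/mo budget, 2 tokens per word, 50k words per novel); the function and constant names are made up for illustration, and the price is a snapshot that may well have changed.

```python
# All constants are the figures quoted in the comment, not live pricing.
PRICE_PER_MILLION_OUTPUT_TOKENS_USD = 10.0  # quoted GPT-4o output price
MONTHLY_BUDGET_USD = 5.0                    # price of a small VPS
TOKENS_PER_WORD = 2.0                       # deliberately pessimistic estimate
WORDS_PER_NOVEL = 50_000


def tokens_for_budget(budget_usd: float, price_per_million_usd: float) -> float:
    """Output tokens that a given budget buys at a per-million-token price."""
    return budget_usd / price_per_million_usd * 1_000_000


def novels_for_budget(
    budget_usd: float,
    price_per_million_usd: float,
    tokens_per_word: float,
    words_per_novel: int,
) -> float:
    """Novel-length texts that fit in the budget under the stated assumptions."""
    words = tokens_for_budget(budget_usd, price_per_million_usd) / tokens_per_word
    return words / words_per_novel


tokens = tokens_for_budget(MONTHLY_BUDGET_USD, PRICE_PER_MILLION_OUTPUT_TOKENS_USD)
novels = novels_for_budget(
    MONTHLY_BUDGET_USD,
    PRICE_PER_MILLION_OUTPUT_TOKENS_USD,
    TOKENS_PER_WORD,
    WORDS_PER_NOVEL,
)
print(f"{tokens:,.0f} tokens, {novels:.1f} novels")  # 500,000 tokens, 5.0 novels
```

Swapping in a lower tokens-per-word estimate only raises the novel count, so the 2-tokens-per-word figure acts as a floor on the output you get per dollar.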

This pricing cannot be sustainable in the long term, right? The LLM companies are currently bleeding money.
It’s not sustainable for anything other than static content generation. Any type of back and forth with an LLM is simply too costly.