Hacker News
by blinkingled 612 days ago
If anyone knows how to run a search engine today that is sustainable, provides quality results without being influenced by advertisers or SEO hacks, and stays secure against infected pages and the like, they are not on HN and are probably too busy implementing their idea. Kidding aside, as a thought exercise: what can people think of that would make a modern search engine work?

I am asking because I get the feeling that there's not much you can do, and that Google's way is the only way to do it and still stay in business. How much they rely on ad revenue, and how much more they have to compromise, is another story. Perhaps Google should think about offering better search to paying Google One customers?

The biggest problem is inertia, and that would have to be solved. One way is that Google search gets useless and people have the option to move to something better. I doubt this will happen. The other way is that someone builds not just a marginally better product but a 10x better one and people move there. I don't see that happening anytime soon either.

3 comments

We could have an “organic” search engine which only positions itself for “normal” webpages, and defines normal. For example, the webpage should have a title and max 4 paragraphs, on the topic, and shouldn’t include storytelling. Technical problem-answer oriented pages should only contain various aspects of the problem, but no storytelling either. Fewer phrases to index, more density of keywords, easier to index. And maybe we should come back to three rules: 1. speed, 2. content being in the original HTML, 3. React hydration not diluting the HTML.

It doesn’t matter whether its artificially defined rules are good. People would still enjoy going there more than to Google, because that is where you’d find the organic pages.

It doesn’t matter that Google would also index them. It’s like Panamax: it defines rules, but others can use those rules too.
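
To make the proposed rules concrete, here is a rough sketch of how a crawler might test a page against two of them: a title plus at most 4 paragraphs, with the content present in the original HTML rather than rendered by JavaScript. The names `PageStats` and `is_organic` are made up for illustration, and topic relevance, storytelling and speed are not checked at all.

```python
from html.parser import HTMLParser

MAX_PARAGRAPHS = 4  # limit taken from the comment above


class PageStats(HTMLParser):
    """Collects the title text and counts non-empty <p> elements in raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = 0
        self._in_title = False
        self._in_p = False
        self._p_text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            self._close_paragraph()  # an unclosed <p> ends when the next one starts
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self._close_paragraph()

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self._p_text += data

    def _close_paragraph(self):
        # Only paragraphs that contain visible text are counted.
        if self._in_p and self._p_text.strip():
            self.paragraphs += 1
        self._in_p = False
        self._p_text = ""

    def close(self):
        super().close()
        self._close_paragraph()  # count a paragraph left open at end of input


def is_organic(raw_html: str) -> bool:
    """True if the un-rendered HTML has a title and 1 to MAX_PARAGRAPHS paragraphs."""
    stats = PageStats()
    stats.feed(raw_html)
    stats.close()
    has_title = bool(stats.title.strip())
    return has_title and 1 <= stats.paragraphs <= MAX_PARAGRAPHS
```

Because only the raw HTML is parsed and no scripts are run, a page that is just an empty root element filled in by JavaScript has zero paragraphs and fails the check, which is roughly what rule 2 asks for.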

No need to reinvent the wheel. Just exclude from the index any websites which trigger the behaviour of uBlock Origin. Job done.
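
A very rough sketch of that idea, assuming "triggers uBlock Origin" is simplified to "loads a script, image or iframe from a blocklisted host". Real filter lists have a much richer syntax than plain hostnames, and the hosts in `BLOCKED_HOSTS` below are invented placeholders, not entries from any actual list.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical stand-in for a real filter list; these hostnames are made up.
BLOCKED_HOSTS = {"ads.example", "tracker.example"}


class ResourceCollector(HTMLParser):
    """Collects the src URLs of scripts, images and iframes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in {"script", "img", "iframe"}:
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


def is_blocked(host: str) -> bool:
    """True if the host or any parent domain of it is on the blocklist."""
    parts = host.lower().split(".")
    return any(".".join(parts[i:]) in BLOCKED_HOSTS for i in range(len(parts)))


def triggers_blocker(raw_html: str) -> bool:
    """True if the page references at least one resource on a blocked host."""
    collector = ResourceCollector()
    collector.feed(raw_html)
    # Relative URLs have no hostname and are treated as first-party.
    hosts = (urlparse(u).hostname or "" for u in collector.urls)
    return any(is_blocked(h) for h in hosts)
```

An indexer could then skip any page for which `triggers_blocker` returns `True`. Whether to drop the single page or the whole site is a policy choice this sketch leaves open.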
They're all doing proprietary work for LLMs.

The question is how you can be useful enough to the public without being so useful that an LLM maker will snap you up.

Separate the ads from the content into individual columns, and never, ever shall the twain meet.
Ads aren't the real problem though, SEO and dark patterns are. Ads, even when mixed in with legitimate content, are labeled. A user can reasonably infer an ad website's goal when they click on it: to get you to pay for something.

The real problem is that the system enables bad actors to do the same without getting the Ad label, by gaming the system to outrank legitimate (or free) sources of information.