by travisgriggs 1611 days ago
I totally get this. Back in the day when I was a kid, we went to the local library and read about the world. When the librarians weren’t serving me by “checking out books” to me, they were busily putting new and improved titles on the books in receiving.

/s

Seriously. Google is starting to feel less like the librarian of the net (we index the world) and more like the Truman show: we craft your reality.


It’s the ads. The way Brin and Page phrased it in their 1998 paper, they considered ad-oriented search engines to be lower quality. They were going to be more academic. They thought that there was lots of user data to mine in search…for academic purposes. Then innovation #2 at the actual startup was the ad auctions and that was the beginning of the end, all the way back at the beginning.

I’ve recently read a lot about hedge funds, and it’s astounding how many scientists literally say, “I don’t think hedge funds add value to society, I wouldn’t work there.” And then the firm slides this check across the table, and they didn’t even realize a single check could have that many zeroes, and they join the firm and stay forever. That’s what happened with Google and all the rest.

Agreed. The industrialization of ad tech has been a loss for humanity. It’s a runaway mechanization at this point.

What I don’t understand is why we don’t tax it. If an industry generates lots of wealth but has a questionable impact on society, the “f(r)ee market” West’s response has usually been to throw a stiff vice tax on it. It doesn’t make the vice go away, but it puts a governor on its excess and redirects some of the spoils to projects which hopefully are net positive.

Doesn't the tax usually come when the consensus about the societal harm carries more weight than the money produced by that harm?

Or at least enough weight to be competitive. Sin taxes have a way of permanently tying the sin (at some reduced level) to the general budget.

I don't think we're there yet. People can get plenty mad at "tech" without connecting the ad-tech dots.

Well, you should try to establish the societal cost of the negative externality and then tax at that level. The idea isn't to destroy the thing but to make its price reflect its actual cost.

Edit: "then cost" => "then tax"

It's the exact same thing with almost every "technology" company out there today.

We're sinking our best and brightest (and also plenty of perfectly useful and adequate) talent into getting people to look here, buy something they don't need, or press button.

It's comical to contrast that with the same people who pretend climate change is an existential crisis. Meanwhile, so many scientists and engineers idealistically interested in that leave for software-related subjects where they'll make 10X the money making the problem worse.

One obvious solution is to pay them more.

If you're not the principal investigator, a NIH grant will pay someone with a PhD + 7 years of experience...$65,292 with pretty weak guarantees on job security (etc).

"Then innovation #2 at the actual startup was the ad auctions..."

I don't think Google quite invented those. GoTo/Overture invented ad auctions and pay-per-click, but missed out on patenting them. Google did improve on the idea with second-price auctions.

https://slate.com/business/2013/10/googles-big-break-how-bil...
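For anyone unfamiliar with the mechanism, here is a minimal sketch of a sealed-bid second-price auction for a single ad slot. It is a textbook simplification, not Google's actual system: the function name, the `reserve` parameter, and the tie-breaking rule (first bidder listed wins) are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def second_price_auction(
    bids: Dict[str, float], reserve: float = 0.0
) -> Optional[Tuple[str, float]]:
    """Return (winner, price) for one slot, or None if no bid meets the reserve.

    The highest bidder wins but pays the second-highest eligible bid
    (or the reserve price when there is no other eligible bidder).
    """
    # Keep only bids at or above the (hypothetical) reserve price.
    eligible = {name: bid for name, bid in bids.items() if bid >= reserve}
    if not eligible:
        return None

    # Sort highest bid first; sorted() is stable, so ties go to the
    # bidder that appears first in the input mapping.
    ranked = sorted(eligible.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else reserve
    return winner, price
```

The point of the design is that the winner's payment does not depend on their own bid, so in this single-slot setting bidding your true value is a reasonable strategy.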

> we craft your reality

As mentioned above. It's also the AI.

Ads are not the fundamental problem. The fundamental problem is tracking. More on that here and about search: https://www.mojeek.com/support/ads/

Ads are a fundamental problem. They skew the incentives.

The search engine could, for example, give semi-poor results, making the person search again, increasing ad impressions.

An ad-supported search engine would also prioritize pages with ads that are also conveniently sold by the search engine.

As a user, I want a search engine to give me the best page with the fewest searches. An ad-supported search engine wants me to view more ads and click on them. Those are, if not orthogonal, often in conflict.

>An ad-supported search engine wants me to view more ads and click on them

Is this really the case? Assume a pay-per-click model and rational, competent advertisers. More clicks would increase their costs and reduce the value generated per click. The advertisers would lower their maximum cost per click. This would limit the revenue of the ad-supported search engine to the previous level (from before introducing bad search results).

It is possible that more clicks (generated by tricks and bad search results) produce more revenue for the advertisers. This would (slightly) benefit both the advertisers and the search engine.

In the end and in the long run, the incentives of ad-supported search engines are aligned with those of their customers (advertisers) if the above assumptions are met.
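The arithmetic behind that argument can be sketched as follows. This is a toy model under stated assumptions, not a description of any real ad market: the advertiser is assumed to bid exactly its expected value per click, the number of conversions is assumed to stay fixed when junk clicks are added, and the function name and numbers are made up for illustration.

```python
def search_engine_revenue(
    clicks: int, conversions: int, value_per_conversion: float
) -> float:
    """Revenue under pay-per-click when the advertiser caps its bid at
    the expected value of a click (an assumed, idealized advertiser)."""
    conversion_rate = conversions / clicks
    # A rational advertiser pays at most what a click is worth on average.
    max_cost_per_click = value_per_conversion * conversion_rate
    return clicks * max_cost_per_click


# Same 50 sales, but the second scenario doubles the clicks with junk traffic.
honest = search_engine_revenue(clicks=1000, conversions=50, value_per_conversion=20.0)
padded = search_engine_revenue(clicks=2000, conversions=50, value_per_conversion=20.0)
```

In this model both scenarios yield the same revenue, because the lower bid per click exactly offsets the extra clicks. The conclusion only holds as long as the assumptions do.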

Hedge funds are totally over. What are you talking about?

https://www.investopedia.com/managing-wealth/hedge-fund-over...

So you are talking about a different source than the one you linked?

Because this is the summary of the article.

"Is the hedge fund over? It's difficult to say."

The idea that ads affect Google's search ranking just isn't true. There are purposeful barriers between ads and search at Google to prevent this, such that the ads team can't even file bugs with search.

> the ads team can't even file bugs with search

I don't think it would happen at the low level of engineers filing bugs. It happens at the highest levels of management, where the main concern is corporate profits.

Even if there's no explicit cooperation or algorithmic link between the ads and search divisions, everyone on the search management team knows that search is a huge, expensive operation that makes no money on its own. Advertising is what pays for their salaries, bonuses, operating expenses, etc., and you can bet that they make their executive decisions accordingly.

There’s a grain of truth here, in a bit of a tangent: librarians classify all books using a system like Dewey Decimal or Library of Congress Classification.

While not adjusting titles, librarians do have some influence on how a book is classified and thus filed/organised within the library. Check out the wiki article on Dewey[1] for the various options for homosexuality, which has been assigned numbers under several areas, including mental illness! Depending on the library system’s leanings you may still find it there, or in the section for sexual disorders, or hopefully in the sexual relations area. (Disclaimer: I just used this as an easy example because it’s on Wikipedia)

1. https://en.m.wikipedia.org/wiki/Dewey_Decimal_Classification

The Dewey Decimal classification system is ridiculously flawed, and no self-respecting library uses it these days (unless it always has, and hasn't got around to re-organising). Even my school's little one-room library didn't, something I found annoying at first, but came to appreciate.

Disagree that it’s ridiculously flawed. It has issues like any system, but it still works well the majority of the time.

> no self-respecting library uses it these days (unless it always has, and hasn't got around to re-organising).

The vast majority of library systems have been around long enough that Dewey (or LCC) was the de facto choice. Just checked a few like the British Library, the French National Library, and all the other libraries I’ve looked up now in London, all Dewey.

"Libraries in the United States generally use either the Library of Congress Classification System (LC) or the Dewey Decimal Classification System to organize their books. Most academic libraries use LC, and most public libraries and K-12 school libraries use Dewey." [1]

[1] https://www.usg.edu/galileo/skills/unit03/libraries03_04.pht...

What are some alternative systems? I'd expect that any categorization system for content needs to make subjective choices.

Library of Congress is the standard for academic and professional institutions, at least in the US.

Those numbers were added as a consequence of the books that needed to be classified in the 1930s, and now that there are books that don't belong in the category there are new numbers.

It still comes down to librarian interpretation. Sometimes they will just defer to another source, like the national library of their country, or the publisher's recommendations, but at least in my experience working part time in a library many years ago, the librarians out back doing the processing and cataloguing would refer to Dewey index guides and also make judgements based on the nature of the book (e.g. a mostly practical vs. theoretical nature would be the difference between a 6xx filing and somewhere completely different).

> Check out… the various options for homosexuality, which has numbers for it including under areas including mental illness!

Classifying a new technology is another major area where the original taxonomies need to be extended in order to successfully index material. The internet, for example, didn’t exist when LC/Dewey were originally defined.

Last time I was in a public library books about the Internet were next to books about UFOs!

Yes and no. Anecdotally, most of the SEOs and ppl who do SEO "part time" (e.g., ecomm store owner) still don't understand the foundation of modern SEO.

1. Google doesn't care about the sites. The sites aren't Google's customer.

2. More importantly, the person doing the search is the customer.

Unfortunately, most sites believe SEO is about them. They can improve how they present themselves but the "transaction" is not about them.

Google, serving ads aside, needs to maximize customer satisfaction or run the risk of losing a customer.

It's worth repeating: Google doesn't care about the sites.

If Google believes a site's content is a good fit for maximizing customer satisfaction, but the title isn't optimal then it makes perfect sense Google would want to optimize the title, if the title is the "gateway" to a happy customer.

Whether that's right or wrong, IDK. Whether it actually helps, again IDK. But from a pure relationship / business perspective it makes sense.

1. Most SEOs don't care about what Google cares about. The sites are their customers.

2. SEO is about the sites and Google and other sources are just that: means to an end.

It's all a matter of perspective.

No actually it's not. You're (gravely) mistaken.

The customer is the person doing the search. The sites aren't viewing the ads. The sites aren't clicking on the ads. *That* is Google's #1 source of revenue. Full stop.

And thus, as originally stated and supported, too many practitioners of SEO, with that wrong lens, continue to misunderstand Google.

Put another way, Google doesn't change the title for the benefit of the sites. There's simply no biz model / source of revenue to support that idea. None.

Given how site owners habitually attempt to distort reality with tag stuffing and other bullshit metadata, what do you expect? Reality is not what is printed on the tin.

Should I ask Walmart to kindly start relabeling products on their shelves because what’s on the tin is rarely as good for me as what the maker purports?

Maybe that’s what we need. An FDA metadata label for every website served, kinda like the fav-icon, but useful.

- Readable word count (protein)

- Ad count (fats)

- Image count (carbs)

- Embedded script size (the list of nasty sounding chemicals it contains)

- Average data transmitted (sugars)

- etc

Must be shown in black text on white background with a black border. Sorry dark mode guys.
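As a rough sketch of what such a label could look like as data, here is one possible shape. Every field name and the plain-text rendering are made up for illustration; no such standard exists, and how the numbers would be measured is left open.

```python
from dataclasses import dataclass


@dataclass
class SiteLabel:
    """Hypothetical "nutrition facts" for one page, mirroring the list above."""
    readable_words: int   # protein
    ad_count: int         # fats
    image_count: int      # carbs
    script_bytes: int     # the nasty-sounding chemicals
    avg_bytes_sent: int   # sugars


def render_label(label: SiteLabel) -> str:
    """Render the label as bordered plain text (black on white is up to the viewer)."""
    rows = [
        ("Readable words", str(label.readable_words)),
        ("Ad count", str(label.ad_count)),
        ("Image count", str(label.image_count)),
        ("Script bytes", str(label.script_bytes)),
        ("Avg bytes sent", str(label.avg_bytes_sent)),
    ]
    # Inner width: the longest name plus value, with at least three dots between.
    width = max(len(name) + len(value) for name, value in rows) + 3
    border = "+" + "-" * (width + 2) + "+"
    lines = [border]
    for name, value in rows:
        dots = "." * (width - len(name) - len(value))
        lines.append("| " + name + dots + value + " |")
    lines.append(border)
    return "\n".join(lines)
```

Calling `print(render_label(SiteLabel(1200, 7, 14, 850000, 2300000)))` prints a five-row box with the names on the left and the values right-aligned.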

This might actually be the killer app for AR. Reviews of products as you look at them on the shelf.

Rather than showing the actual reviews, just lower the color saturation for lower reviewed products. So high reviewed products would pop in a sea of gray scaled items.
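The saturation idea is simple enough to sketch per pixel. This is an illustration under assumptions, not how any AR product does it: the function name, the 0 to 5 rating scale, and the linear blend toward a luma gray (using the common 0.299/0.587/0.114 weights) are all choices made here.

```python
from typing import Tuple


def dim_by_rating(
    rgb: Tuple[int, int, int], rating: float, max_rating: float = 5.0
) -> Tuple[int, int, int]:
    """Blend an 8-bit RGB color toward its gray equivalent as the rating drops.

    A top-rated product keeps its original color; a zero-rated one is fully gray.
    """
    # Clamp the rating to [0, 1] so out-of-range inputs stay valid colors.
    keep = max(0.0, min(1.0, rating / max_rating))
    r, g, b = rgb
    # Perceptual gray level of the pixel (standard luma weights).
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    # Interpolate each channel between gray (keep=0) and the original (keep=1).
    return tuple(round(gray + keep * (channel - gray)) for channel in (r, g, b))
```

Applied to every pixel inside a detected product's outline, this leaves well-reviewed items in full color and washes out the rest.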

Sounds like something out of Black Mirror, but could be interesting.

I can’t wait for “This product is awesome 5/5 btw I don’t own it” and “My favorite 1/5” in AR

Vivino kinda does that but for wine only. You can scan any bottle with it and it shows you its rating based on user reviews.

Sounds like it might either decrease sales, or increase the manufacturing of fraudulent or shill reviews.

I'd expect Google to downrank sites that are trying to manipulate the system. Not rewrite them.

How could that work out? Low quality sites usually have more juicy ad spots?

More seriously: incentives are stacked against search quality these days. Poor results means more trips into ad laden wastelands, and more returns to the ad laden search results page.

Giving people the result up front and center would directly affect quarterly profit I am afraid.

At least this is the model that makes most sense to me.

The next most probable is that machine learning is already out of control and the people who created it have left.

Edit: wild speculation of course.

>...they were busily putting new and improved titles on the books in receiving.

The book titles are unchanged (when you visit the site). This is just the librarians adding synonyms and/or simplifying titles in their catalog so that it is "Dr. Strangelove" rather than "Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb".

That's kind of the fundamental insight here though, isn't it? Carnegie developed his libraries as a philanthropic endeavor, with the aim of supporting meritocracy in society. Google developed their library with the aim of making a shit-ton of money from advertising.

Google never was a suitable candidate for the world's authoritative librarian. Unfortunately, we'll probably need another Carnegie to displace them.

>and more like the Truman show: we craft your reality.

It should be more like an assistant: here are some boring tasks/questions, find out everything about them, summarize it for me and present me with my options. I don't really want to search or find something. I want to get things done and questions answered.

> Seriously. Google is starting to feel less like the librarian of the net (we index the world) and more like the Truman show: we craft your reality.

"Feels"? This has been the reality for many years.