by literallyaduck 1625 days ago
Go to image search and type in "white people" without quotes.

Now repeat the search but with "black people" without quotes.

Try the same for "white couple" without quotes.

Try the same for "black couple" without quotes.

Do you have any insight into this phenomenon?

9 comments

For the first two, I think that black people use the phrase “white people” more than white people use the phrase “black people”, so results for “white people” include black people because the search ranking uses quotes associated with the people pictured. (Whether things people have said should factor into image search is another question.)
But if you just search for “person” or “couple” you get results showing mostly white people and couples. I don’t think what you’ve observed is saying what you think it is…
Well, for the first search (white people), I can see that the first two images of black people have the words "white people" in them, specifically "Opinion: white people know racism..." (I haven't clicked to see the rest) and "Why I'm no longer talking to White...", where the next word is "people". It's a Guardian article, so that explains the high ranking, I guess. What is your point?

on edit: grammatical correction

"Do what I mean not what I say" becomes Very Interesting™ when politics are involved!
tl;ds:

white people -> images of mostly blacks

black people -> images of mostly blacks

white couple -> mostly white couples with a decent percentage of mixed ethnicity couples

black couple -> black couples

Try green people, blue people, yellow people. Even 'transparent people' returns relevant results.
I'd say 90% of the blue people are men!
Image search does not classify image contents. It uses the site text to rank the images on that site. Do a Google (not image) search for "white people" and you'll see that this phrase is mostly used in pages that are in fact about racism and therefore likely to contain images of black people.
It most certainly does, and has done so for at least a year or two, or possibly even a bit longer (can't remember when I first noticed this behaviour).

You can e.g. do a search along the lines of site:<domain of online clothing store> <hair style/hair colour/…>, and at least for the most common and recognisable kinds of hair styles, it will actually return relatively reasonable image results, even though online shops most certainly don't have the habit of annotating the hair styles worn by their models on their product pages.

Along the same lines, Google is now also in the habit of OCRing any text content it can find in images and indexing that for search, too.

It's true that it'll still also take the text surrounding the image into account, but it's no longer true that image search is only based on that.

I just did the 4 searches you suggested but didn’t see anything noteworthy. What did I miss?
Nothing that can’t be easily explained. Of course, anyone coming up with a reasonable explanation is being downvoted by the “critical thinkers” of HN, who can only provide low-effort quips instead.
Not completely sure what this “phenomenon” is. There are a few things I can imagine you are insinuating, but they all have simple explanations, so I’m not sure.

Don’t disagree with the main theory that search quality is deteriorating. I have to use increasingly contrived queries to get anything but bullshit blog spam, and indexing seems really odd at times.

all I see is pinterest spam

/s