Hacker News
Laid Off from Google Search?
118 points by _pjee 1245 days ago
I just got laid off from Google search after working on the ranking team for four years. Wanted to see if there's anyone out there that wants to build a search startup, especially if you're ex-Google. I have some ideas for how to build a search engine like 2010 Google, that's relatively low tech (i.e. achievable). If you're interested email me at username @ Gmail.
18 comments

2010 low tech Google is _exactly_ what I'm looking for.

Everyone else in the search space (even paid search - kagi I'm looking at _you_!) seems to be under the impression that it was abandoned because it didn't work but I'm pretty sure it was only abandoned because it didn't drive ad revenue.

Moar power to you mate, if I wasn't busy with other things right now I'd be beating down your door!

But it's no longer possible to operate that way, as they were getting gamed by SEOs.

Today may look bad for searchers like us but it could be a lot worse. It's a constant battle.

Kagi seems to be returning pretty good results, from what little I've tried it
I tried it earlier today, after seeing it mentioned in another thread. I gave it some queries where Google had recently frustrated me by returning nothing but affiliate marketing listicles.

It returned the same results, but with some of the listicles helpfully sorted into a "listicles" section. The rest of the results were also listicles.

It's entirely possible the content I'm looking for doesn't exist, drowned in a sea of SEO spam, but then I have little motivation to pay for a search engine to find it.

On that note, does anyone have a good site and/or strategy for "I would like to buy a [product] with [two features that are uncommonly found together]" searches? Those seem to be the hardest ones to get good results for, because of the number of "BEST [product] OF 2023" lists that aren't filtered to what I actually asked for.

When I get listicle spam I tack "site:reddit.com" onto my query and usually get pretty organic, opinionated results. But of course you might come up dry if what you're looking for is really obscure.
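For anyone who wants to script that trick, here is a minimal sketch in Python. It assumes the operator is written without a space after the colon (`site:reddit.com`) and that results are fetched from Google's ordinary search URL. The function names and the example query are made up for illustration.

```python
from urllib.parse import urlencode


def site_restricted_query(query: str, site: str = "reddit.com") -> str:
    """Append a site: operator to a query (no space after the colon)."""
    return f"{query} site:{site}"


def search_url(query: str, base: str = "https://www.google.com/search") -> str:
    """Build a search URL. `base` is an assumption; swap in another engine."""
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Hypothetical "product with two uncommon features" query.
    q = site_restricted_query("laptop with trackpoint and oled screen")
    print(q)
    print(search_url(q))
```

The same helper works for any forum you trust more than listicles; only the `site` argument changes.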
Be aware that Reddit is being "astroturfed" too, as they call it.
Really, are you sure?
Probably a good plan to ditch the @gmail mailbox.
Oh, c'mon! what's this fear-mongering?

Google would need to break criminal laws (and public trust) in order to look at someone's private gmail address for the purpose of snooping on their plans / their conversations.

They just have to do their typical "Sorry, your account has been suspended for violating our terms and conditions. Which one? We won't tell you" to screw with him.
And avoid communicating with anyone using a @gmail as well?
And not getting locked out of account.
I'd just say: Careful about non-competes and potential trade secret violations. Regardless of their merits, if Google wants to take you to court, they could bury you in legal costs before it ever goes to trial.
Given I'm going to take a completely different approach that doesn't use anything proprietary, and non-competes aren't enforceable in California, I'd assume this is ok?
You should also be aware that as long as you’re an Alphabet employee (read: receiving WARN act salary payments) you’re still bound by any IP agreements you signed to start employment.
The non-compete isn't an issue but the "trade secrets" could well be. At the very minimum, make sure you and everyone who joins you is 100% clean for the past 12 months in terms of never transmitting or storing any work-related content outside of Google corporate devices/emails.

Best of luck, the world needs you to succeed.

At the risk of repeating myself:

"Regardless of their merits, if Google wants to take you to court, they could bury you in legal costs before it ever goes to trial."

All I'm saying is be careful. It might be worth consulting with a lawyer to see if there's anything you need to do to protect yourself.

And honestly, it could be a non-issue. But it'd be better to hear that from legal counsel.

It will be ok, after you've spent millions in the court proving that it is ok. Shouldn't happen before you're a threat to them though, so maybe you'll have the money by then.
"Google accused of laying off employees and then suing them for violating non-disclosure agreements". Definitely not a PR nightmare.
Not any worse P.R. nightmare than laying off ten thousand engineers. Nobody cares about engineers man.
Honestly, I really doubt Google would care. It's not like people will stop working for them or using their products.
OP definitely has no clue about all this, or else they wouldn't have posted in a public forum like this.
Sorry to hear that. Kagi is doing great things. If you really want to work on search (that doesn't suck), maybe reach out to them?
Been hearing a lot about Kagi on here, might reach out. My concern is that they're trying to beat Google at its own game by being better at scraping, document understanding and question-answering.
From what I heard from their CEO (Vlad), no, they don't try to do that, and they don't even invest heavily in web scraping.
I don't think so. I was surprised to learn on their Discord that they rely heavily on paid results from Google's API and other search APIs. Not sure what they do with it but it's pretty amazing how they improve the results.
Don't think Google has a paid Search API. Bing does though with Azure.
From Kagi's FAQ (https://kagi.com/faq):

  > Where are your results coming from?
  >
  > *Our searching includes anonymized requests to traditional search indexes like Google* and Bing as well as vertical sources like Wikipedia and DeepL or other APIs.
  > We also have our own non-commercial index (Teclis), news index (TinyGem), and an AI for instant answers.
Emphasis mine.
Right, that's scraping results, not from an API that doesn't exist. What exists is Custom Search from Google and that's an API to search results on your own website.
I was looking into that, looks like they have an API but it's limited to 10k requests per day, and TOS says you can't blend the results with other sources.
Can you share any insight into how Google are deciding who to lay off? I would have thought that search is a pretty safe team to be on.
> Can you share any insight into how Google are deciding who to lay off?

I suspect you want to ask the decision-makers, not the victims, that question.

There is a Slack group related to search at http://www.opensourceconnections.com/slack

Great community. I would love to join something from the start, but now is not the right time.

Perfect. I joined, thank you.
Neeva.com is a bunch of ex-Googlers yah?
Interesting title and message. Seems like you're targeting xooglers specifically. Why? Are you surprised to be laid off?

Google ranking hasn't been very good in recent years. You must have some ideas to improve it that you plan to use for your startup.

Good luck.

Have you seen https://crowdview.ai?

It's a Forum Search Engine I've seen pop up on HN a few times in the past.

Thanks for the interest and feedback everybody. I won't be seriously having conversations about this or working on it until summer.
I would be careful working on this sort of thing during the next 2 months while you are still technically under the employment of the company.
Yeah I was thinking the same thing.
https://knuckleheads.club/ might interest you.
I’m a grad student in computer vision, would you want someone like me to help out, or are you looking for more established folks?
I'm probably not Google-level, but I have an idea to recreate Ask Jeeves. At first it brings back all the results and people check the box next to the ones most likely to fit their need. Using reinforcement learning it'll eventually know the best ones, then have GPT-3 summarize the top 10, and eventually just answer the damn question. The goal is ChatGPT+Google+Wikipedia (citations) to combat ChatGPT's misinformation issues.

Another idea is to create LLMs for every city's laws/legislation/codes etc. and basically be like an AI LexisNexis.

> at first it brings back all the results and people check the box next to the ones most likely to fit their need

My friends+colleagues would 100% write bots for that in less than a week. You'd be completely swamped with fake feedback to SEO/game the results. The bots will beat every CAPTCHA you have and have completely normal human-like client fingerprints.

Unfortunately, having watched so many mass-input rating systems fail by being gamed, I suspect that this strategy might only really work well if the reinforcement model was built and maintained independently for each user...which seems like an approach that could be very effective but which would be more difficult and likely costly to scale.
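To make the per-user idea concrete, here is a rough sketch in Python. It is not reinforcement learning proper: a simple per-user tally of check marks stands in for the model, and the class name, method names, and example URLs are all hypothetical. The point it illustrates is isolation: feedback from one account only ever changes that account's ranking.

```python
from collections import defaultdict


class PerUserFeedback:
    """Keeps a separate preference table per user (hypothetical design)."""

    def __init__(self) -> None:
        # user_id -> url -> net score from that user's own check marks
        self._scores = defaultdict(lambda: defaultdict(float))

    def record(self, user_id: str, url: str, helpful: bool) -> None:
        """Store one check mark (or rejection) for a single user."""
        self._scores[user_id][url] += 1.0 if helpful else -1.0

    def rerank(self, user_id: str, results: list[str]) -> list[str]:
        """Reorder base results using only this user's history.

        sorted() is stable, so ties keep the base ranking's order.
        """
        prefs = self._scores[user_id]
        return sorted(results, key=lambda url: -prefs.get(url, 0.0))


if __name__ == "__main__":
    base = ["a.example", "b.example", "c.example"]
    fb = PerUserFeedback()
    fb.record("alice", "c.example", True)
    for _ in range(1000):  # a bot floods feedback from its own account
        fb.record("bot", "b.example", True)
    print(fb.rerank("alice", base))  # ['c.example', 'a.example', 'b.example']
    print(fb.rerank("carol", base))  # ['a.example', 'b.example', 'c.example']
```

The scaling cost mentioned above shows up directly: state grows with users times URLs, and a new user starts with no signal at all.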
I'd imagine the search engine of the future will be like that. Let a summarization AI do the search and then report back on results, with citations.
Kagi has a technology demo that does exactly that, with varying rate of success:

https://labs.kagi.com/ai/contextai

Google search blows.
i love how you have to be smart enough to s/hacknewsusername/username to email this person
More than that, you need to be smart enough to write the sed substitution the other way around.
That is not a high bar.
Hacker News doesn't have DMs, right?
no but you can put your email in your profile
nope, you don't even require an email for a new account.
Could ChatGPT figure it out?
< What is the email meant in the following text

"Laid Off from Google Search?

21 points by gregw134 1 hour ago | flag | hide | past | favorite | 13 comments

I just got ...

...

... email me at username @ Gmail."

> The email mentioned in the comment is "username@gmail.com"

< What is the email I can use to contact this user? he didn't wrote it explicitly. If it helps, his username is gregw134.

> ... It only says "username@gmail.com" which is not a valid email.

Looks like he dodged this one!

Wait, aside from not penetrating the substitution, ChatGPT thinks username@gmail.com is an invalid email address?
Every day the expectations for ChatGPT get weirder...
Nice! I was just hoping to beat regex scrapers.
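A small Python sketch of why that works. The pattern below is only an assumed, simplified stand-in for what an email-harvesting scraper might run, and `example_handle` is a placeholder, not anyone's real username. The obfuscated text has no literal address to match, so a reader has to do the substitution by hand.

```python
import re

# Simplified pattern of the kind a harvesting scraper might use (assumed).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

post = "If you're interested email me at username @ Gmail."


def deobfuscate(text: str, hn_username: str) -> str:
    """Do the human step: swap in the poster's handle and normalize the address."""
    return text.replace("username @ Gmail", f"{hn_username}@gmail.com")


if __name__ == "__main__":
    print(EMAIL_RE.findall(post))  # []
    print(EMAIL_RE.findall(deobfuscate(post, "example_handle")))
    # ['example_handle@gmail.com']
```

The substitution needs the poster's handle, which sits outside the post text, so a scraper that only looks at comment bodies has nothing to work with.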
Search without LLM/GPT is just dead in the water now, isn't it?
It depends on your goal. If the focus of your search engine is question answering (like Google), ChatGPT is serious competition. Browsing the web to discover online communities and niche websites is an entirely different activity that ChatGPT doesn't compete with, and frankly Google doesn't do well either.
Yeah discovery is in such a messy state right now. I don't think anyone is doing it well.

"Give me facts about X" there are several decent options for.

"Give me quality reading material about topic X", there's like nobody even trying. Not that an answer doesn't exist for the query, just nobody is able to produce it.

This goes well beyond websites, discovery sucks for streaming video, for shopping, almost everything.

I think one issue is broadness. When a user issues a query there is often a choice of 1) low quality material that directly answers the specific question, or 2) high quality material about the subject more broadly. If someone asks "how do I fix water in camera iphone 3" and someone has set up a content farm for that exact question, Google will probably send the user there, rather than to a broader repair page from apple.com. There are tradeoffs in search ranking, which is why I think it's possible to make a search engine that takes the opposite side of the tradeoffs that Google has.

Nice job on marginalia.nu btw. Your site is a good reminder that there is a whole internet out there that just isn't able to rank on Google for various reasons.
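The broadness tradeoff described above can be sketched as a single weight in a scoring function. This is a toy illustration in Python, not how Google or any real engine ranks: the `Doc` fields, the linear blend, and every number are made up to show how shifting one weight flips the winner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    name: str
    query_match: float  # how exactly the page answers the specific query (0..1)
    quality: float      # overall quality of the source (0..1)


def score(doc: Doc, match_weight: float) -> float:
    """Linear blend: high match_weight favors exact answers over good sources."""
    return match_weight * doc.query_match + (1.0 - match_weight) * doc.quality


def rank(docs: list[Doc], match_weight: float) -> list[Doc]:
    """Return docs ordered best-first for the given weighting."""
    return sorted(docs, key=lambda d: score(d, match_weight), reverse=True)


docs = [
    Doc("content farm", query_match=0.95, quality=0.20),
    Doc("broad repair page", query_match=0.50, quality=0.90),
]

if __name__ == "__main__":
    print(rank(docs, match_weight=0.8)[0].name)  # content farm
    print(rank(docs, match_weight=0.3)[0].name)  # broad repair page
```

Taking "the opposite side of the tradeoff" amounts to running with a lower `match_weight` than the incumbent does.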

I asked ChatGPT for webcomic recommendations and it gave me imaginary ones. No trace of them anywhere on the web.
I can't decide if LLM/GPT is a seismic shift in the "search industry" or if it's just a gimmick. For now I'm considering it more of a gimmick that everyone is just very very excited about.
I think the problem is that the popular media coverage makes it hard for people to focus on the value this actually brings. I work in ED and it's astounding how many people fundamentally think the school is going to shut down imminently (clearer heads are prevailing... for now).

But the tool clearly has some value, which may even reveal itself to be quite significant. I think a useful metric is to look at what happened with Copilot, a domain where there has been a lot less media frenzy and in which, arguably, this kind of model could have much more readily made a tremendous impact. I think even in the dev community we went through a small period of folks thinking this was going to be earthshattering, followed by a natural cooldown, followed by a probably much saner interpretation that it is a tool, and in the right contexts might actually be useful.

yeah I think you're spot on. I don't think GPT is earth shattering but it's also hard to tell with all the media frenzy about it.
I've always wondered: Is LLM/GPT already how Google is generating answers to "People also ask" questions or are they using a prior, or unknown, ML approach for this? I ask this because the quality of answers linked to these queries is often so astoundingly awful that I can't believe this is in production.

The question is often very relevant. I'll readily admit that I have a high engagement with that accordion feature, but I can't believe how often I open it to find a disappointing text selection or even page.

And if it is, I wonder where that leaves JSON-LD schema in all of this. Schema is the perfect signal for something like this, but I'm afraid, and I believe I can speak representatively about this from an agency perspective, the trust is kind of broken for that model.

Too many people, myself included, are uneasy about how much information to give to Google, since they have an insidious aim to use it to get information to users faster regardless of how much impact it has on a business's ability to remain competitive. Yet, on the other hand, I sympathize with the idea that the more Google reverses course on this and leans into embracing SEO industry-driven control, the more they risk compromising the product. It makes me think that Google has reached a theoretical maximum level of product optimization.

I think Google has built up a knowledge base that they query for those answers. And yes, it's embarrassingly wrong so much of the time
Definitely not a gimmick.

A lot of questions I would have googled, I now ask ChatGPT.

It's not always reliable of course but it's often quicker and somewhat more extensive and specific when I need code snippets for example.

I think they're complementary technologies. It will definitely eat up search market share.

It's almost as if yesteryear encyclopedias were having their revenge.

If it's useful it isn't a gimmick.
Search is one of the least interesting applications of LLMs. Most of the complaints about search are self imposed, not technical problems. Why are we still talking about a 20 year old problem that is basically solved?
Because it’s not solved for users?
Hi Greg,

I would love to hear about the problem space you are exploring. Here is my linkedin: priyankaja