HN Mirror


by Atlas22 1202 days ago

Yeah, ultimately this is what turned me off. Used the beta and loved it for as long as it lasted. When it ended it was hard to migrate elsewhere, but I couldn't convince myself it was a good idea to continue investing in a service that was entirely at the mercy of its direct competitors and shows no signs it's working to reduce that risk and cost.

Unless they mitigate those risks, they will only exist for as long as Google or Bing wants them to. The only ways they survive are:
- Mitigating those risks and costs (e.g., building/using their own index; well-designed caching could help)
- Staying small enough in terms of searches and users to be under the radar for Google and Microsoft
- Praying for the mercy of two of the most ruthlessly anticompetitive companies in existence (laughable)
- Convincing Google or Microsoft that they are worthwhile to acquire (but this kills the service for me anyway)

Price hiking +150% for the stated reason that my direct competitor increased my costs certainly shows the pressure is on and working as intended. On the off chance that Kagi devs or management read this, PLEASE find a way to isolate yourself from being totally reliant on Google, Bing, etc. Unless you are going for an acquisition exit from Google or Microsoft, it will kill your company eventually.

1 comments

jlund-molfese 1202 days ago

They have their own index[1]. It's not easy, when a bunch of sites block anyone who isn't Google or Bing. But this is the same strategy Brave seems to be pursuing, where they try to rely more and more on their own indices.

[1] http://teclis.com


mdaniel 1201 days ago

> The crawler is hybrid, using async python requests and puppeteer with uBlock Origin. The way detection works is we count the number of uBO blocked requests on the page, and if too many (threshold is set to 5), we kick it out, leaving only "clean" pages in the index.

Fascinating; cnn.com reports 47 on the front page, npr.org is at 16, developer.hashicorp.com is at 9. I don't think that metric is doing what they think it is, or rather maybe they're trying to target only savannah.gnu.org-style sites or something.
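To make the arithmetic concrete, here is a rough sketch of the filter as the quote describes it. This is not their code: the function names are made up for illustration, and I'm assuming "too many" means strictly more than the threshold of 5 (the quote doesn't say whether exactly 5 passes).

```python
from typing import Dict, List

# Threshold taken from the quoted crawler description.
# ASSUMPTION: a page is dropped only when the count is strictly greater than this.
BLOCK_THRESHOLD = 5


def is_clean(blocked_requests: int, threshold: int = BLOCK_THRESHOLD) -> bool:
    """Return True if a page would stay in the index (hypothetical helper)."""
    return blocked_requests <= threshold


def filter_clean(pages: Dict[str, int], threshold: int = BLOCK_THRESHOLD) -> List[str]:
    """Keep only the pages whose uBO-blocked request count passes the check."""
    return [url for url, blocked in pages.items() if is_clean(blocked, threshold)]


# Blocked-request counts reported in the comment above.
observed = {
    "cnn.com": 47,
    "npr.org": 16,
    "developer.hashicorp.com": 9,
}

print(filter_clean(observed))  # → []
```

Under that assumption none of the three sites survives the filter, which is the point being made: the cutoff drops mainstream news and documentation sites alike.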


Atlas22 1201 days ago

Good to know they are working on this.

Is there a legal issue with spoofing the user agent to be the Google crawler? Spoofing is certainly enough to get rid of article paywalls for 99% of sites I've encountered. At least last I heard, you can also work around the Cloudflare captcha by just routing requests through a worker on their service.
