by solso 2243 days ago
Cliqz never masqueraded as anything, except in your odd perception of the world. Advertisement as implemented today is a privacy hazard, but there are other ways to do it, client-side, which is what Cliqz attempted. The same goes for data collection: you can collect everything and put the privacy of the users at risk, or collect only signals that cannot be record-linked, which is what Cliqz did.

Cliqz search was never on par with Google -- I built parts of it -- but was getting there little by little. To be more precise, it was getting good enough not to be a factor. That has some merit given the totally independent index (not relying on Bing under the hood).

Brave, the same as Cliqz, is trying its best to offer an alternative. If you think you can do better, please do so. Believe me, I'll root for you regardless of my opinion about you (we crossed paths in the past). Why would I support you, even though that does not mean I use what you build? Because we are in need of having plurality on the Web, the more the better. Unlike you, I do not see the point of speaking bullshit, not sure if out of ignorance or ill-will, don't know, don't care.


Your employer being dishonest does not necessarily qualify you, so I'm not sure why it is being taken personally. There are also several incredibly talented people at Google who do awesome things, some even fight for human rights and privacy, but that does not absolve the atrocities Google as a company commits against the human race.

Nor does your team's work on search engines absolve Cliqz of attempting to build a company that is based on pervasive user tracking, anonymized (deanonymizable in the future) or otherwise. I'd rather not address the rest of your personal attacks.

> Advertisement as implemented today is a privacy hazard, but there are other ways to do it, client-side, which is what Cliqz attempted.

https://en.wikipedia.org/wiki/Cliqz#Integration_with_Firefox: "According to the Firefox support website, this version of Firefox collects and sends data to the Cliqz corporation including text typed in the address bar, queries to other search engines, information about visited webpages and interactions with them including mouse movement, scrolling, and amount of time spent; and the user's interactions with the user interface of the Cliqz software. This data is tied to a unique identifier allowing Cliqz to track long-term performance."

Yep, real "client-side", eh?

Even if it was actually client-side, that's cold comfort; the data's still being collected and presumably persisted, and there's no telling whether or not some future software update will make that locally-stored data not-so-locally-stored anymore.

This claim on Wikipedia is factually incorrect: "This data is tied to a unique identifier allowing Cliqz to track long-term performance."

Thanks for noticing it, we will create an issue.

UUIDs only apply to telemetry, which is not the data being described in the paragraph: queries, scrolling, amount of time spent, URLs, etc. For this kind of user data (HumanWeb) there is no UUID, neither implicit nor explicit.

There are plenty of papers on the topic, independent audits, the code is open-source and the data can be inspected. HumanWeb data is 100% record-unlinkable, we have no way to know if two messages received come from the same person or not.

> This claim on Wikipedia is factually incorrect: "This data is tied to a unique identifier allowing Cliqz to track long-term performance."

That claim comes directly from Mozilla's support page on the subject¹:

> Firefox shares the following data with Cliqz to provide functionality and improve performance of the Cliqz feature for everyone:

> - Search queries & webpage data: This includes text as you type in the address bar, queries you send to certain search engines, and data about the webpages you visit and interactions with those pages, such as mouse movements, scrolls, and time spent.

> - Interaction data: This includes your interactions with specific fields and buttons in the Cliqz feature. This data is tied to a unique identifier allowing Cliqz to understand performance over time.

So, if that's "factually incorrect", you should take it up with your business partners.

> There are plenty of papers on the topic, independent audits, the code is open-source and the data can be inspected. HumanWeb data is 100% record-unlinkable, we have no way to know if two messages received come from the same person or not.

For now. Things can always change, and promises can always be broken. It'd be a lot easier to trust Cliqz if it wasn't collecting such data at all, let alone sending it to remote servers with a pinky promise that it's anonymized.

----

¹: https://support.mozilla.org/en-US/kb/cliqz-recommendations-f...

> we have no way to know if two messages received come from the same person or not.

This is accurate. They partnered with us at FoxyProxy to prevent browser telemetry from revealing users' IP addresses and other metadata.

These guys are above board and even if there may have been a problem in 2017 with Firefox, that was no longer the case in 2018, 2019, and 2020. They bent over backwards and jumped through many hoops to hide their users' identity. They were very interested in solving the engineering problem around anonymization. I know this from first-hand experience.

This is a loss larger than many people realize. There are so few companies with such integrity and who put their users first, above profits or shareholders.

There were no problems in 2017 or before; we were doing exactly the same during the Firefox times (we went through security and privacy audits). Data collection is and always was safe wrt privacy.

Why the ruckus then? Because some assume that if data is sent, privacy is compromised, period. They do not know how to do it, and they assume it's impossible. Instead of checking the claims for themselves (code is public, data can be inspected, documentation, etc.) they prefer to stick to their belief system, which is more comfortable and does not imply hard work. The press release from FF -- written by one of these people with a lot of biases and published without review -- did not help as it was misleading.

We made a big mistake back then. Instead of rebutting it, we chose to ignore the FUD assuming that facts would prevail. They did not.

Sadly, the community is "scared". We have been congratulated and lauded by everyone who checked our systems, but never endorsed in public: there is little to gain and a lot to lose (you are getting a sneak preview right now).

Sad story, extremely frustrating too, but there is nothing we can do now.

> Why the ruckus then? Because some assume that if data is sent, privacy is compromised, period. They do not know how to do it, and they assume it's impossible. Instead of checking the claims for themselves (code is public, data can be inspected, documentation, etc.) they prefer to stick to their belief system, which is more comfortable and does not imply hard work.

If my eyes rolled any harder I'd likely pull a muscle.

Let's dissect this a bit:

> Because some assume that if data is sent, privacy is compromised, period.

It ain't about it being sent (though that's bad, too). It's about it being collected at all. Cliqz collects and aggregates my data somewhere, and that is therefore a violation of my privacy, even if (for now) it's on my local machine (I could certainly routinely delete that collected data, much like I do with cache and cookies, but then what's the point of using Cliqz in the first place?).

> Instead of checking the claims for themselves (code is public, data can be inspected, documentation, etc.)

I have checked the claims for myself (to the best of my ability). None of them address the very real concern of the aggregated data being, you know, aggregated. Just because it's on my local machine doesn't mean it's guaranteed to stay that way; every second it's on my machine is a liability that anyone who's privacy-conscious would want to eliminate (and anyone who's not privacy-conscious doesn't care about).

Like, there's no argument that Cliqz's HumanWeb is at least less evil than traditional tracking systems, but it still relies on aggregation of data, and that is still a massive privacy hazard. Not to mention that the data that is sent¹ is still rich with datapoints that could be used for fingerprinting (the papers seem to suggest there are "heuristics" to detect and anonymize this, but said papers are pretty light on detail, and source code is meaningless since we don't know if it's what's actually running server-side). And also not to mention the rather sketchy distribution methods, like piggybacking on .NET downloads via chip.de in a manner that's been a hallmark of spyware since Y2K.

> they prefer to stick to their belief system, which is more comfortable and does not imply hard work.

"Am I out of touch? No, it is the children who are wrong."

----

> Sad story, extremely frustrating too, but there is nothing we can do now.

Not with that attitude. The search engine technology y'all developed is pretty interesting from a technical standpoint, and could be put to use (I'm sure DDG would be interested in adding it to their mix, or perhaps Ecosia could use it to diversify their Bing/Yahoo results the way DDG does with their in-house crawler). Same with Ghostery's more efficient network request blocking engine² (though it seems like Ghostery's development is still ongoing, no?), which could be useful in other ad and tracker blockers. Neither of these are much in the way of money-makers (well, maybe the search one is, if y'all license it), but it'll at least help make the best of a lousy situation.

I get that it sucks - I've similarly felt the pain of a product into which I've put my blood, sweat, and tears ultimately failing. It's easy to write off the detractors and critics as simply uninformed masses who just "didn't understand how great of a product we have". It's harder to admit that the product wasn't great, or the name was terrible, or the market wasn't as big as anticipated, or what have you.

I'm confident that, being the bright and enthusiastic people y'all are, you'll find your footing again. Just, um, try to come up with a name that doesn't scream "adware" like "Cliqz MyOffrz" next time, lol. And maybe instead of writing off your criticisms as "FUD", actually examine why those criticisms persist and what you can do to better address them.

----

¹: https://cliqz.com/en/whycliqz/transparency

²: https://whotracks.me/blog/adblockers_performance_study.html

In your experience, what was the biggest challenge to be on par with Google?

From using Cliqz, I felt that the relevance of search results was fairly good.

The challenge I had in many cases was that the coverage was so much lower for uncommon search terms that the information I was looking for just wasn't there.

The Instant Answers on Google are also getting good to the point where sometimes I don't have to leave the SERP.