by ay 5079 days ago
ok, so if I grok the idea correctly it is something like "send the cookie-like-data from the client only on the first GET, if you are doing it over HTTP/1.1 single TCP connection" - that makes sense (and could be easily made into an extension to HTTP/1.1 - [though it creates the dependency between the different GET requests] - have the server just send an "X-Dont-Send-Me-More-Cookies-in-this-TCP: yes!" header, and make the compliant clients react to it).
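
A rough sketch of the client side of that extension, to make the per-connection dependency concrete. The header name is the one floated above; `ConnectionCookiePolicy` and its methods are hypothetical names, and the sketch assumes one policy object is created per TCP connection and thrown away when the connection closes.

```python
# Sketch only: no real HTTP stack involved, just the per-connection decision.
SUPPRESS_HEADER = "X-Dont-Send-Me-More-Cookies-in-this-TCP"


class ConnectionCookiePolicy:
    """Tracks, for one TCP connection, whether cookies should still be sent."""

    def __init__(self, cookie):
        self.cookie = cookie
        self.suppressed = False

    def request_headers(self):
        # Attach the Cookie header until the server has asked us to stop.
        return {} if self.suppressed else {"Cookie": self.cookie}

    def observe_response(self, headers):
        # HTTP header names are case-insensitive, so compare lowercased.
        lowered = {name.lower() for name in headers}
        if SUPPRESS_HEADER.lower() in lowered:
            self.suppressed = True


policy = ConnectionCookiePolicy("sid=abc123")
first = policy.request_headers()       # first GET carries the cookie
policy.observe_response({SUPPRESS_HEADER: "yes!"})
second = policy.request_headers()      # later GETs on this connection do not
```

A new connection would start with a fresh policy object, so the cookie goes out again on its first GET.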

What I do not understand is where the win on the privacy front is here. You send the random ids - but the site owner will re-correlate these random IDs with your identity. So, you would not win anything here - or, what am I missing?

My take on the privacy:

There is no problem with someone collecting a bunch of info about me and using it to improve their services.

There is a little bit of a problem with someone collecting a bunch of info about me and another million people and keeping that in a big blob.

There is a big problem when that someone gets hacked and this bunch of info about another million people gets to the bad kids.

It's the centralization of a lot of data that is bad for the privacy.

Store the data locally on the clients and give it to the server only when it is contextually needed. e.g.: my shipping address. I am happy for my browser to supply it to you from my local storage every time you want to ship me something. I am very happy if you do not store and sell this address to someone who will later send snail-mail spam to me. Or store it without the due diligence ('cos time to market and all that) and then get hacked and then I find myself "having paid" for the helicopter spare parts.

Of course, this would hurt the nouveau business models that treat the users as a product. And will make the analytics harder - because one would not be able to just run a select... But to me it could be a useful tradeoff.

(above, I use the term "client" to refer to the collective set of the devices that are "mine". As I wrote in another reply, storing the state on client does not imply the difference in the user-seen behavior, so the shopping cart should survive).

1 comments

Of course, keeping the data decentralized on your computer is super secure, this is why botnets logging users' data never got beyond theory. It is also why phishing was a clever idea but never panned out, people would only send data to the right recipients. </snark>

Sure, centralized data sounds big and scary, because a single security instance loses a million people's data in one go, but how is it any different from a million security instances in a virus losing "only" 1 person's data?

Similarly, I don't understand how it is remotely feasible to think that storing your shipping address on your computer vs on a site that is shipping you stuff changes things -- I mean, they still have to get your address to send you the stuff you ordered. It is a fundamental requirement of shipping. Address is not a private bit of info.

Fingerprinting will be around, so it is probable that there will still be tracking. Can't beat that right now, so let's not conflate that with other problems. Instead let's look at the problems that are solved: cookies store data to make it easy to not just correlate and be probably right about the user, but be perfect. Further, they can be hijacked and otherwise stolen and used by malicious third parties, giving data beyond just the access patterns to the site in question. Session ids can be engineered to not have this inherent problem, cutting down information leakage. Further, I imagine plugins that will keep track of your worst data offenders, and force a new session id every request from them, making the data tracking and correlation even more difficult.
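
To make the plugin idea a bit more concrete, here is a minimal sketch. It assumes the client is the one that picks the session id, which the thread does not spell out, and `SessionIdChooser` plus the host names are made up for illustration.

```python
import secrets


class SessionIdChooser:
    """Hypothetical plugin logic: stable ids for most hosts, throwaway ids for offenders."""

    def __init__(self, offenders):
        self.offenders = set(offenders)
        self._stable_ids = {}

    def id_for(self, host):
        if host in self.offenders:
            # Fresh random id on every request, so requests cannot be
            # linked to each other through the session id alone.
            return secrets.token_urlsafe(16)
        # Everyone else keeps one id per host so normal sessions still work.
        if host not in self._stable_ids:
            self._stable_ids[host] = secrets.token_urlsafe(16)
        return self._stable_ids[host]


chooser = SessionIdChooser(offenders=["tracker.example"])
shop_a = chooser.id_for("shop.example")
shop_b = chooser.id_for("shop.example")        # same id as shop_a
track_a = chooser.id_for("tracker.example")
track_b = chooser.id_for("tracker.example")    # different id from track_a
```

This only removes the session id as a correlation handle. Fingerprinting, as said above, is a separate problem.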

It isn't an all-or-nothing game. Even if you only get rid of the low-hanging-fruit abuses, it is a win. Yes, new stuff will come along, but that doesn't mean we shouldn't try, particularly when the current scenarios allow all the bad stuff you can think of, but easier.

re. snark: phishing: it is not the physical user that has to input the data. Think of how you use the password manager. botnets: yes, but since I keep my computing devices clean, I was never a victim of a botnet. While my account info was stolen from one of the online sites, with zero influence. See where the difference is ?

The difference is that the decentralized approach would put more control in the hands of the user (so they either take care themselves or hire someone to take care for them). If they want to.

"Address is not a private bit of info" - it's person and context dependent. Some people consider their name a private bit of info in some contexts... And yes you have to send the shipping info to the remote party to ship you stuff. But they do not have to keep it neatly packed one select away.

I still have a difficulty understanding how the "random session-id" will solve the problem of privacy. All I can see happening is one more level of indirection, that will cause the creation of the frameworks to re-collate this back. Because this is a functionality that is needed by the developers. And once you have the commonly available code, you're back to previous stage - except with an additional pile of code to debug.

I'm not saying all of this because I think we should stop trying. It's just that I can't see how the cost of uplifting the entire internet infra (the code required for this functionality will surely be much more storage than the cookies over my lifetime) and the cost of having the programmers support both models for the good chunk of future (hello, IE6 users, I am looking at you! :-) justifies the incremental feeling of security that this gives.

edit: re. sending the data to the trusted server: sign with your client key a "request for data" together with the manifest of the addresses that the server can plausibly have. Then when the server needs the data it can present this request to your UA and get the data. Yes, the server can be hacked and this data can be siphoned off. But then the attackers get the [timespan of the breach] worth of user data, and not the entire DB.
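
A very rough sketch of that flow, under loud assumptions: it uses an HMAC with a key that never leaves the client as a stand-in for "sign with your client key" (a real design would more likely use an asymmetric signature), it reads the "manifest" loosely as the server name plus the fields it may ask for, and every name in it (`issue_grant`, `release_data`, `shop.example`) is hypothetical.

```python
import hashlib
import hmac
import json
import secrets

# Assumption: this key stays on the user's device and is never sent anywhere.
CLIENT_KEY = secrets.token_bytes(32)


def issue_grant(server, fields):
    """Client side: produce a signed 'request for data' the server may keep."""
    manifest = json.dumps({"server": server, "fields": sorted(fields)}, sort_keys=True)
    tag = hmac.new(CLIENT_KEY, manifest.encode(), hashlib.sha256).hexdigest()
    return manifest, tag


def release_data(manifest, tag, local_store):
    """Client side: hand over data only if the presented grant verifies."""
    expected = hmac.new(CLIENT_KEY, manifest.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return None  # forged or tampered grant, release nothing
    wanted = json.loads(manifest)["fields"]
    return {name: local_store[name] for name in wanted if name in local_store}


store = {"shipping_address": "1 Example St", "phone": "555-0100"}
manifest, tag = issue_grant("shop.example", ["shipping_address"])
released = release_data(manifest, tag, store)
```

The server keeps only `manifest` and `tag`, not the address itself, so a breach yields grants plus whatever data passed through during the breach window.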

> re. snark: phishing: it is not the physical user that has to input the data. Think of how you use the password manager. botnets: yes, but since I keep my computing devices clean, I was never a victim of a botnet. While my account info was stolen from one of the online sites, with zero influence. See where the difference is ?

No, I don't see the difference at all. So you got lucky, and didn't have your computer targeted early on by a 0-day virus. Congrats, I'm sure your luck will keep up forever.

> I'm not saying all of this because I think we should stop trying. It's just that I can't see how the cost of uplifting the entire internet infra (the code required for this functionality will surely be much more storage than the cookies over my lifetime) and the cost of having the programmers support both models for the good chunk of future (hello, IE6 users, I am looking at you! :-) justifies the incremental feeling of security that this gives.

Now you are treating the security benefit as the sole benefit of session ids. There are other benefits. Read the article, there are benefits to "http routers" that would come from it. Look at my comment history, I mention a couple (cache locality benefits from routing, ability to standardize login stuff and use http auth reasonably again, without reinventing the wheel every site/framework). Others have mentioned other benefits. The incremental security benefit is but one of these.

> I'm not saying all of this because I think we should stop trying. It's just that I can't see how the cost of uplifting the entire internet infra (the code required for this functionality will surely be much more storage than the cookies over my lifetime) and the cost of having the programmers support both models for the good chunk of future (hello, IE6 users, I am looking at you! :-) justifies the incremental feeling of security that this gives.

This is a strawman. Yes, there are still places on legacy systems, but more and more are adopting systems that allow standards-based approaches and faster upgrade cycles (a la adopting Chrome or Firefox), and there is no reason to doubt this trend will continue.

> edit: re. sending the data to the trusted server: sign with your client key a "request for data" together with the manifest of the addresses that the server can plausibly have. Then when the server needs the data it can present this request to your UA and get the data. Yes, the server can be hacked and this data can be siphoned off. But then the attackers get the [timespan of the breach] worth of user data, and not the entire DB.

This looks to be a usability nightmare. Further, at best it is no better a solution than the one I presented - an incremental change that requires lots of code. As soon as this starts happening in a widespread way, the attack patterns will change from server hacking to browser hacking in a serious way. Or finding ways to hack the HTTP gateways where SSL is dropped, and which are frequently appliances that are harder to monitor for security. Or there will be more phishing attacks using sophisticated key-stealing techniques to get real credentials. Or DNS attacks. Or as plug devices get super cheap, piles of MITM attacks on places with wifi, or or or... security is always incremental.