by phkamp 5080 days ago
No, servers should hold _all the state_ and clients none. If I add something to my shopping basket from my mobile phone, I want to be able to add more from my browser, rather than have two shopping baskets.

Servers storing stuff on the clients is just plain wrong, and it is wrong from every single angle you can view it: it's wrong from a privacy point of view, it's wrong from a cost-allocation point of view, it's wrong from an architecture point of view and it's wrong from a protocol point of view.

But it was a quick hack to add to HTTP in a hurry back in the dotcom days.

It must die now.


I too want to add the first item into the shopping basket from the laptop and the second from my tablet.

But this does not imply that the clients should store no state - it only implies that the state as perceived by the users needs to be the same. That is different from how it is implemented.

While we are on the topic of state, why do I have to subscribe to a service just to be able to add a bookmark on one device and use it on another ?

I view the two problems as congruent (except that the bookmark state is global, so there is no "server" to offload the state onto), but at the same time this difference highlights the assumption that there is "The Server" for the web app. What if there weren't ? Can we push the model a bit further and make it p2p? I am pretty sure that as homomorphic crypto advances, we will be able to do so even with untrusted peers. Then there's no "server" anymore to store the state in.

Then, you have the DoS bit. You are absolutely correct that the HTTP routers are the most loaded and hardest-to-scale element of the whole setup. If you offload the state onto the client, then you can "dumb down" the non-initial content-switching decisions by basing them on trustable client state.

So, I think that distributing the state is a good idea. What is limiting is the naive way of distributing the state, and this is where I agree with your assessment. And that's probably one of the things that would need to get fixed for something that is big enough to be called "2.0". (As a by-product, solving the above would also solve the endpoint identity/address change survivability problem.)

>No, servers should hold _all the state_ and clients none.

That's the opposite of the general consensus in the webdev-community.

Client-state is not only vastly more efficient in many cases but it also usually leads to cleaner designs and easier scaling.

Many of the modern desktop-like webapps would be outright infeasible without client-state. What's your response to that, should we just refrain from making such apps in the browser?

>If I add something to my shopping basket from my mobile phone, I want to be able to add more from my browser

And at the same time you probably appreciate when on your slow mobile-link the "add-basket" operation happens asynchronously, yet doesn't get lost when you refresh the page at the wrong moment.

I'm a bit confused here. You know better than most how critical latency is to the user-experience. Saving on server-roundtrips or hiding them is a big deal.

Yet you promote this dogma without providing an alternative solution to this dilemma.

Cleaner for who ? Easier for who ?

That kind of cookie usage just makes it Somebody Else's Problem instead of your problem.

>Cleaner for who ? Easier for who ?

For the webapp-developer, which results in a faster and cheaper experience for the user.

I'm still baffled at your persistence given you sit pretty much at the source. You have probably written VCLs for sticky sessions yourself and pondered the constraints wrt data-locality and single points of failure? Sticky sessions are just not a good design when the alternative is so readily available; it's the first time in a long time I've heard anyone disagree with that.

>That kind of cookie usage just makes it Somebody Else's Problem instead of your problem.

And who would that "somebody else" be?

Users certainly don't care about a few hundred extra bytes that their browser sends with each request, especially since that trade-off usually makes their browsing faster than the alternative would be.

The privacy concern is valid but boils down to developers using cookies wrong (without encryption). If we were to remove all technologies that are used wrong by incompetent developers then the internet would be a pretty empty place.

Clients can't be completely stateless; at the very least they need to pass along a key to identify their server-side state. That's what cookies do now (among other things) and it sounded to me like that's what you were proposing for the session/identify facility. I agree with you on that point; a specific feature in the protocol would be better than the generic cookie feature, given the ways cookies have been abused.

What's your opinion on IndexedDB and other local storage mechanisms? I believe that single-page-apps are overused, but I do think that they have their niche and standards for storing data locally are valuable and necessary. In my own work I'd use that space as a cache rather than permanent storage, just like I'd use something like memcached on the server side to reduce database queries.
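To make the "cache rather than permanent storage" idea concrete, here is a minimal sketch in Python rather than browser JavaScript. Everything in it is an assumption for illustration: `LocalCache` and `fetch_from_server` are hypothetical names, and the dict stands in for whatever local storage mechanism is actually available. The point is only that wiping the local copy costs a round trip, not data.

```python
import time


class LocalCache:
    """Treats local storage as a disposable cache in front of an authoritative source."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # called on a miss; stands in for a server request
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]      # fresh hit: no round trip needed
        value = self._fetch(key) # miss or stale entry: ask the authoritative source
        self._entries[key] = (now + self._ttl, value)
        return value

    def clear(self):
        # Losing the cache loses nothing; the source still has the data.
        self._entries.clear()


calls = []


def fetch_from_server(key):
    # Hypothetical stand-in for a network request; records each call.
    calls.append(key)
    return f"value-for-{key}"


cache = LocalCache(fetch_from_server)
cache.get("profile")   # first access goes to the source
cache.get("profile")   # served from the local copy
cache.clear()          # e.g. the user wipes local storage
cache.get("profile")   # refetched, nothing was lost
```

The same shape applies to memcached in front of a database: the cache can be dropped at any time because it is never the only copy.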

But what about preferences for anonymous users? Store that on the server side? Append them to the URL? Both options kinda suck.

Also, consider dabblet. The way it allows you to store your stuff using github is very smart IMHO.

Store it on the server: The user-agent gives you a session-id to use as key.

It may be that session-keys should tell if they are anonymous or if they represent (locally) authenticated users, but that's a very complex subject I won't claim to have a clear opinion of yet.
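As a rough sketch of what "the session-id is the key" could look like, assuming an in-memory dict in place of a real database and assuming both devices somehow present the same id (which in practice implies some form of login or identity). `SessionStore` and its methods are hypothetical names, not part of any proposal in this thread.

```python
import secrets


class SessionStore:
    """In-memory server-side session state, keyed by an opaque session id."""

    def __init__(self):
        self._sessions = {}

    def new_session(self):
        # The id carries no meaning of its own; it is only a lookup key.
        session_id = secrets.token_urlsafe(16)
        self._sessions[session_id] = {"basket": []}
        return session_id

    def add_to_basket(self, session_id, item):
        self._sessions[session_id]["basket"].append(item)

    def basket(self, session_id):
        # Return a copy so callers cannot mutate the stored state directly.
        return list(self._sessions[session_id]["basket"])


store = SessionStore()
sid = store.new_session()
store.add_to_basket(sid, "book")        # e.g. sent from the phone
store.add_to_basket(sid, "headphones")  # e.g. sent from the desktop browser
print(store.basket(sid))                # ['book', 'headphones']
```

The client holds nothing but the key; how long the server keeps the entry, and what it costs to keep it, is then entirely the server's decision.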

Store the settings of anyone who ever connected? For how long? Forever, just in case? Silly. And why do you even assume the server has to have a database? Why should it be required to have one, why should it have to store the stuff? What is your take on statelessness? You concentrate so much on the abuses of cookies and client side storage/computation, but you're not addressing the advantages. I doubt you're aware of them to be honest.

Uhm, isn't that how it works today ? Do you care about how many metric shitloads of storage your cookies take up on clients' disks ? Shouldn't you ?

Putting the cost of storage where the decision to store is made is sound economic practice.

My Cookies directory is 11 MB. That's actually quite a lot, considering the length of the average cookie, but my disk is 256 GB and it's only gotten that big because I've been browsing for years literally without ever clearing my cookies and I can clear them at any time.

This is really a non-issue.

It is the user's session data. If it is stored on their end, they can choose how long they wish to store it for, and delete it any time they like.

User-agent and the other bits duly noted by http://panopticlick.eff.org/ are much, much, much worse than cookies. Cookies you can erase. User-agent and other "fingerprints" are with you forever. And they travel with you no matter where you are.

So, while you would dismiss the "privacy hazard" that the cookies are, you replace it with something much worse.

You can still have the cookie concept, and have the session id be a random number each time someone opens the site in a tab. The cookie can hold those preferences, and the session id can be used for session stuff. As a bonus, you can then load the cookie only on the first page load, and keep the values in a cache associated with the browser's random session number, saving on data transfer and losing nothing. And for those that don't need cookies, they get a big win in terms of privacy.

OK, so if I grok the idea correctly, it is something like "send the cookie-like data from the client only on the first GET, if you are doing it over a single HTTP/1.1 TCP connection". That makes sense, and could easily be made into an extension to HTTP/1.1 (though it creates a dependency between the different GET requests): have the server send an "X-Dont-Send-Me-More-Cookies-in-this-TCP: yes!" header, and make the compliant clients react to it.
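To check that I have the mechanics right, here is a toy simulation of that extension. It is only a sketch: no real HTTP is involved, `ServerConnection` and `Client` are hypothetical classes, and the header is the made-up one from this comment, not anything a real server or browser understands.

```python
HEADER = "X-Dont-Send-Me-More-Cookies-in-this-TCP"  # made-up header from this comment


class ServerConnection:
    """Server side of one persistent connection; remembers the first cookie it sees."""

    def __init__(self):
        self._cookie = None

    def handle(self, path, headers):
        if "Cookie" in headers:
            self._cookie = headers["Cookie"]
        # Later requests on this connection reuse the remembered value.
        body = f"{path} served with cookie={self._cookie}"
        return {HEADER: "yes!"}, body


class Client:
    """Client that stops sending the cookie on a connection once asked to."""

    def __init__(self, cookie):
        self._cookie = cookie
        self._suppressed_on = set()  # connections that asked us to stop
        self.cookie_bytes_sent = 0

    def get(self, conn, path):
        headers = {}
        if conn not in self._suppressed_on:
            headers["Cookie"] = self._cookie
            self.cookie_bytes_sent += len(self._cookie)
        response_headers, body = conn.handle(path, headers)
        if response_headers.get(HEADER) == "yes!":
            self._suppressed_on.add(conn)
        return body


conn = ServerConnection()
client = Client("prefs=dark-mode")
first = client.get(conn, "/a")   # cookie travels with this request
second = client.get(conn, "/b")  # cookie omitted, server uses the remembered value
```

This also shows the dependency mentioned above: the second GET only makes sense to the server because it saw the first one on the same connection, so a new connection has to start over with the full cookie.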

What I do not understand is where the win on the privacy front is here. You send the random ids - but the site owner will re-correlate these random IDs with your identity. So, you would not win anything here - or, what am I missing ?

My take on the privacy:

There is no problem with someone collecting a bunch of info about me and using it to improve their services.

There is a little bit of a problem with someone collecting a bunch of info about me and another million people and keeping that in a big blob.

There is a big problem when that someone gets hacked and this bunch of info about another million people gets to the bad kids.

It's the centralization of a lot of data that is bad for the privacy.

Store the data locally on the clients and give it to the server only when it is contextually needed. E.g. my shipping address: I am happy for my browser to supply it to you from my local storage every time you want to ship me something. I am very happy if you do not store this address and sell it to someone who will later send snail-mail spam to me. Or store it without due diligence ('cos time to market and all that), then get hacked, and then I find myself "having paid" for helicopter spare parts.

Of course, this would hurt the nouveau business models that treat the users as a product. And will make the analytics harder - because one would not be able to just run a select... But to me it could be a useful tradeoff.

(Above, I use the term "client" to refer to the collective set of the devices that are "mine". As I wrote in another reply, storing the state on the client does not imply a difference in the user-seen behavior, so the shopping cart should survive.)

Of course, keeping the data decentralized on your computer is super secure, this is why botnets logging users data never got beyond theory. It is also why phishing was a clever idea but never panned out, people only would send data to the right recipients. </snark>

Sure, centralized data sounds big and scary, because a single security instance loses a million people's data in one go, but how is it any different from a million security instances in a virus losing "only" 1 person's data?

Similarly, I don't understand how it is remotely feasible to think that storing your shipping address on your computer vs on a site that is shipping you stuff changes things -- I mean, they still have to get your address to send you the stuff you ordered. It is a fundamental requirement of shipping. Address is not a private bit of info.

Fingerprinting will be around, so it is probable that there will still be tracking. Can't beat that right now, so let's not conflate that with other problems. Instead let's look at the problems that are solved: cookies store data to make it easy to not just correlate and be probably right about the user, but be perfect. Further, they can be hijacked and otherwise stolen and used by malicious third parties, giving data beyond just the access patterns to the site in question. Session ids can be engineered to not have this inherent problem, cutting down information leakage. Further, I imagine plugins that will keep track of your worst data offenders, and force a new session id every request from them, making the data tracking and correlation even more difficult.

It isn't an all or nothing game, even if you get rid of the low-hanging-fruit abuses, it is a win. Yes, new stuff will come along, but that doesn't mean we shouldn't try, particularly when the current scenarios allow all the bad stuff you can think of, but easier.

>Servers storing stuff on the clients is just plain wrong, and it is wrong from every single angle you can view it

Not that it is surprising given the source, but this "my opinion is objectively correct" nonsense isn't constructive. Client side sessions give you stateless servers, which allows real seamless fail-over. Having to run a HA session-storage service to get that is a big additional cost. "PHK said it is right" doesn't provide sufficient benefits to overcome that downside.
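For anyone who has not seen client side sessions done this way, here is a minimal sketch using only the Python standard library. It assumes every server shares the same secret (`SECRET_KEY` is a hypothetical demo value), and it signs the session with an HMAC rather than encrypting it, so the user can read the contents but cannot alter them undetected. Hiding the contents would need real encryption on top, which is not shown here.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-shared-by-all-servers"  # hypothetical; load from config in practice


def encode_session(data, key=SECRET_KEY):
    """Serialize the session and append an HMAC tag so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode())
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + tag


def decode_session(cookie, key=SECRET_KEY):
    """Return the session dict, or None if the signature does not match."""
    payload, _, tag = cookie.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_session({"basket": ["book"]})
session = decode_session(cookie)          # any server holding the key can do this
tampered = decode_session("x" + cookie)   # altered payload fails verification
```

No server keeps anything between requests, so whichever machine picks up the next request can verify the cookie on its own, which is where the fail-over argument comes from.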

They are not wrong because I say so, they are wrong because they are wrong.

When it gets to the point where the EU regulates something, the way they did with cookies, it should be painfully obvious to even the most casual observer that there is something horribly wrong with it.

As for the cost of your HA session-storage ? Cry me a river! You're the one making the money, you're the one who should carry the cost.

The EU did not start regulating cookies because your data could theoretically be leaked from your computer via cookies or something like that. They did it because cookies are used to track people, which is no different from a hypothetical session identifier, except that hypothetical browser controls could be added which are already fully possible with cookies.

"Cry me a river" is no more compelling than "I am right because I say so". I am making money? I didn't realize my free site that I pay hosting expenses for out of my pocket was making me money. When can I expect my check?

You haven't offered any reason why anyone would want to move from client side sessions to server side sessions. If you want to effect change, you need to provide a reason for change, not just condescending nonsense.