

by toumhi 5650 days ago

That's desired behavior, although much more difficult in practice than in theory. If 20% of your network goes down and you can still serve clients normally, it means that you keep a big reserve of machines that are useful only in case of big outages. I don't know if you can justify that economically.
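To put a rough number on that reserve, here is a back-of-the-envelope sketch. It assumes identical machines and evenly spread load, which is my simplification, and the function name and the 1000-machine figure are made up for illustration.

```python
def machines_to_provision(needed: int, failure_percent: int) -> int:
    """Fleet size such that `needed` machines survive the loss of
    `failure_percent` percent of the fleet.

    Assumes identical machines and evenly spread load (illustrative only).
    """
    if not 0 <= failure_percent < 100:
        raise ValueError("failure_percent must be in [0, 100)")
    # Integer ceiling division, so no floating-point rounding is involved.
    return -(-(needed * 100) // (100 - failure_percent))


needed = 1000  # hypothetical: machines required to serve normal load
total = machines_to_provision(needed, 20)
reserve = total - needed
print(total, reserve)
```

Under those assumptions, surviving a 20% loss takes 25% more machines than normal load needs, not 20%, because the loss is taken from the larger fleet.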

You can also degrade gracefully: reject new client connections, progressively disconnect some clients, accept some loss of consistency, etc. It depends on how far you can go without infuriating your customers.
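A minimal sketch of the first two options, rejecting new connections and then progressively disconnecting clients. The class name, thresholds and batch size are all hypothetical, not what we actually run.

```python
from dataclasses import dataclass


@dataclass
class DegradationPolicy:
    """Toy load-shedding policy; every number here is an assumption."""
    capacity: int            # connections the surviving machines can hold
    reject_pct: int = 90     # stop admitting new clients at 90% of capacity
    shed_batch: int = 50     # disconnect at most this many clients per step

    def admit(self, active: int) -> bool:
        """Accept a new connection only while below the reject threshold."""
        return active < self.capacity * self.reject_pct // 100

    def to_disconnect(self, active: int) -> int:
        """Clients to drop in this step: the excess over capacity,
        capped by the batch size so the shedding is progressive."""
        excess = max(0, active - self.capacity)
        return min(excess, self.shed_batch)


policy = DegradationPolicy(capacity=1000)
print(policy.admit(899), policy.admit(900))
print(policy.to_disconnect(950), policy.to_disconnect(1200))
```

Capping each step with `shed_batch` is one way to avoid dropping a large group of clients at once and having them all reconnect together, which would itself look like a presence storm.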

We discovered that large-scale real-time systems (in our case, currently 400,000 concurrent connections) are really hard to stabilize against presence storms, network problems and buggy clients, among others.


jpablo 5650 days ago

> If 20% of your network goes down and you can still serve clients normally, it means that you have a big reserve of machines useful only in case of big outages. I don't know if you can justify it economically.

Just spin up more EC2 instances?


michaelbuckbee 5650 days ago

That's an interesting thought: in case of an outage, Skype could switch from user-supplied resources (supernodes eating users' bandwidth and processing) to emergency Skype-hosted supernode services.


toumhi 5650 days ago

Yes, if you use an elastic cloud, by all means, spin up more instances :-) Most existing companies still have real servers, however.


bonzoesc 5650 days ago

Most existing companies don't run P2P voice chat networks, either. Using EC2 or some other elastic cloud for emergency supernodes makes a lot of sense, since they can outsource the risk of those machines sitting idle to Amazon.


TrevorJ 5650 days ago

The "cloud" is still made up of real servers ^.^
