Hacker News
by jayd16 1592 days ago
>backend out-scales all other multiplayer servers

Can you explain what you mean here? What was your peak active user count, what was the peak per server instance, and why do you think that beats anything else?

Agreed, I'm curious as well. We load tested with real clients driven by faux users, up to 1 million concurrent, and only stopped at 1 million because the test was becoming cost-prohibitive.
The data is here: http://fuse.rupy.se/about.html

It's under Performance. Per watt, the fuse/rupy platform completely crushes all competition for real-time action MMOs, for two reasons:

- Event-driven protocol design, averaging about 4 messages/player/second (which means you cannot do spraying or headshots, for example, which in my game design opinion is another feature).

- Java's memory model with atomic concurrency parallelism over shared memory which needs a VM and GC to work (C++ copied that memory model in C++11, but it failed completely because C++ lacks both a VM and a GC, yet that model is still the one C++ uses to this day). You can read more about this here: https://github.com/tinspin/rupy/wiki

These keep the internal latency of the server below roughly 100 microseconds at saturation, which no C++ server can handle even remotely, unless it copies Java's memory model and adds a VM + GC so that all cores can work on the same memory at the same time without locking!

You can argue those points are bad arguments, but if you look at performance per watt with some consideration for developer friendliness, I'm pretty sure in 100 years we will still be coding minimalist JavaSE (or some copy without Oracle) on the server and vanilla C (compiled with a C++ compiler, gcc/cl.exe) on the client to avoid cache misses.

Energy is everything!

> - Java's memory model with atomic concurrency parallelism over shared memory which needs a VM and GC to work

Do you have a link that explains this bit?

Not other than the one linked in the comment above. I have been reaching out to EVERYONE, and nobody can explain this to me, but I'll implement it myself soon so I can explain it.

The links upthread don't actually explain why a VM + GC can do shared-memory concurrency faster[1].

I don't understand what particular piece of magic makes shared-memory concurrency under a VM+GC faster than a CAS implementation.

[1] I'm assuming a shared-memory threaded model of concurrency, not a shared-nothing message passing model of concurrency.

CAS?
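
For anyone else wondering: CAS is compare-and-swap, the atomic primitive that lock-free code is usually built on. You read a value, compute a new one, and write it back only if nobody changed the value in the meantime; otherwise you retry. Here is a minimal Java sketch using the standard `AtomicInteger.compareAndSet`. The class and method names (`CasSketch`, `casIncrement`, `runThreads`) are made up for illustration, and this only shows what the primitive does. It says nothing either way about whether a VM + GC makes it faster.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasSketch {
    // Increment with an explicit CAS retry loop:
    // read the current value, compute the next one, and
    // publish it only if the counter still holds what we read.
    static int casIncrement(AtomicInteger counter) {
        while (true) {
            int seen = counter.get();
            int next = seen + 1;
            if (counter.compareAndSet(seen, next)) {
                return next;
            }
            // Another thread updated the counter first; loop and retry.
        }
    }

    // Several threads hammer the same counter without any lock.
    static int runThreads(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    casIncrement(counter);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // No increments are lost, so this prints 400000.
        System.out.println(runThreads(4, 100_000));
    }
}
```

C++11 exposes the same primitive as `compare_exchange_weak` / `compare_exchange_strong` on `std::atomic`, so the question upthread is really about what a VM + GC adds on top of it.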

Me neither, but I know it does in practice.

My intuition tells me the VM provides a layer decoupled from the hardware memory model so that there is less "friction", and the GC is required to reclaim shared memory that C++ would need to "stop the world" to reclaim anyhow! (All concurrent C++ objects leak memory, see TBB concurrent_hash_map for example.) That means the code executes slower BUT the atomics can work better.

As I said, for 5 years I have been searching for answers from EVERYONE on the planet and nobody can answer. My guess is that this is so complicated that only a handful can even begin to grok it, so nobody wants to explain it because it wastes a lot of time.

The usual reaction is: Java is written in C, so how can Java be faster than C? Well I don't know how but I know it's true because I use it!

So my answer today is: Java is faster than C if you want to share memory between threads directly and efficiently, because you need a VM with GC to make the Java memory model (which everyone has copied so I guess it must be good?) work!

Here is someone who knows his concurrency and made C++ maps that might be better than TBB btw: https://github.com/preshing/junction

But no guarantees... you never get those with C/C++. I stopped downloading C/C++ code from the internet unless it has 100+ proven users! So stb/ttf and kuba/zip are my only dependencies.

This is fantastic information.