by jordo 1750 days ago
This is EXACTLY the issue we are currently running into while hosting our gameservers on Kubernetes via Agones (https://agones.dev/site/). While our gameserver processes don't really need a full core, they do need to be scheduled regularly to meet their tick simulation deadlines. A gameserver that ticks at 60 Hz absolutely needs CPU time at the very least every few milliseconds. Not getting scheduled back in time for a game tick means more input latency, and it wreaks havoc on game clients trying to adjust their prediction buffers so that their input arrives server-side just before the next server tick.

Bad, bad, bad scenario all around. A gameserver process really needs to get scheduled like a heartbeat. Ticks can't be late on the server, as that really screws everything else up big time. We've seen these "bursting" effects really start to take their toll, to the point that we're now just setting a request and limit of 1 CPU for each gameserver process (even though we could otherwise pack a lot more of them onto each node).
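To put a number on "late ticks", here is a minimal sketch of a fixed-rate loop that records how far past its ideal slot each tick actually woke up. The function names, the tick count, and the commented-out `simulate_tick()` call are all hypothetical, and this is only a rough way to observe scheduling delay from inside the process, not how any particular gameserver measures it.

```python
import time


def tick_lateness_ms(wake_times_s, start_s, hz):
    """Return how late each tick woke up relative to its ideal slot, in ms."""
    period_s = 1.0 / hz
    return [
        max(0.0, (woke - (start_s + (i + 1) * period_s)) * 1000.0)
        for i, woke in enumerate(wake_times_s)
    ]


def run_ticks(hz=60, ticks=30):
    """Sleep until each tick's ideal deadline and record the actual wake time."""
    period_s = 1.0 / hz
    start_s = time.monotonic()
    wake_times_s = []
    for i in range(ticks):
        # Deadlines are derived from the start time so errors do not accumulate.
        deadline_s = start_s + (i + 1) * period_s
        remaining_s = deadline_s - time.monotonic()
        if remaining_s > 0:
            time.sleep(remaining_s)
        wake_times_s.append(time.monotonic())
        # simulate_tick() would run here (hypothetical game-server work).
    return start_s, wake_times_s


if __name__ == "__main__":
    start, wakes = run_ticks(hz=60, ticks=120)
    late = tick_lateness_ms(wakes, start, hz=60)
    print(f"worst tick lateness: {max(late):.2f} ms")
```

Running this inside a CPU-limited container next to a busy neighbour should make the lateness spikes visible if the process is being descheduled across a tick boundary.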

2 comments

If you're using Linux, you could try the deadline scheduler policy (SCHED_DEADLINE, see sched(7) [1]), especially if the tick runtime can be estimated to an extent. Essentially, you set the task period (1/60 s, about 16.7 ms, in your case), the task runtime (a rough upper bound on the task's execution time) and a deadline, and the Linux scheduler will make sure your task gets its runtime before the deadline in every period. It will also throttle tasks that exceed their budget (by running over their allotted runtime), and it won't admit a task if it would be impossible to run it (because the sum of the tasks' runtimes would exceed the available CPU time).

[1] https://www.man7.org/linux/man-pages/man7/sched.7.html
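For illustration, a rough sketch of what requesting SCHED_DEADLINE for a 60 Hz tick could look like through the raw `sched_setattr` syscall, called via `ctypes`. Several things here are assumptions: the syscall number 314 is the x86_64 one and differs on other architectures, the 4 ms runtime budget is made up, and `deadline_attr` / `apply_deadline` are hypothetical helper names. The call normally needs elevated privileges and applies to the calling thread, and it may be refused inside a container depending on how the runtime is configured.

```python
import ctypes
import os

SCHED_DEADLINE = 6       # policy number from <linux/sched.h>
SYS_SCHED_SETATTR = 314  # assumption: x86_64 syscall number, differs elsewhere


class SchedAttr(ctypes.Structure):
    """Mirror of the original 48-byte struct sched_attr described in sched_setattr(2)."""
    _fields_ = [
        ("size", ctypes.c_uint32),
        ("sched_policy", ctypes.c_uint32),
        ("sched_flags", ctypes.c_uint64),
        ("sched_nice", ctypes.c_int32),
        ("sched_priority", ctypes.c_uint32),
        ("sched_runtime", ctypes.c_uint64),   # nanoseconds
        ("sched_deadline", ctypes.c_uint64),  # nanoseconds
        ("sched_period", ctypes.c_uint64),    # nanoseconds
    ]


def deadline_attr(hz, runtime_ns):
    """Build parameters for one tick per period, with deadline equal to the period."""
    period_ns = 1_000_000_000 // hz
    if not 0 < runtime_ns <= period_ns:
        raise ValueError("runtime must be positive and fit inside the period")
    return SchedAttr(
        size=ctypes.sizeof(SchedAttr),
        sched_policy=SCHED_DEADLINE,
        sched_runtime=runtime_ns,
        sched_deadline=period_ns,
        sched_period=period_ns,
    )


def apply_deadline(attr, pid=0):
    """Ask the kernel to admit the thread (pid 0 means the caller); raises OSError if refused."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.syscall(
        ctypes.c_long(SYS_SCHED_SETATTR),
        ctypes.c_long(pid),
        ctypes.byref(attr),
        ctypes.c_ulong(0),  # flags
    )
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


if __name__ == "__main__":
    attr = deadline_attr(hz=60, runtime_ns=4_000_000)  # 4 ms budget is a guess
    try:
        apply_deadline(attr)
        print("admitted under SCHED_DEADLINE")
    except OSError as exc:
        print(f"kernel refused the request: {exc}")
```

A refusal here is the admission control mentioned above (or a missing privilege), so the error path is worth handling rather than treating as fatal.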

>While our gameserver processes don't really need a full core, they do need to be scheduled regularly to meet their tick simulation deadlines.

I think you need a core. If you don’t have one, you’ll never know when the scheduler will run the process. The core scheduling feature seems like a good way to do this now.

We used to run HP-UX in “semi real time mode” for processes on tight timelines. The OS would let us put one process on a core that had interrupts disabled (only certain system calls were allowed). It worked well. Of course, if the process had an issue, the machine required a reboot.

HP-UX also had processor sets, which allowed us to tell the OS to run certain processes on certain cores. This seems similar to the “core cookie”, but I think the cookie has more to do with resource control.
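For comparison, the closest everyday Linux analogue to "run this process on that core" is probably CPU affinity rather than processor sets as such. A minimal sketch using Python's `os.sched_setaffinity` (Linux only); `pin_to_cpu` is a hypothetical helper name. Note that this only restricts where the process may run. It does not keep other tasks off that core.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict a process (pid 0 means the caller) to one CPU and return the new mask."""
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Pick a CPU we are already allowed to run on, so the call cannot be rejected
    # for naming a core outside the current mask.
    allowed = os.sched_getaffinity(0)
    print("pinned to:", pin_to_cpu(min(allowed)))
```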

This is really good progress.