by jd007 3235 days ago
This post comes at an interesting time for us. Our product has a real-time chat component, which is currently done over XMPP. We implemented it using custom components on top of Tigase, a Java-based XMPP server.

Due to a variety of reasons our chat services overall have fallen out of shape and we are in the process of considering a full rewrite. Part of the consideration is ditching XMPP altogether. It seems to me that for products where chat is only one of the features, XMPP feels like an awkward add-on, requiring its own set of IDs and auth protocols.

AFAIK many popular chat services these days are no longer based on XMPP, for one reason or another (e.g. scalability). I would like to hear any recommendations on the best ways to build real-time chat services nowadays.

2 comments

If you're making a service that's primarily chat, just use XMPP (disclaimer: I used to work for a large chat system based on XMPP and got involved with the protocol development when I did so). It scales very well (if it worked for Google Talk and HipChat…), but it does require that you understand the protocol. Network protocols have tons of subtle edge cases that will cause problems; inventing your own is just a bad idea, even if the existing ones require a bit more work to learn and get going with.
Did you guys write your own implementation or use an existing server?
We wrote our own server based on Python/Twisted (which has an XMPP implementation in Twisted Words). The Twisted implementation of XMPP is a bit outdated now, but it wasn't too terrible to work with. There may be more up to date libraries; in Python land aioxmpp seems nice.
Thanks for the info!
The main alternative is probably Matrix currently (disclaimer: I work on it). The advantages are that it's a pretty simple HTTP API, and there are good SDKs for JS, iOS & Android. The biggest disadvantage is probably that the UI SDKs (matrix-react-sdk, matrix-ios-kit and matrix-android-sdk) have ended up quite entangled in the Riot client codebase, and we need to decouple them so that they can be better used to implement custom chat/collaboration components.
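To give a feel for what "a pretty simple HTTP API" means in practice, here is a minimal sketch of how a text message could be sent to a room through the Matrix client-server API. It only builds the request rather than sending it. The function name `build_send_message_request`, the homeserver URL, the room ID and the token are all hypothetical, and the `v3` path prefix is an assumption about the spec version in use (older deployments exposed the same endpoint under `r0`).

```python
import json
from urllib.parse import quote


def build_send_message_request(homeserver, room_id, txn_id, text, access_token):
    """Describe the HTTP request that sends a plain-text message to a room.

    Nothing is sent over the network; the caller can hand the returned
    dict to whatever HTTP client it prefers.
    """
    # Room IDs contain '!' and ':', so they must be percent-encoded in the path.
    path = "/_matrix/client/v3/rooms/{}/send/m.room.message/{}".format(
        quote(room_id, safe=""), quote(txn_id, safe="")
    )
    return {
        "method": "PUT",
        "url": homeserver.rstrip("/") + path,
        "headers": {
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        # m.text is the basic message type for plain text.
        "body": json.dumps({"msgtype": "m.text", "body": text}),
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    request = build_send_message_request(
        "https://example.org", "!abc:example.org", "txn1", "hello", "TOKEN"
    )
    print(request["method"], request["url"])
```

The transaction ID (`txn1` here) is chosen by the client, so retrying the same request does not produce a duplicate message.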
Last I checked, the Synapse homeserver was heavy on resource consumption and therefore tough to run at home on, say, a RPi. How much progress has been made towards more usable homeservers?
I have a Synapse running (among other things) on a 1/1 VM. It's at the top of `top` (heh) when sorting by memory:

  PID USER      PR  NI    VIRT    RES %CPU %MEM     TIME+ S COMMAND
  486 synapse   20   0  510,1m  81,7m  0,0  8,2   3:03.73 S python2.7
  ...
There are six user accounts on that Synapse and little activity (fewer than 100 messages per day).

When I first installed it to play around with it, I joined #matrix:matrix.org and it brought Synapse to a grinding halt. Took quite some time to catch up with the thousands of peers it suddenly had to federate with, and I ultimately cleared Synapse's database to make it forget about all these peers.

For comparison, I also have a Prosody (i.e. an XMPP server) on the same box, which has about 10 user accounts and my own account is joined to a few moderately busy MUCs, so the number of messages is at least one order of magnitude higher. Yet memory usage is about one order of magnitude smaller (CPU usage is too infrequent to compare, but look at the "TIME" column for an initial comparison):

  $ psgrep 'prosody\|synapse'
  USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
  synapse    486  0.1  8.1 522332 83664 ?        Ssl  Aug13   3:04 /usr/bin/python2.7 -m synapse.app.homeserver --config-path=/etc/synapse/homeserver.yaml --log-config=/etc/synapse/log_config.yaml
  prosody    574  0.0  1.0  52408 10476 ?        S    Aug13   0:27 lua5.1 /usr/lib/prosody/../../bin/prosody
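For anyone who wants the comparison as numbers, here is a small sketch that parses the two `ps` lines above and computes the memory and CPU-time ratios. The helper name `parse_ps` is made up for illustration, and the parsing assumes exactly the column layout shown in this listing (RSS in the sixth column, TIME as `minutes:seconds` in the tenth).

```python
PS_OUTPUT = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
synapse    486  0.1  8.1 522332 83664 ?        Ssl  Aug13   3:04 /usr/bin/python2.7 -m synapse.app.homeserver --config-path=/etc/synapse/homeserver.yaml --log-config=/etc/synapse/log_config.yaml
prosody    574  0.0  1.0  52408 10476 ?        S    Aug13   0:27 lua5.1 /usr/lib/prosody/../../bin/prosody
"""


def parse_ps(text):
    """Return {user: {"rss_kib": int, "cpu_seconds": int}} for each process row."""
    rows = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Split into at most 11 fields so the command line stays in one piece.
        fields = line.split(None, 10)
        minutes, seconds = fields[9].split(":")
        rows[fields[0]] = {
            "rss_kib": int(fields[5]),
            "cpu_seconds": int(minutes) * 60 + int(seconds),
        }
    return rows


if __name__ == "__main__":
    rows = parse_ps(PS_OUTPUT)
    synapse, prosody = rows["synapse"], rows["prosody"]
    print("RSS ratio: %.1f" % (synapse["rss_kib"] / prosody["rss_kib"]))
    print("CPU time ratio: %.1f" % (synapse["cpu_seconds"] / prosody["cpu_seconds"]))
```

On these figures Synapse uses roughly 8 times the resident memory and roughly 7 times the accumulated CPU time of Prosody, which is a bit under a full order of magnitude.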
Synapse does a lot more than Prosody though. If a Prosody server hosting a MUC goes down, the MUC is down. If the Matrix homeserver that created a room goes down, the room's still there as long as there are other servers participating. That decentralization does have its tradeoffs.

It's also the first Matrix server, so there's that. Its replacement is in progress: https://github.com/matrix-org/dendrite

Oh nice. That should have a much smaller footprint indeed, both in terms of CPU usage and # of dependencies. (Not entirely sure about memory usage, but that should also be better for a compiled language.)
Yup, Synapse is heavy (as the rest of the thread says). This is mainly due to the DB schema being a bit naive, and it caching everything in RAM to speed things up. There are also some operations it does which spike RAM usage (which is then never reclaimed, thanks to Python2's malloc being a bit dumb). As others have said, Synapse is doing a lot more than something like Prosody. Dendrite on the other hand should be good for running on an RPi - we should have an idea this week, when the first monolithic Dendrite binary is due to land.