by bscphil 1290 days ago
I vouched for this comment because it's a good question, although I'm worried that your account is not long for this world with an inflammatory username like that.

The problem probably starts with the inefficiency of RoR, as you've guessed. Mastodon is a very dynamic site, which limits the amount of caching that can be done, and there are hot code paths, like filtering streams against a user's block lists and word filters, that are not particularly optimized. All of this happens in Ruby.
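To make the caching point concrete, here is a rough sketch of per-viewer stream filtering. It is not Mastodon's actual code (which is Ruby); `Post`, `Viewer`, `visible`, and `filter_stream` are hypothetical names, and the word matching is deliberately naive.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    text: str


@dataclass
class Viewer:
    # Hypothetical per-user settings: blocked accounts and muted words.
    blocked: set = field(default_factory=set)
    muted_words: set = field(default_factory=set)


def visible(post, viewer):
    """Return True if this viewer should see this post."""
    if post.author in viewer.blocked:
        return False
    # Naive whitespace tokenization; a real word filter would be more careful.
    words = post.text.lower().split()
    return not any(w in viewer.muted_words for w in words)


def filter_stream(posts, viewer):
    """Apply the viewer's block list and word filters to a stream of posts."""
    return [p for p in posts if visible(p, viewer)]
```

The result depends on the viewer as well as the post, so the filtered stream can't simply be cached once and served to everyone. The check runs for every (post, viewer) pair.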

But there are other inefficiencies, compared to SO:

1. Mastodon is a media heavy site, with a lot of uploading by users. Mastodon has to convert user-uploaded media to standardized representations (e.g. JPEG and h.264), which takes a lot of CPU time.

2. Mastodon has a "firehose" feed which is available in the UI and actually used by many users. Filters apply to the firehose feed as well. Obviously this requires quite a lot of bandwidth and processing.

3. Federation is a weakness when it comes to traffic. If user X has an account on server A, and at least one user on each of 1000 other instances follows user X, server A has to immediately send any posts to all 1000 other instances, regardless of whether anyone on the other end will ever deliberately view them. (Of course, some users may view them in their instance's firehose feed.) Each receiving instance then has to duplicate this traffic when sending it to the actual subscribed users. By this standard both large non-federated "servers" (like Twitter) and widely federated pull-only servers (think RSS) are more efficient than ActivityPub (the open standard Mastodon uses).

4. Federation is a weakness when it comes to trust. Instances do not (and must not) fully trust each other, except for things like "@x@thisinstance said 'P'". So for example, the little Open Graph based preview cards you're used to seeing on Twitter and elsewhere have to be generated for links per instance. The first time a Mastodon server sees a link, it must fetch that link and generate a preview card itself. Because new posts by popular accounts are syndicated immediately, this is a burden on websites as well. https://www.jwz.org/blog/2022/11/mastodon-stampede/ (note: copy link or disable sending referrers from HN for this site)

5. Scaling is not really a solved problem yet for Mastodon, because in practice it hasn't had to be. It's easy to pass the buck to instance operators, who end up needing a $20/month VPS to run a small instance rather than $5/month. Even the very biggest servers are scarcely larger than 1M users. At that kind of scale you can patch over performance problems by just throwing more hardware at the problem - and e.g. mastodon.social has the funds from Mastodon (the org) to do that. Note that Hachyderm, AFAIK, is an obvious example of this; it was started by a tech worker in Seattle with much better access to expensive hardware than most casual instance operators can dream of. It's not surprising that they can pull the funds together to scale up before they start seeing performance issues.
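The arithmetic behind points 3 and 4 can be sketched as follows. This is a back-of-the-envelope model, not a measurement: `fanout_for_post` and its input format are hypothetical, and it assumes one server-to-server delivery and one preview-card fetch per receiving instance.

```python
def fanout_for_post(followers_per_instance, has_link=False):
    """Estimate the work triggered by one post from user X on server A.

    followers_per_instance maps a remote instance name to the number of
    its users who follow X (an assumed input format for this sketch).
    """
    receiving = [n for n in followers_per_instance.values() if n > 0]
    return {
        # Server A pushes the post once to every instance with a follower.
        "server_to_server": len(receiving),
        # Each instance then relays it to its own subscribed users.
        "instance_to_user": sum(receiving),
        # Each instance fetches the link itself to build a preview card.
        "link_fetches": len(receiving) if has_link else 0,
    }
```

With the numbers from point 3, `fanout_for_post({f"inst{i}": 1 for i in range(1000)}, has_link=True)` gives 1000 server-to-server deliveries, 1000 instance-to-user deliveries, and 1000 fetches of the linked site for a single post.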

1 comments

In practice #3 is the only one that matters. For reducing dynamism/increasing caching potential, it would be fairly easy to run a fork of the site with the more dynamic features excised (donate $1 a month to get access to dynamic features like filtering). For media transcoding, that's a textbook case of a CPU-bound operation that you could offload to an isolated Rust component for CPU savings of 99% compared to Ruby (not an exaggeration). But the way federation traffic multiplies with the number of instances will still kill you despite all this, and needs to be addressed at the protocol level ASAP.
The media transcoding components aren't written in Ruby; Mastodon shells out to ffmpeg or ImageMagick.
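For illustration, "shelling out" looks roughly like this. It is a minimal sketch in Python rather than Ruby, with hypothetical function names; the flags are a bare-minimum H.264 invocation, not the options Mastodon actually passes.

```python
import subprocess


def build_h264_command(src, dst):
    """Build an ffmpeg command line that re-encodes src as H.264 into dst."""
    # -i names the input file; -c:v libx264 selects ffmpeg's H.264 encoder.
    return ["ffmpeg", "-i", src, "-c:v", "libx264", dst]


def transcode(src, dst):
    """Run ffmpeg as a child process; needs the ffmpeg binary on PATH."""
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_h264_command(src, dst), check=True)
```

The encoding work happens inside the ffmpeg process. The calling language only builds the argument list and waits, so rewriting the caller in Rust would not change the CPU cost of the transcode itself.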