Hacker News
by jwalton 774 days ago
> A link preview with an image is over 100 MB?

I think what’s being implied here is that when you share a link on Facebook, Facebook accesses the page to generate a link preview, so it downloads a tiny bit of HTML and an image. But when you share a link on Mastodon, that link immediately gets propagated to many other Mastodon servers, which then propagate it to others, so suddenly many thousands of Mastodon instances are simultaneously downloading a little bit of HTML and an image. The cumulative effect in this instance was 100 MB over a minute or two.

It does seem like a typical static website ought to not have a problem serving that, especially if it’s behind Cloudflare. It seems odd that a single EC2 instance would have a hard time serving that.

But given more than one person is complaining about it, it also seems like each Mastodon instance could very easily delay propagation of the story by a few minutes to soften the blow here.


> each Mastodon instance could very easily delay propagation of the story by a few minutes to soften the blow

I liked that idea at first glance, but thinking about it, CDN performance would actually be better with a single huge burst than if they were smeared out (assuming a very short max-age so the site can be updated rapidly).
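To put rough numbers on that, here is a minimal sketch that counts origin fetches for one URL behind a single idealised cache. Everything in it is an assumption for illustration: the 3000 instances, the 5 second max-age, and especially the origin answering instantly, so the cache is already warm when the second request arrives. Times are integer milliseconds to avoid float rounding.

```python
def origin_fetches(arrival_times_ms, max_age_ms):
    """Count origin fetches for one URL behind one idealised cache.

    Assumes the origin responds instantly and the cached copy is reused
    until exactly max_age_ms after it was fetched.
    """
    fetches = 0
    expires_at = None  # time at which the cached copy goes stale
    for t in sorted(arrival_times_ms):
        if expires_at is None or t >= expires_at:
            fetches += 1                  # miss: go to the origin
            expires_at = t + max_age_ms   # the fresh copy is cached from now
    return fetches


# Hypothetical traffic: 3000 instances each fetching the preview once.
burst = list(range(3000))                  # one request per ms, all within 3 s
smeared = [i * 100 for i in range(3000)]   # one request per 100 ms, over 5 min

burst_fetches = origin_fetches(burst, max_age_ms=5000)
smeared_fetches = origin_fetches(smeared, max_age_ms=5000)
print(burst_fetches, smeared_fetches)
```

Under these assumptions the burst costs one origin fetch and the smeared traffic costs sixty, because the short max-age keeps expiring between requests. Either number is tiny for an origin; the sketch only shows why the hit rate favours the burst.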

It's actually more complicated. With HTTP you don't know whether requests can be coalesced until you receive the response headers, specifically Cache-Control and Vary. So if your website takes a few seconds to respond, most CDNs will send every single request in that period through.

In theory a CDN could optimistically coalesce requests then re-send them when the headers of the first one return. But this is very complex and rarely done in practice.

This can also occur any time the cache goes stale and needs to be refetched.

> most CDNs will send every single request in that period through.

I don't think this is true. It certainly isn't for any CDN that I've worked for or on.

Cloudflare don't do this either - they use a cache lock - the first request basically acts as a blocker for all the others, leaving the other requests waiting for the response (if it's cacheable they serve that response, if not then they proceed to origin).
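A minimal asyncio sketch of that cache-lock behaviour, to make the two branches concrete. This is not Cloudflare's implementation; the `CacheLock` class, the `fetch_origin` callback and the 1000-request demo are all hypothetical names and numbers chosen for illustration.

```python
import asyncio


class CacheLock:
    """Toy cache: the first miss for a key blocks later requests for it."""

    def __init__(self, fetch_origin):
        # fetch_origin is a hypothetical async callable:
        # key -> (body, cacheable)
        self._fetch_origin = fetch_origin
        self._cache = {}
        self._in_flight = {}  # key -> Future resolved by the first request

    async def get(self, key):
        if key in self._cache:
            return self._cache[key]

        pending = self._in_flight.get(key)
        if pending is not None:
            # Someone is already fetching this key: wait for their answer.
            cacheable = await pending
            if cacheable and key in self._cache:
                return self._cache[key]
            # Not cacheable, so this request proceeds to the origin itself.
            body, _ = await self._fetch_origin(key)
            return body

        # First request for this key: take the lock and go to the origin.
        pending = asyncio.get_running_loop().create_future()
        self._in_flight[key] = pending
        cacheable = False
        try:
            body, cacheable = await self._fetch_origin(key)
            if cacheable:
                self._cache[key] = body
            return body
        finally:
            # Release the waiters, telling them whether the cache was filled.
            del self._in_flight[key]
            pending.set_result(cacheable)


async def demo(cacheable):
    origin_calls = 0

    async def fetch_origin(key):
        nonlocal origin_calls
        origin_calls += 1
        await asyncio.sleep(0.05)  # pretend the origin is slow
        return f"preview for {key}", cacheable

    cache = CacheLock(fetch_origin)
    await asyncio.gather(*(cache.get("/story") for _ in range(1000)))
    return origin_calls


cacheable_calls = asyncio.run(demo(cacheable=True))
uncacheable_calls = asyncio.run(demo(cacheable=False))
print(cacheable_calls, uncacheable_calls)
```

With a cacheable response the origin sees a single request for the whole burst. With an uncacheable one, every waiter still ends up at the origin, just delayed by the first round trip.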

It's normally configurable, but most sane CDNs do have it enabled by default, precisely because big bursts tend to be sharp in nature and a cache miss can be origin breaking at that point.

Just for completeness's sake, Nginx's HTTP proxy module can do it too (the setting's proxy_cache_lock) though it is off by default there.