I have a memory of some chat site, maybe discord, sending attachments to a different server, thus exchanging the bandwidth problem with extra system complexity
That's how I'd solve the problem. The added complexity isn't even that high, give the application an endpoint to push an attachment into a distributed object store of your choice, submit a message with a reference to the object and persist it the moment the chat message was sent. This could be done with mere bytes for the message itself and some very dumb anycast-to-s3 services in different data centers.
I'm sure I'm skipping over tons of complexity here (HTTP keepalives binding clients to a single attachment host for example) because I'm no chat app developer, but the theoretical complexity is still relatively low.
I'm sure I'm skipping over tons of complexity here (HTTP keepalives binding clients to a single attachment host for example) because I'm no chat app developer, but the theoretical complexity is still relatively low.