Hacker News
by danudey 1097 days ago
Text and links, but then also recommendation algorithms, spam filtering, image hosting, lightning-fast caching systems so that your main feed doesn't take hours to load fetching row by row from MySQL, geographically distributed data centres for redundancy and locality.

Then there's the ad engine, which requires user data harvesting, and all of that analytics and analysis and machine learning, and that all gets expensive, so you have to do more of it and do it better so that you can make more ad dollars so you can afford to do more of it and do it better so that.... and so on and so on.

You could just charge users, of course, but if you charge users then you hamstring your organic growth, so you have to find a way to only charge some users but charge them significantly more. Even if Twitter only costs $1/user/mo to run, how many of those users will pay? You need to charge 1% of users $100/mo, or 0.1% of users $1000/mo, which means you need to offer them something tangible for their money, but none of these sites can really think of anything tangible to offer their users that's worth paying for, so they're stuck with ad revenue and...
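For anyone who wants to play with those numbers, here is a rough sketch of the break-even arithmetic. The $1 per user per month cost is the figure from the comment above; the function name and the list of paying fractions are made up for illustration, and it assumes every non-paying user still costs the same to serve.

```python
def required_price(cost_per_user: float, paying_fraction: float) -> float:
    """Monthly price each paying user must cover so revenue equals total cost.

    cost_per_user: monthly cost to serve one user (paying or not).
    paying_fraction: share of all users who pay, between 0 (exclusive) and 1.
    """
    if not 0 < paying_fraction <= 1:
        raise ValueError("paying_fraction must be in (0, 1]")
    # Total cost is cost_per_user * N; revenue is price * paying_fraction * N.
    # Setting them equal and cancelling N gives the price below.
    return cost_per_user / paying_fraction


if __name__ == "__main__":
    # Illustrative fractions: everyone pays, 1% pay, 0.1% pay.
    for fraction in (1.0, 0.01, 0.001):
        price = required_price(1.0, fraction)
        print(f"{fraction:.1%} of users paying -> ${price:,.2f}/mo each")
```

The point of the sketch is just that the price scales inversely with the share of payers, so every tenfold drop in paying users means a tenfold jump in what each of them has to be worth.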

Yeah, it's a gigantic mess.

1 comments

If you're a non-profit then you don't need a recommendation engine or an ad engine. You also don't need to self-host video and images, at least initially. Reddit, for the longest time, relied on Imgur and embedded video players until they built their own infrastructure. Also, thanks to advances in AI, there is an opportunity for AI moderators for content curation. As for caching and web scale in general, you don't need a full server farm initially. Even Google started with a single server rack.
Google's 1st server was housed in a cabinet of Lego bricks. No server rack.

https://scx2.b-cdn.net/gfx/news/2011/googlegrewfr.jpg

Google's first server farm consisted of row after row of regular desktop machines on the floor. Also no server racks.

But sooner or later, you're going to need one of these: https://www.google.ca/about/datacenters/

The thing I would live in constant fear of if I were to host a Mastodon server is that sooner or later, one of your users' posts is going to end up being linked to in a Wired feature article, or a New York Times article, or go viral for any of a billion reasons, and your server is going to end up getting hammered with 10,000 requests a second for a week or a month, which means you're going to be facing a sudden unexpected $3,000 AWS bill. (Not sure what it would actually cost. Anyone?)

I've seen what being linked to in a Wired article does to a web server. It's ugly.
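On the "what would it actually cost" question, a back-of-envelope sketch at least shows the scale. Only the 10,000 requests a second and the one-week duration come from the comment above. The average response size and the per-GB transfer price below are placeholder assumptions, not real AWS pricing, and the sketch ignores compute, caching, and any CDN in front of the server.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def spike_estimate(requests_per_second: float,
                   duration_seconds: float,
                   avg_response_kb: float,
                   price_per_gb: float) -> tuple[float, float, float]:
    """Return (total requests, egress in GB, egress cost) for a traffic spike.

    avg_response_kb and price_per_gb are caller-supplied assumptions;
    substitute real measurements and the provider's published rates.
    """
    total_requests = requests_per_second * duration_seconds
    # Decimal units: 1 GB = 1,000,000 KB.
    egress_gb = total_requests * avg_response_kb / 1_000_000
    return total_requests, egress_gb, egress_gb * price_per_gb


if __name__ == "__main__":
    # 10,000 req/s for one week, per the comment. 20 KB per response and
    # $0.05 per GB are hypothetical placeholders chosen for illustration.
    requests, gb, cost = spike_estimate(10_000, SECONDS_PER_WEEK, 20, 0.05)
    print(f"requests: {requests:,.0f}")
    print(f"egress:   {gb:,.0f} GB")
    print(f"cost:     ${cost:,.2f} (transfer only, under the assumptions above)")
```

Whatever numbers get plugged in, a week at that rate is about six billion requests, so the bill is dominated by how many bytes each response carries and how much of it a cache can absorb.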