In an era of unlimited scaling infrastructure, it's a shame they're struggling to capitalize on an exclusive Super Bowl-scale marketing event. It was predictable days in advance.
Days is not enough advance notice if, for example, they don't have a way to monetize the traffic. Unlimited scale = unlimited cost. Potentially the site crashing was the financially appropriate outcome.
Will people coming in from Twitter and co specifically to check in on Pelosi's plane have any interest in becoming subscribers for general flight tracking? I'd suspect the conversion rates would be (or are) extremely low with this kind of inorganic traffic.
Days in advance would have been enough to prepare a static page for SPAR19 alone, updated every minute or so. That would cost practically nothing, especially if served from CDN edges.
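To make the static-page idea concrete, here is a minimal sketch, not anything FlightRadar24 is known to run. The function names, the page layout, and the position fields are all made up for illustration; where the position data comes from is left out entirely.

```python
import html
import os
import tempfile
from pathlib import Path


def render_page(callsign, lat, lon, altitude_ft):
    """Render a tiny self-contained HTML page for one flight (hypothetical layout)."""
    return (
        "<!doctype html>\n"
        '<meta charset="utf-8">\n'
        '<meta http-equiv="refresh" content="60">\n'  # browser reloads once a minute
        f"<title>{html.escape(callsign)}</title>\n"
        f"<p>{html.escape(callsign)}: lat {lat:.4f}, lon {lon:.4f}, "
        f"altitude {altitude_ft} ft</p>\n"
    )


def publish_atomically(path, content):
    """Write to a temp file in the same directory, then rename it over the target.

    The rename means a web server or CDN origin fetch never sees a
    half-written page.
    """
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
        os.chmod(tmp_name, 0o644)  # mkstemp creates the file owner-only
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)
        raise
```

You would call publish_atomically(Path("SPAR19.html"), render_page(...)) from cron or a loop once a minute and let the CDN cache the result with a short TTL.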
I would be surprised if that's not already the architecture of a website like this, especially if keeping infrastructure costs low is a significant requirement for the business.
It's not unlimited scale, it was just over 700_000 people according to https://news.ycombinator.com/item?id=32320697. If your web page is 100 kilobytes, which seems like the maximum that would be reasonable, that's 70 gigabytes of traffic. AWS charges US$0.09 per GB so this is US$7. If you eliminate the cloud premium it's closer to US$1.
In fact loading their home page takes 3.1 MB for me over two minutes. Searching for a flight (AAL301 in this case since SPAR19 has already landed) brings this to 3.8 MB. At this size it would be 2_660 gigabytes, about US$240. Reloading the page, I see that this was about 180 HTTP hits, though a lot of those were ads (I guess they have a way to monetize the traffic) which were blocked by uBlock Origin.
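For anyone who wants to check the bandwidth arithmetic, here is a small sketch. The only inputs are the figures quoted above (700_000 views, the two page sizes, US$0.09 per GB); decimal gigabytes are an assumption, and the function names are just for illustration.

```python
VIEWS = 700_000          # page views, from the linked HN comment
AWS_USD_PER_GB = 0.09    # egress price quoted above


def transfer_gb(views, bytes_per_view):
    """Total transfer in decimal gigabytes (1 GB = 1e9 bytes, an assumption)."""
    return views * bytes_per_view / 1e9


def egress_cost_usd(views, bytes_per_view, usd_per_gb=AWS_USD_PER_GB):
    """Egress cost at a flat per-GB price, ignoring tiers and free allowances."""
    return transfer_gb(views, bytes_per_view) * usd_per_gb


if __name__ == "__main__":
    for label, size in [("100 kB page", 100e3), ("3.8 MB page", 3.8e6)]:
        print(f"{label}: {transfer_gb(VIEWS, size):,.0f} GB, "
              f"US${egress_cost_usd(VIEWS, size):,.2f}")
```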
But doesn't it take a lot of CPU and RAM to serve 700_000 page views, especially at 180 hits per page view? That's 126 million hits, after all! Well, of course you can write your code arbitrarily inefficiently. According to https://crozdesk.com/software/fastly/pricing Fastly charges US$0.0075 per ten thousand hits, so if you could serve all those hits from Fastly it would cost you just under US$100. And probably if you're getting 700_000 people looking at the same thing you should figure out how to make all those hits cacheable either in Fastly or in something slower but cheaper. This probably isn't the first popular flight on FlightRadar24, even if it's the first one that's this popular.
(Also, though, you probably don't need 180 hits to serve up a single page: one for HTML, one for JS, one for CSS, one for an icon sprite sheet, and maybe half a dozen map tiles. The cause of death was a self-inflicted wound.)
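The per-request side of the estimate can be sketched the same way. The per-hit price is the one quoted from the crozdesk page; the 10-hit "lean page" is just the list above added up (HTML, JS, CSS, sprite sheet, six tiles), and the function names are illustrative.

```python
PAGE_VIEWS = 700_000
FASTLY_USD_PER_10K_HITS = 0.0075  # price as quoted above, not independently checked


def total_hits(page_views, hits_per_view):
    return page_views * hits_per_view


def request_cost_usd(hits, usd_per_10k=FASTLY_USD_PER_10K_HITS):
    """Request charges only; bandwidth is billed separately."""
    return hits / 10_000 * usd_per_10k


if __name__ == "__main__":
    # 180 is what the page actually made; 10 is the lean page described above.
    for hits_per_view in (180, 10):
        hits = total_hits(PAGE_VIEWS, hits_per_view)
        print(f"{hits_per_view} hits/view: {hits:,} hits, "
              f"US${request_cost_usd(hits):,.2f}")
```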
What if we want to know the minimal CPU cost to serve up 126 million hits rather than the minimal dollar cost for someone else to serve them up for you? Well, one weekend a few years ago I wrote a static file HTTP web server called httpdito-386: http://canonical.org/~kragen/sw/dev3/server.s (docs in http://canonical.org/~kragen/sw/dev3/httpdito-readme). It's 710 lines of code and can handle 20_000-30_000 hits per second on my ten-year-old laptop (8 cores) and push about 1.8 gigabits per second of traffic. It's not the most efficient web server (it forks a new process for every connection and drops the connection after handling the first request) but it's probably adequate to get a ballpark figure.
Serving up 126 million hits with httpdito would take 84 minutes on my ten-year-old laptop, so probably you'd have needed 2-5 server machines, or one machine that wasn't ten years old. Serving up 70 gigabytes in a smaller number of hits would have taken 5 minutes.
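Those two durations fall out of a couple of divisions. In this sketch the 25_000 hits per second is my assumed midpoint of the 20_000-30_000 range, and 1.8 gigabits per second is the throughput figure quoted above; nothing here measures anything.

```python
def minutes_to_serve_hits(hits, hits_per_second):
    """Wall-clock minutes to serve a number of hits at a steady rate."""
    return hits / hits_per_second / 60


def minutes_to_push_bytes(total_bytes, gigabits_per_second):
    """Wall-clock minutes to push a byte count at a steady line rate."""
    return total_bytes * 8 / (gigabits_per_second * 1e9) / 60


if __name__ == "__main__":
    print(f"126M hits at 25k/s: {minutes_to_serve_hits(126_000_000, 25_000):.0f} min")
    print(f"70 GB at 1.8 Gbit/s: {minutes_to_push_bytes(70e9, 1.8):.1f} min")
```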
Of course, the whole point of FlightRadar24 is that it's giving you dynamically updated Comet information about where flights are, not just serving up precomputed files from the filesystem. You could implement this kind of functionality by polling, but using Comet would probably be more efficient. Maintaining 700_000 open connections is easily within the capacity of a single server today; we were doing several thousand on our Comet server at KnowNow in 02000, using what we called RUTH (Robert's Ugly Thttpd Hack), using select() on a 32-bit machine with a gigabyte of RAM and a gigahertz.
https://news.ycombinator.com/item?id=32319147 says, "Use one big server." The associated article https://specbranch.com/posts/one-big-server/ profiles the servers they use at Azure: two 64-core CPUs with a 2-2.5 GHz clock, 4-6 instructions per clock, 256 MiB (MB?) of L3 cache, and 1 TiB (TB?) of RAM. From the cloud pricing they're citing, buying one probably costs about US$15k, roughly the cost of one programmer-week. According to https://news.ycombinator.com/item?id=32321406, FlightRadar24's revenue in 02021 was US$25M, so this would be a little less than 6 hours of their revenue.
Serving the website's HTML, JS, CSS, images, etc. is only a small part of their overall hosting costs. Do they maintain backend services? An API gateway? A WAF? Logging? Analytics? State/database? Cache? Load balancers? And so on.
Your walltext paints an incredibly incomplete picture of their overall hosting expenditure. They aren't running a WordPress site lol.
Well, as I said, you can complicate things to an arbitrary extent and make them arbitrarily inefficient, and they clearly have done so because their site went down under the load of less than a million pageviews. But the essential part of the service is delivering some HTML, JS, CSS, and images, and updating the client webpages with new flight status information, and that doesn't require 3.1 megabytes, 180 hits, a WAF, an API gateway, etc.
There's no reason FlightRadar24 has to require as much horsepower as running a WordPress site, which involves interpreting PHP (inherently inefficient, throws away 95% of your CPU power in exchange for flexibility and easy end-user programmability) and accepting user comments from a substantial fraction of users. It does require maintaining hundreds of thousands of open connections for Comet, which WordPress doesn't, but that's a manageable problem ever since kqueue landed in FreeBSD and epoll landed in Linux. It's not 01999 anymore.
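To show how little code the "keep every connection open and push updates" part needs today, here is a minimal sketch using Python's asyncio, whose default event loop sits on top of epoll on Linux and kqueue on the BSDs. It is a toy line-based push protocol, not HTTP Comet and not how FlightRadar24 works; the Hub class and the demo are invented for illustration.

```python
import asyncio


class Hub:
    """Keeps every client connection open and pushes each update to all of them."""

    def __init__(self):
        self.clients = set()  # one asyncio.StreamWriter per connected client

    async def handle(self, reader, writer):
        self.clients.add(writer)
        try:
            await reader.read()  # returns only when the client disconnects
        except ConnectionError:
            pass
        finally:
            self.clients.discard(writer)
            writer.close()

    def broadcast(self, line):
        """Queue one line for every connected client; returns the client count."""
        data = line.encode() + b"\n"
        for w in list(self.clients):
            w.write(data)
        return len(self.clients)


async def demo(n_clients=50):
    hub = Hub()
    server = await asyncio.start_server(hub.handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    conns = [await asyncio.open_connection("127.0.0.1", port)
             for _ in range(n_clients)]
    while len(hub.clients) < n_clients:  # wait until all are accepted
        await asyncio.sleep(0.01)
    hub.broadcast("SPAR19 25.07 121.55 36000")
    lines = [await reader.readline() for reader, _ in conns]
    for _, writer in conns:
        writer.close()
    server.close()
    return lines


if __name__ == "__main__":
    received = asyncio.run(demo())
    print(len(received), "clients got", received[0])
```

An idle connection here costs a socket and a small coroutine, which is why hundreds of thousands of them fit on one machine once file descriptor limits are raised. A real service would also need backpressure for slow clients, which this sketch ignores.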
Let's do an estimate of database size. 100_000 flights a day means about 32768 flights at any given time. You might get an update on each of these flights once a minute, so maybe 720 updates per flight, maybe 16 kilobytes per flight. That's 512 megabytes for the entire database. Not only can you fit that in RAM now; you can fit that in RAM on a 286 from 01987.
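As a sanity check on the per-flight figure, here is a sketch with a made-up fixed-size record. The field list (timestamp, latitude, longitude, altitude, speed, heading) is my assumption about what a position update might carry, not FlightRadar24's schema; the flight and update counts are the ones above.

```python
import struct

# Hypothetical position record, little-endian, no padding:
# unix time (uint32), lat (float32), lon (float32),
# altitude in feet (int32), ground speed in knots (uint16), heading in degrees (uint16)
RECORD = struct.Struct("<IffiHH")

UPDATES_PER_FLIGHT = 720   # one per minute, from the estimate above
FLIGHTS = 32_768           # concurrent flights, from the estimate above


def pack_update(ts, lat, lon, alt_ft, speed_kt, heading_deg):
    return RECORD.pack(ts, lat, lon, alt_ft, speed_kt, heading_deg)


bytes_per_flight = RECORD.size * UPDATES_PER_FLIGHT
total_bytes = bytes_per_flight * FLIGHTS

if __name__ == "__main__":
    print(f"{RECORD.size} bytes/update, {bytes_per_flight:,} bytes/flight, "
          f"{total_bytes / 2**20:,.0f} MiB total")
```

With this record layout each flight stays under the 16 kilobytes budgeted above, so the whole table fits comfortably inside the 512-megabyte estimate.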
If the way you're accustomed to building websites results in websites that crash under light load, maybe you should consider doing it a different way rather than criticizing people who tell you there's a better way to do it.
Though I guess you missed it, I did talk about caches in my comment (edge caches with instant invalidation is the service Fastly provides), which was 626 words, less than three minutes of reading. Calling it a "walltext" makes me think you'd die of a heart attack if you ever saw a book.
Again, you are making massive simplifications. Flight information is only one dataset they manage. They also manage data on planes, their users' subscriptions, perhaps site analytics, etc. The cost of the storage goes beyond the disk size: you also need redundancy, you might have offline ETL jobs to enrich the data, etc. Quoting estimates of disk size and per-GB storage costs is not sufficient to summarize their costs.
Further to the point, your reply (to my comment) is not addressing my comment at all.
> unlimited traffic = unlimited cost
Supporting additional traffic does not come free. Sure, the traffic:cost ratio is not linear, but surely you are not claiming that supporting the additional traffic has no cost associated with it? Exactly how are you refuting my comment, if you are at all?
Site analytics can of course grow without bound, but collecting so much site analytics you crash your site? That's dumb. Design your site so that it sheds load by not collecting so much analytics if that's a problem. And load test it before it's the most popular source of information on an international diplomatic incident.
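Shedding analytics load can be as simple as a bounded buffer that drops events when it fills up instead of blocking the request path. This is a minimal sketch of that one idea; the class name and the limits are invented, and a real system would ship the drained batches somewhere.

```python
import queue


class AnalyticsBuffer:
    """Bounded buffer: when it is full, new events are dropped, never blocked on."""

    def __init__(self, max_pending=10_000):
        self._q = queue.Queue(maxsize=max_pending)
        self.dropped = 0  # how many events were shed under load

    def record(self, event):
        """Called on the request path; returns False if the event was shed."""
        try:
            self._q.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, limit=1000):
        """Called by a background worker; returns up to `limit` queued events."""
        batch = []
        while len(batch) < limit:
            try:
                batch.append(self._q.get_nowait())
            except queue.Empty:
                break
        return batch
```

Page serving never waits on analytics this way; the worst case under a traffic spike is a gap in the stats plus a dropped counter telling you how big the gap was.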
I never quoted any estimates of disk size. I linked a server with a terabyte of RAM. Are you seriously suggesting they might have more than a terabyte of data on their users' subscriptions and on planes? Offline ETL jobs? Come on, be serious.
Redundancy? Yeah, you should have two big servers, not just one. 12 hours of FlightRadar24's revenues.
Yeah, unlimited traffic would be unlimited cost, but this is not unlimited traffic, this is about US$240 of traffic (2_660 gigabytes at US$0.09 per GB) that should have been US$1 of traffic. If this were 01999, or if a billion people had swarmed their site instead of less than a million, you would have a point.
> That's 512 megabytes for the entire database. Not only can you fit that in RAM now; you can fit that in RAM on a 286 from 01987.
I'm an idiot: on a regular PC you can only fit that in RAM since 01999, not 01987. Not sure how I looked at "megabytes" and thought "kilobytes", since I'd just calculated it.
I agree you'd probably need more than a 4-euro VPS. Indeed, I said in my comment that my 8-core laptop wouldn't be enough; you'd need 2-5 of them, unless you were running more efficient web server software than that hack I wrote one weekend in under 1000 lines of code. (And you couldn't use it anyway; it only supports serving stuff from the filesystem, not Comet.)
Clearly the systems they use aren't that simple or they wouldn't have crashed under such a light load.
Do they really want lots of traffic? It's a very specialized website; it's not like the traffic will stick. The people who want to pay for their services are already customers. I doubt they care about the random Joe.
Their infrastructure might very well be set up in a way that can't scale.
They don't need unlimited scaling for the entire site, just enough to scale for one particular plane.
They make money from ads and subscriptions. Any such business pays for user acquisition and a % of users convert to subscriptions.
Even if all this is spiky traffic, their site has probably now had millions of first-time users. When traffic dies down, it will settle at a higher level than before.
Bank accounts ultimately put a hard cap on scaling infrastructure. Considering most folks use this service for free, I can't imagine they'd want to let those free users DDoS their bank account into bankruptcy.