by james412 2099 days ago
Was worried about CF getting their claws dug into archive.org, but on reading, this is a decidedly non-evil deal; actually, it sounds wonderful. Still, I worry there might be some unseen long-term interest in the archive.

Never forget Dejanews

6 comments

Keep in mind how Cloudflare makes most of their money: They sell a web proxy service with security and performance features including a CDN. Cloudflare's interests are furthered by improving that service in ways that help its customers. Keeping the Web Archive healthily stocked with content is aligned with their long term revenue growth.
At T+10 years, I very much expect Cloudflare's core business to have expanded significantly. I remember that time my Googler friend told me they were about to release the one thing they'd absolutely never do. Chrome came out a few weeks later. Now look at Firefox.

You need to pay attention to the silent positioning of these companies to even guess at where they might go, so deals with things like archive.org may have some unseen substance to them that might only become obvious much later

As a business they absolutely are not going to stay in the CDN lane as a primary.

Akamai has $3b in sales and an $18b market cap.

Cloudflare has $348m in sales and a $10.8b market cap.

Akamai is their maximum ceiling if they focus primarily on the CDN segment. Cloudflare is rapidly approaching their valuation ceiling if they stick to CDN as their core (and they'd have to start killing Akamai just to get there; the CDN business is increasingly a slower growth segment in the larger cloud industry).

Companies all around them in the cloud are growing faster, yet few are more important than Cloudflare. Zero question Cloudflare will continue to aggressively branch out, leveraging their critical positioning. In the not-so-distant future CDN will not be the center of their business. CDN is and will remain a springboard for them, a gateway drug, milk at the back of the grocery store.

Akamai is not their ceiling because Akamai doesn't serve all segments of the market.

I'm fairly critical of Cloudflare for a lot of reasons, but one thing I think they did right was focus on the SMB market with plans that were actually affordable to the average business. They targeted customers that companies like Akamai pretended didn't exist. Even now they have the cheapest plan available, and once they consolidate the market even further they can start raising those prices.

Akamai is their ceiling in CDN because they have a much higher value segment of the business, representing a drastically larger share of all dollars in the CDN space. Their business is nine times the size of Cloudflare because their customers are far more lucrative.

If Cloudflare holds onto all of their already considerable number of customers, and then kills Akamai and somehow takes all of Akamai's business, the combination will be a mere 10% larger than Akamai already is now. There is your general indie ceiling in action, with all segments combined (and Cloudflare isn't going to monopolize the entire CDN business besides).

All you need to know to spot the independent CDN ceiling is that Cloudflare + Fastly + Akamai = $3.6 billion in sales (with the understanding that it's a slowly increasing ceiling, as the CDN market is still growing). The ceiling in that space for Cloudflare just can't realistically be much larger than that combined group and that's not much larger than where Akamai is already at. The only way this isn't the case, is if you project Cloudflare knocks off most competitors and takes the market (they can't, Amazon, Microsoft, Google among other giants, are standing in the way of that outcome).
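For anyone who wants to check the ceiling arithmetic above, here is a rough sketch using only the sales figures quoted in this thread. Fastly's sales are not stated anywhere here, so the Fastly number is inferred as the remainder of the quoted $3.6 billion total, which is an assumption; the function names are made up for illustration.

```python
# Sales figures in millions of USD, as quoted in the thread.
AKAMAI_SALES = 3000.0
CLOUDFLARE_SALES = 348.0
COMBINED_INDIE_SALES = 3600.0  # Cloudflare + Fastly + Akamai, as quoted


def size_ratio(larger: float, smaller: float) -> float:
    """How many times bigger `larger` is than `smaller`."""
    return larger / smaller


def uplift_over(base: float, added: float) -> float:
    """Fractional increase over `base` if `added` were absorbed into it."""
    return added / base


ratio = size_ratio(AKAMAI_SALES, CLOUDFLARE_SALES)
uplift = uplift_over(AKAMAI_SALES, CLOUDFLARE_SALES)
# Assumption: Fastly is whatever is left of the quoted combined total.
implied_fastly = COMBINED_INDIE_SALES - AKAMAI_SALES - CLOUDFLARE_SALES

print(f"Akamai / Cloudflare: {ratio:.1f}x")
print(f"Akamai + Cloudflare vs. Akamai alone: +{uplift:.1%}")
print(f"Implied Fastly sales: ${implied_fastly:.0f}m")
```

The exact ratio comes out a little under nine and the uplift a little over 10%, so the round numbers in the comment hold up as approximations.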

It'll take Cloudflare a small lifetime to get to $3 billion in sales in the CDN space at the rate they're growing (they're adding ~$8m-$10m per quarter in growth (all of which obviously isn't CDN), so maybe it'll only take a few decades with some compounding). It took Akamai 22 years to get there with very high value customers and a pretty nice open field for many of those years.
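A back-of-the-envelope sketch of that timeline, under one possible reading of the numbers: quarterly revenue starts at a quarter of the quoted $348m and rises by a fixed $8m to $10m every quarter, with no compounding. That interpretation is an assumption, as is the helper name, and the result covers all revenue rather than just CDN.

```python
def quarters_to_reach(target_annual: float, start_annual: float,
                      quarterly_step: float) -> int:
    """Count quarters until the annualized run rate (4 x quarterly revenue)
    reaches the target, assuming quarterly revenue grows by a fixed
    dollar step each quarter (linear growth, no compounding)."""
    quarterly = start_annual / 4.0
    quarters = 0
    while quarterly * 4.0 < target_annual:
        quarterly += quarterly_step
        quarters += 1
    return quarters


# Figures in millions of USD, taken from the thread.
for step in (8.0, 10.0):
    q = quarters_to_reach(3000.0, 348.0, step)
    print(f"+${step:.0f}m per quarter: {q} quarters (~{q / 4:.1f} years)")
```

Under this linear reading the answer lands at roughly two decades, the same order of magnitude as the estimate above; a different reading of "per quarter in growth" would change the figure.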

Akamai in absolute dollar terms is growing faster than Cloudflare + Fastly combined. The CDN ceiling is actually running away from Cloudflare at present. That shouldn't be happening.

Cloudflare knows full well CDN isn't their brightest business future. It's why so much of their expansion effort is going into everything else. Given the way they price-structured their CDN from day one, Cloudflare has always known CDN was a lure and the upside was in sprawling outward from it. Come for the CDN, stay for the workers or whatever preferably higher margin thing we can sell you on. It's also why they're not interested in / worried about trying to make money on domain registrations, as with SSL before that. They'll happily murder the margins in foreign services all day long (areas where they don't compete, but there is margin to wipe out cost effectively, and with customers to lure in), so long as they can occasionally launch a new service where they have a distinct advantage and can convert their base to use it and increase total revenue per customer in the process.

What would be a better path: if Cloudflare could own a big part of Akamai's CDN business by trying to aggressively climb up the ladder from an unassailable price-value position Akamai doesn't want to go down to, like an ARM eating an Intel from the feet upward; or just leave the snoring giant alone to keep snoozing in his enterprise tower while Cloudflare busies itself sprawling out in many directions, leveraging the volume of customers that Akamai doesn't want to (and/or can't) go after because they're not viewed as lucrative enough? I think what Cloudflare can find outside of the CDN business is likely to be more valuable than what's inside the CDN business, very long-term speaking.

And if you're Akamai and you let Cloudflare get far enough along with that sprawling (likely already too late), how about if they drop your CDN legs out from under you. Cloudflare builds out many other legs to stand on, so they flip the switch on the margin and kill the CDN market for the independents, as they were willing to do with domains and SSL. Free CDN, all tiers, all features. They can't do that today, they might be able to do it tomorrow. The CDN market becomes the SSL market, and as a totally free lure it accelerates a rush into Cloudflare's other more exclusive services (including for larger, lucrative enterprise customers). Surely this switch has been pondered inside of Cloudflare, road-mapped as a potential.

> As a business they absolutely are not going to stay in the CDN lane as a primary.

Yeah, and all the big five Cloud vendors: AWS, Azure, GCP, IBM, Oracle all have their own CDN solutions bundled. Hard to make a case to purchase separate CDN solutions.

I'm not sure about all the providers but Amazon's CloudFront CDN product has additional costs, so it's "bundled" but not in the sense that it's free, only that it's integrated.

And one of Cloudflare's selling points imo is the multi-cloud customers. Use AWS all the way but Cloudflare as your CDN and you could switch to GCP seamlessly. Or route traffic based on pricing etc. I think you're right that they absolutely will branch out from CDN (and already have), but I think their CDN product is actually compelling, especially to bigger companies that are more afraid of Amazon than they are of Cloudflare.

(Other interesting point - it's worth noting that IBM's CDN is essentially white labeled Cloudflare).

Great comment. Cloudflare is not a CDN. They are an edge computing platform that happens to offer CDN services. Could Akamai grow into that market faster than Cloudflare can consume it? TBD.
Edge computing is super interesting, and today's CDN providers should be able to provide it given their current infrastructure deployment. It could really bring in the next era of computing and technology once certain networks/providers reach critical mass to provide edge services within 5-10ms to customers.
If jgrahamc is reading this, I'd really like to know if Cloudflare wants to work with telcos.

Imagine a small server in every cell tower, with locally-cached maps/Wikipedia/latest movies.

Some communication couldn't be cached (e.g. real-time video calls), but a lot of broadcast media could be. Of course there are copyright implications, and it might require partnering with Netflix or others.

The quick load times would be great for users, and the reduced load on the backbone would be good for the telecom companies.

If you'd like me to chat to some friends in telcos in New Zealand about this, drop me an email. It's not my job now (I'm in IoT) but I know who to talk to if you'd like to get this kind of thing moving.

They control SSL decryption for a massive number of websites. Governments will gladly fund Cloudflare for eternity.
If your root CA is subject to the laws of a government that can take the root certificates and MITM the connection with those root CAs that's not much better. Cloudflare just makes it easier.
Certificate Transparency makes this significantly harder to do stealthily. I’m not convinced that Cloudflare is a deep state operation either, but Cloudflare's ability to secretly MITM is a position afforded to a select few, and certainly not every CA.
It's much easier (and virtually undetectable) to MITM when you are also the reverse proxy though.
Akamai as well then?
Much of the US Government already uses it, so yes.
More like a web blocker "service". It is profoundly unhelpful to me that a proxy service cares if I have Javascript disabled in my browser.
That’s the website that has manually enabled a feature if it requires Javascript. Cloudflare does not require Javascript out of the box.
Please clarify. I thought all those captcha puzzles were coming from Cloudflare. Are you saying they are only enabled if the destination page has JS?
I believe GP is referring to a setting that a cloudflare user has to flip for requiring visitors to enable JavaScript
What worries me is that Cloudflare is deanonymizing a huge number of Tor users, and the issue that comes with it is that a large share of Tor users actually need access to the web archive due to country-wide DNS censorship (European countries included).

As Cloudflare is deanonymizing Tor users on pretty much every website that's hosted on it, I fear they are abusing that power once again to deanonymize users of the web archive.

Cloudflare always claims it's not their issue and that it's a webmaster setting with the shitty captchas and Google's infamous Prism-sponsored PREFS cookie - but to be honest they should just not have implemented it in the first place if privacy was a core value of their company.

The "DDoS" protection basically fingerprints a machine and user inside an encrypted HTTPS connection; which makes the encryption tunnel itself obsolete.

Not long ago, CF was blocking access from Tor. And they sometimes block access from my web crawler. I don't like CF, as they act as police or gatekeeper to the origin website, deciding who to penalize and who not to, while pretending to be speeding up websites and protecting from 'threats'.
They’re acting more as a security guard. Which is to say that they’re intentionally employed by the owner of the property you’re trying to enter. Often specifically to “bounce” users like you, malicious or not. Believe it or not there are legitimate reasons for wanting only real human users on your website!
> while pretending to be speeding up websites and protecting from 'threats'.

They do though. That's why people pay them lots of money to do those two things. Not sure what part you think is "pretending"?

One of the first 100 people to use cloudflare when it launched.

Paying them today to speed up a couple of websites while protecting them.

They rock at making big things possible for very small companies.

Hey, me too! Do you have the first-users t-shirt?
I don't know really. Cloudflare is notoriously in conflict with different archive sites and now this announcement makes that sound not too credible.

I think we will see selective removal of certain content.

> Was worried about CF getting their claws dug into archive.org

SAME. From the title, I assumed the Wayback Machine would be using Cloudflare. Nice prank, boys.

When users are used to this (getting redirected to an archived copy when the site is down/not available) and when this trial balloon has been proven to work, Cloudflare will replace archive.org with their own infrastructure. This is the common game plan.
Uh, no. We're literally doing the opposite. We used to have our own caching infrastructure for "Always Online" and we're getting rid of it and using archive.org instead.
Thanks, so maybe this page is outdated where it mentions your own crawler with user-agent? Or does the Internet Archive use it for these crawls? https://www.cloudflare.com/always-online/
How do you handle robots.txt? The previous incarnation of Always Online didn't care about robots.txt, while archive.org does.
https://blog.cloudflare.com/cloudflares-always-online-and-th...

We tell archive.org about the URI, they crawl it. They handle robots.txt.

archive.org doesn't handle robots.txt in any meaningful way (see my comment above at https://news.ycombinator.com/item?id=24516875 ). If that's changed recently, I'd like to know more.
Note that archive.org stopped respecting robots.txt in 2017. [1]

In my experience, the site owner must email archive.org support to be excluded from its crawler and archiving.

[1]: https://boingboing.net/2017/04/22/internet-archive-to-ignore...
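For reference, this is roughly the check a crawler runs if it does respect robots.txt. It is a minimal sketch using Python's standard `urllib.robotparser`; the rules, the `ExampleArchiveBot` user agent and the `is_allowed` helper are all made up for illustration and say nothing about how archive.org or Cloudflare actually behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is shut out entirely,
# everyone else is only kept out of /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleArchiveBot", "https://example.com/page"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "OtherBot", "https://example.com/page"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "OtherBot", "https://example.com/private/x"))
```

A crawler that ignores robots.txt simply skips this check, which is why the site owner then has to ask for exclusion some other way.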

And thank god for it. Trying to explain to end users why their site was not, in fact, always online on account of the creaking behemoth that plodded along in IAD barely managing to successfully cache and serve anything ever was never any fun.

The original Always Online infra was long unloved and probably kept on life support far too long for lack of want to deprecate an early feature.

"We're literally doing the opposite."

How does what you do now contradict what you will do in the future? What legal assurances are there that you won't do that when you leave? (See Facebook/Oculus "no Facebook account promise")

Wait... so you think Cloudflare's master plan is to roll this new thing out to get people to accept it as normal, and then suddenly make a big shift to.... what they currently have?

Why don't they skip this step and just keep what they have now, then? No one seems to be up in arms that they currently provide their customers offline caching...

Doesn't CF already have an "Always Online" feature using their own infrastructure? So this seems like the opposite happening.