by rsync 1730 days ago
This is interesting ...

For the longest time we tried to convince people that they should have an off-amazon archive of their S3 data ... we even ran an ad to that effect in 2012[1].

The (obvious) reason this isn't compelling is the cost of egress. It's just (relatively) too expensive to offload your S3 assets to some third party on a regular basis.

So if R2 is S3 with no egress, suddenly there is a value proposition again.

Further, unlike in 2012, in 2021 we have really great tooling in the form of 'rclone'[2][3] which allows you to move data from cloud to cloud without involving your own bandwidth.

[1] The tagline was "Your infrastructure is on AWS and your backups are on AWS. You're doing it wrong."

[2] https://rclone.org/

[3] https://www.rsync.net/resources/howto/rclone.html

4 comments

> So if R2 is S3 with no egress, suddenly there is a value proposition again.

That doesn't appear to be what they're doing; they don't seem to have changed their existing operating model at all:

> R2 will zero-rate infrequent storage operations under a threshold — currently planned to be in the single digit requests per second range. Above this range, R2 will charge significantly less per-operation than the major providers. Our object storage will be extremely inexpensive for infrequent access and yet capable of and cheaper than major incumbent providers at scale.

What I read this as is "we won't bill you until your traffic spikes, then you'll pay us, oh how you'll pay us"

Transparent bandwidth pricing would be a far more interesting announcement. This is the second post I've seen from CloudFlare in recent months throwing bricks at AWS over bandwidth pricing, while failing to mention CloudFlare bandwidth is some of the most expensive available.

The way I read it is that for low-scale users, they're not going to have request pricing. For higher-scale users, "R2 will charge significantly less per-operation than the major providers". AWS charges $0.0004 per thousand GET requests. Let's say that R2 charges $0.0003 per thousand GET requests. That's still cheaper than AWS or Backblaze's B2 (even if just barely) and if they're not charging for bandwidth, then it's really cheap.
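To put rough numbers on the per-request comparison above, here is a minimal sketch. The AWS price is the one quoted in the comment; the R2 price is the commenter's hypothetical, not a published rate, and the monthly request volume is an assumption picked only for illustration.

```python
# Per-thousand GET prices. The AWS figure is quoted in the comment above;
# the R2 figure is the commenter's hypothetical, not a published price.
AWS_GET_PER_1000 = 0.0004
R2_GET_PER_1000_HYPOTHETICAL = 0.0003


def get_request_cost(requests: int, price_per_1000: float) -> float:
    """Return the dollar cost of `requests` GET requests."""
    return requests / 1000 * price_per_1000


# Illustrative monthly volume (an assumption, not from the thread).
monthly_requests = 100_000_000

aws_cost = get_request_cost(monthly_requests, AWS_GET_PER_1000)
r2_cost = get_request_cost(monthly_requests, R2_GET_PER_1000_HYPOTHETICAL)
print(f"AWS: ${aws_cost:.2f}  R2 (hypothetical): ${r2_cost:.2f}")
```

At these rates the request charges stay small even at a hundred million GETs a month, which is why the bandwidth line tends to dominate the comparison.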

The announcement says that they're eliminating bandwidth charges three times.

I don't know the whole economics around cloud storage and bandwidth so maybe this is unrealistic pricing and your suspicions are well founded. However, Backblaze is offering storage at $0.005/GB and bandwidth at $0.01/GB. Cloudflare is charging 3x more than Backblaze for the storage and $0 for the bandwidth. Given that Cloudflare's costs are probably lower than Backblaze for bandwidth, that doesn't seem so unreasonable - but I could be very wrong.
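A rough way to see where the two pricing models cross over, using only the figures in the comment above. This sketch reads "3x more" as three times the Backblaze storage price (an assumption) and ignores per-request charges entirely.

```python
# Prices from the comment above. "3x more" is read here as 3 times the
# Backblaze storage price, which is an interpretation, not a quoted rate.
B2_STORAGE_PER_GB = 0.005                    # $/GB-month
B2_EGRESS_PER_GB = 0.01                      # $/GB
R2_STORAGE_PER_GB = 3 * B2_STORAGE_PER_GB    # $/GB-month (assumed reading)
R2_EGRESS_PER_GB = 0.0                       # $/GB


def monthly_cost(stored_gb: float, egress_gb: float,
                 storage_price: float, egress_price: float) -> float:
    """Storage plus egress cost for one month; request fees are ignored."""
    return stored_gb * storage_price + egress_gb * egress_price


def break_even_egress_ratio() -> float:
    """GB of monthly egress per stored GB at which both bills are equal."""
    return ((R2_STORAGE_PER_GB - B2_STORAGE_PER_GB)
            / (B2_EGRESS_PER_GB - R2_EGRESS_PER_GB))


stored = 1000.0  # GB, illustrative
for egress in (0.0, 1000.0, 5000.0):
    b2 = monthly_cost(stored, egress, B2_STORAGE_PER_GB, B2_EGRESS_PER_GB)
    r2 = monthly_cost(stored, egress, R2_STORAGE_PER_GB, R2_EGRESS_PER_GB)
    print(f"egress {egress:>6.0f} GB: B2 ${b2:.2f}  R2 ${r2:.2f}")
```

Under these assumptions the break-even is about one full read of the stored data per month: below that the cheaper storage wins, above it the free bandwidth wins.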

I think Cloudflare probably sees R2 as something that is sustainable, but creates demand for their enterprise products. You start The NextBigThing with R2 and suddenly your application servers are under attack. You have a relationship with Cloudflare, you're used to their control panel, you trust them, and when you're at the scale that you're getting attacked like this you can drop $10,000/mo because you're bringing in a bunch of revenue - $10,000/mo is less than 1 software engineer in the US.

R2, in a certain way, can be a marketing tool. "Come use our S3 competitor with free bandwidth rather than getting locked into AWS's transfer pricing." 6-12 months go by and you're substantially larger and want more complex stuff and you're already getting emails from Cloudflare about their other offerings, you see them in the control panel, etc.

It seems like Cloudflare might be trying to move in on AWS's market. R2 is an easy way for them to do it. It seems like S3 has high margins. Competing storage services can be a fraction of the cost per GB and AWS's bandwidth markup is incredibly high. If you're looking to attack a competitor's market, it seems like going after one of their highest-margin products could make the most sense. Again, R2 becomes a marketing tool for future cloud offerings.

Part of Cloudflare's strategy might be targeting things that they see very high margins on and being willing to accept lower margins. If something has 50% margins and you're willing to accept 20% margins, you're still doing pretty great. Plus, over time, the cost of hardware comes down and you can keep your prices at the same level once people are happily inside your ecosystem and don't want to deal with migrations.

> CloudFlare bandwidth is some of the most expensive available

It sounds like you might have gotten burned by something with Cloudflare. I don't have any horror stories, but I'm always interested in new data points if you have them.

> Given that Cloudflare's costs are probably lower than Backblaze for bandwidth, that doesn't seem so unreasonable - but I could be very wrong.

At scale, bandwidth capacity purchases are symmetric - you buy the same amount up as you do down. As a provider of DDOS protection services, Cloudflare has to maintain a huge amount of ingress capacity - meaning they have a ton of egress capacity sitting unused.

According to PeeringDB, they're mostly outbound [1]. I guess they already serve so much traffic that they don't need extra inbound capacity.

[1] https://www.peeringdb.com/asn/13335

This is about the operation, not about bandwidth the way that I read it. All providers have prices for bandwidth and prices for different "tiers" of operations (store, retrieve, delete, list, etc). The way I read it is that bandwidth is always 100% free, and storage operations are free under a certain threshold. I hope I'm right ;)
This is correct. Bandwidth (ingress and egress) always free, regardless of volume. Transactions free at low volume (~<1/sec) but we’ll charge at higher volumes. Storage we charge for. For both transactions and storage, we aim to be at least 10% less expensive than S3. And, again, for the sake of absolute clarity: egress/ingress always free.
Does this mean we should contact our enterprise account manager regarding our existing spend? For the sake of absolute clarity: we're currently paying for bandwidth
> So if R2 is S3 with no egress, suddenly there is a value proposition again.

Isn't B2 from Backblaze already filling that need? I mean, more choice is always better for sure, but considering R2's goal seems really to be a CDN more than a backup space, and it does feel like their money maker is in the CDN part, not the storage part... I feel like trusting them to store it long-term without using the CDN part is a little bit risky.

B2 charges for egress, $0.01/GB. More interesting to me is Wasabi, which charges ~$0.006/GB*month and no egress fees at all.
Wasabi egress is free, but they "reserve the right to limit or suspend your service"[0] if it doesn't fit the usage patterns they've designed for.

[0]: https://wasabi.com/paygo-pricing-faq/#free-egress-policy

Huh, thanks for that. I hadn't noticed (was that always there??). So, sustained egress at rate R/sec means I have to use 2500000 * R amount of storage per month, hrm...

Possibly can't use it for one of the "ponies" I was working on, but probably still good as "ye huge media archive".
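The 2500000 figure above is roughly the number of seconds in a month. A small sketch of that arithmetic, assuming (as the comment seems to) that monthly egress should not exceed the amount of data stored, and assuming a 30-day month:

```python
# Seconds in a 30-day month; the comment rounds this to 2,500,000.
SECONDS_PER_30_DAY_MONTH = 30 * 24 * 60 * 60


def min_storage_for_sustained_egress(rate_bytes_per_sec: float) -> float:
    """Bytes of storage needed so a month of sustained egress at the given
    rate does not exceed the stored volume (the assumed fair-use reading)."""
    return rate_bytes_per_sec * SECONDS_PER_30_DAY_MONTH


rate = 1_000_000  # 1 MB/s sustained, illustrative
needed_tb = min_storage_for_sustained_egress(rate) / 1e12
print(f"{rate} B/s sustained -> about {needed_tb:.3f} TB stored")
```

So under that reading, a steady 1 MB/s of egress would call for roughly 2.6 TB kept in storage.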

Wasabi doesn’t charge for egress (‘fair use’ policy applies), but they do have a 3 month minimum for data, including deleted data.

This caught me out when I was transferring 100GB files that only needed to be up for a few hours, and I ended up getting charged as if I had hosted them for 3 months.
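For a sense of scale, here is a sketch of how a minimum-retention charge like the one described above plays out. The price is the approximate ~$0.006/GB*month figure quoted upthread, the 3-month minimum is taken from the comment, and the 6-hour hosting time is an assumption for illustration.

```python
# Approximate price quoted upthread and the minimum described in the comment.
PRICE_PER_GB_MONTH = 0.006
MINIMUM_BILLED_MONTHS = 3
HOURS_PER_30_DAY_MONTH = 30 * 24


def billed_storage_cost(size_gb: float, months_stored: float) -> float:
    """Cost when storage is billed for at least MINIMUM_BILLED_MONTHS."""
    billed_months = max(months_stored, MINIMUM_BILLED_MONTHS)
    return size_gb * PRICE_PER_GB_MONTH * billed_months


def prorated_storage_cost(size_gb: float, months_stored: float) -> float:
    """Cost if storage were billed only for the time actually used."""
    return size_gb * PRICE_PER_GB_MONTH * months_stored


months = 6 / HOURS_PER_30_DAY_MONTH  # 6 hours of hosting, illustrative
print(f"billed:   ${billed_storage_cost(100, months):.2f}")
print(f"prorated: ${prorated_storage_cost(100, months):.4f}")
```

The absolute amounts are small per 100 GB, but the billed figure is hundreds of times the prorated one, which adds up quickly for short-lived transfers.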

B2 doesn't charge for egress to Cloudflare. (https://www.cloudflare.com/bandwidth-alliance/)
(stupid question) What does that get me? Does Cloudflare have cheap VMs I can rent with competitive prices? (including data egress fees?)

For object storage that I play with, some clients are in the cloud, with most sitting at home behind residential internet.

(approximately) Cloudflare provides a proxy service that you'd use to access your B2 data from home or other cloud without paying for egress.

They can do this because it costs almost nothing to move data between B2 and Cloudflare, and then from Cloudflare to almost anywhere. Moving data from B2 to most other places on the internet likely costs them more because Backblaze isn't in a position to negotiate advantageous peering agreements with ISPs.

Note that you can't use a free Cloudflare account just for things like images, video and other binary files, as they'll suspend the account. It must be used primarily for a website, not content hosting. If you only want to use Cloudflare for files, you need a paid account.
In addition, you need to use Cloudflare web workers if you want any sort of access controls. (I think this is part of why it makes financial sense for Cloudflare to do this)
Wow! Cool! Very surprised that Cloudflare wouldn't charge an arm and a leg for such a service... considering they're moving the actual bits.

I'm poking around at the Cloudflare website, what's the name of the aforementioned service? What term should I google?

I'm ignorant of "modern Cloudflare" -- other than reading their fantastic technical blog, I've never used them in a professional capacity and don't know their product offering -- other than a cache, CDN, DDOS protection, and a Lambda.

Oh yeah, I mixed ingress with egress.

Now though I don't understand the original comment... why care about egress for backup storage? I mean, as long as it's not absurd (and I agree that AWS egress price is absurd, though the original comment wasn't complaining of that...), you usually don't expect to have to retrieve it, and if required, you are ready to pay much more for it as it will be worth it.

Frankly, Backblaze looks like an over-specialized player whereas Cloudflare is already used for a lot of stuff.

Eg: my employer already has stuff on Cloudflare, using their services is just as easy as pulling their terraform provider. OTOH, for Backblaze, I'd have to go through the whole evaluation process, security and legal compliance etc etc...

Backblaze has only two locations and they cannot be used with the same account. Your data is (within one account) always just in California or Amsterdam. For many needs, having multiple PoPs is crucial.
Yev from Backblaze here -> we have more than two data center locations, but we do have two regions (US-West, which is spread out across California and Arizona and EU-Central which is in Amsterdam). Slight nuance, but very different!
You're right, but for me as a customer it doesn't matter. I can choose from two geographical locations and cannot use both at once with one account. So there's no option to get data close to my users.
You can front it with Cloudflare's CDN for free.
You're not allowed to use Cloudflare to just serve files, at least not in free/cheap tiers.
> 'rclone'[2][3] which allows you to move data from cloud to cloud without involving your own bandwidth

Maybe I'm reading this wrong, but the data does pass through the machine where rclone is running. rclone does support remote-to-remote transfers[0], but I believe only for remotes of the same type (ie S3 to S3).

[0]: https://rclone.org/docs/#server-side-copy

Does this mean "remotes that speak the S3 protocol", or "remotes that are S3"? The former would require S3 supporting induced requests (so it POSTed the data somewhere), the latter would require a "copy" operation on S3. I don't know which one is supported.
Server side copy only works within the same service provider, not across service providers.
Right because, how would that work otherwise?

Unless those providers natively supported this functionality (and they're incentivized to not) you'd _need_ an intermediary.
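For the cross-provider case discussed above, a minimal sketch of driving `rclone copy` from Python. The remote names and bucket paths are hypothetical and would have to exist in your own rclone config. The data still flows through whichever machine runs rclone, so running this on a cloud VM rather than at home is what keeps your own bandwidth out of it.

```python
import subprocess
from typing import List


def build_rclone_copy(src: str, dst: str, dry_run: bool = True) -> List[str]:
    """Build an `rclone copy` command line for two configured remotes.

    `src` and `dst` use rclone's remote:path form. The defaults here are
    conservative: --dry-run only reports what would be transferred.
    """
    cmd = ["rclone", "copy", src, dst, "--progress"]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_rclone_copy(src: str, dst: str, dry_run: bool = True) -> int:
    """Run the copy and return rclone's exit code (requires rclone installed)."""
    return subprocess.run(build_rclone_copy(src, dst, dry_run)).returncode


# Hypothetical remote names; nothing is executed here, the command is only shown.
print(" ".join(build_rclone_copy("s3src:my-bucket", "otherdst:my-bucket")))
```

Calling `run_rclone_copy(..., dry_run=False)` would perform the actual transfer once the dry run looks right.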

The [3] resource is fantastic! Have you tried sponsoring rclone? I was studying their docs last week, and I'm sure people reading the docs are interested in this use case of moving between clouds without using their own bandwidth.
Thanks - I made a point to have that document created not specifically with rsync.net use-case in mind.

You can follow that howto with any two cloud endpoints if you wanted to.