by ben509 2438 days ago
Disclaimer: I worked on an AWS service team.

This is, oddly enough, similar to a debate people have about consumer TV or Internet: should pricing be "unlimited" or "a la carte"?

AWS is combining all your networking charges into one lump "outgoing data transfer" fee. So it's heavily marked up in comparison to what they're paying for the outgoing data transfer, and you can't tell how much of it is profit vs. how much goes to cover all their other costs.

So it might be fairer if AWS broke out separate line items for internal, incoming and outgoing data transfer, plus all the additional systems a customer uses.

I think AWS's billing is probably already on the falling side of diminishing marginal returns. That is, it's complex enough that more information would tend to hinder customers from getting the best price. Right now, if I plan to reduce my data charges, I have one variable to tinker with. If we expand this, it would mean I'm having to balance incoming / internal and outgoing charges. That sounds simple, but in terms of engineering it can be very complex.

The next claim is that this biases customers not to move. Of course, Azure and GCP have the same arrangement, so while you pay to move out of AWS, you don't pay to move in to Azure or GCP. So all the vendors are attempting to lock you in to their product, and at the same time trying to extricate you from their competitors; overall it's a wash.

So, yes, part of the motivation for egress charges is that ingress is a loss leader. But it's also true that egress is a metric that does, for the vast majority of their customers, directly translate into customer value. If there's a compelling case for doing it differently, someone should do it and see if it works.


> If there's a compelling case for doing it differently, someone should do it and see if it works.

Cloudflare doesn't charge for bandwidth. I always throw Cloudflare on top of anything I do, not because I really need a CDN or anything, but because the bandwidth cost would bankrupt me otherwise. The CEO of Cloudflare gave the rationale for why they don't charge:

> There’s a fixed cost of setting up those peering arrangements, but, once in place, there’s no incremental cost. That’s why we have similar agreements to Backblaze in place with Google, Microsoft, IBM, Digital Ocean, etc. It’s pretty shameful, actually, that AWS has so far refused. When using Cloudflare, they don’t pay for the bandwidth, and we don’t pay for the Bandwidth, so why are customers paying for the bandwidth. Amazon pretends to be customer-focused. This is a clear example where they’re not.

https://news.ycombinator.com/item?id=20791563

According to Cloudflare, they do not have any bandwidth pricing arrangement with Microsoft for Azure users.

They also do charge for Enterprise plans, but instead of transparent pricing I got high-pressure sales techniques and black box pricing offers - which then anchored our rate so that as we grow past our current contract, we're forced to upgrade at any point with pricing based solely on our original negotiation.

Frankly, while I save money using Cloudflare over Azure's CDN right now, it's left a very sour taste in my mouth and I'll be jumping their ship as soon as I have time to find a suitable alternative.

> high-pressure sales techniques and black box pricing offers - which then anchored our rate

If you have the ability to shift your entire enterprise CDN away from them, why not first try renegotiating?

Cloudflare most certainly disables zones on the free plan that use excess bandwidth. Enterprise contracts are also negotiated based on transit and those prices mirror comparable CDN services.

All that tells us is that Cloudflare has a different revenue stream. Amazon is a business and they are in the business of making money. If they weren't charging for egress bandwidth they'd just charge for something else.

"I think AWS's billing is probably already on the falling side of diminishing marginal returns. That is, it's complex enough that more information would tend to hinder customers from getting the best price. Right now, if I plan to reduce my data charges, I have one variable to tinker with."

No, it's two variables - the egress charges you refer to and the actual cost to store the data.

We[1] have found that it is, as you might expect, quite a bit simpler to charge for just the storage and forget about metering the usage/bandwidth/transfer.

So we have typically had our price point higher than the B2s or Wasabis of the world, but there's just one simple number to think about - and no potential for surprises in the billing.

I will admit to having a bit of concern over adding 'rclone'[2] to our platform and the potential for users to just burn bandwidth using an rsync.net account as a "transfer host", but that is why we peer with he.net and their cheap and plentiful 10 Gb pipes.

[1] rsync.net

[2] ssh user@rsync.net rclone copy s3:/bucket gdrive:/blah/blah

And how many PoPs, regional interconnects, highly available, high-throughput connections, and cross-continent highly available connections do you have? Do you detect failure across these connections? Do you detect grey failures? Do you have a team of infrastructure engineers to look after this network?

We keep all of those to an absolute minimum and avoid as much complexity in our infrastructure as possible.

Which is to say, each of our five[1] regional POPs has a single connection provided through a dumb switch one hop from he.net[2].

They have no interconnection or dependencies to one another.

No routers, no firewalls, no balancing, no failover. When rsync.net fails, it's a very, very boring failure.

We've had zero network outages in the last 60 months or so.

[1] Fremont, San Diego, Denver, Zurich, Hong Kong

[2] init7 in Zurich ...

> So it might be fairer if AWS broke out separate line items for internal, incoming and outgoing data transfer, plus all the additional systems a customer uses.

The example given above for comparison, Hetzner, also doesn't charge for inbound and internal transfer AFAIK. Nor "the additional systems a customer uses". You pay a charge for the server, you get some amount of traffic included, and if you go over, the additional traffic costs something like $1.1/TB. That's all you pay.
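
To put rough numbers on that billing model, here is a minimal sketch. Only the roughly $1.1/TB overage figure comes from the comment above; the server price and the included-traffic allowance are made-up placeholders, not Hetzner's actual rates.

```python
def monthly_bill(server_price, included_tb, used_tb, overage_per_tb=1.1):
    """Flat server fee plus a per-TB charge for traffic beyond the allowance.

    server_price and included_tb are hypothetical inputs; overage_per_tb
    defaults to the ~$1.1/TB figure quoted in the comment.
    """
    overage_tb = max(0.0, used_tb - included_tb)
    return server_price + overage_tb * overage_per_tb


# Hypothetical server: $40/month with 20 TB included, 25 TB actually used.
print(monthly_bill(server_price=40.0, included_tb=20.0, used_tb=25.0))
```

Inbound and internal traffic never appear as inputs, which is the point: the only metered variable is outbound traffic above the allowance.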

> AWS is combining all your networking charges into one lump "outgoing data transfer" fee

> So it might be fairer if AWS broke out separate line items for internal, incoming and outgoing data transfer

This explanation doesn't cut it for me - most (all?) "traditional" VPS providers don't charge for ingress traffic, and I doubt anyone, ever, has charged for internal traffic.

So what exactly is 'all the networking charges' comprised of, other than egress data?

AWS is vast. I have no idea what their overall accounting for networking looks like. Even for the tiny service I worked on, it would be tough to guess at what our overall costs were. We actually had an internal bill each month for all the regular AWS services we used, but then there were a host of internal services we depended on.

That companies don't charge for specific things doesn't mean those things don't cost them anything. It just means they're trying to work out a pricing scheme that scales with customer usage and is broadly understandable. So "data egress" is really just a proxy for "how much stuff you're doing with the networking subsystems of AWS."

Same thing with EC2: there are a whole pile of costs that are summed up with "time you rented an instance."

See a lower comment I made here; what I really want is a little transparency about pricing.

Of course there is an internal cost of doing business, and peripheral infrastructure cost - but if I pay $100 for service "A" I reasonably expect that fee pays for service "A". Instead, egress bandwidth costs seem to be used to trick customers into thinking services are cheaper than they really are.

How many of these VPS providers are actually managing global highly available network infrastructure?

I would have presumed that the infrastructure cost for each service was included in the cost for each service.

For egress bandwidth costs, I'd assume it included, well, the egress bandwidth cost.

I guess there aren't that many global network providers - I'm not even sure how much fiber Amazon owns in Japan, Australia or Northern Europe for that matter.

But I think level3 is associated with:

https://www.centurylink.com/business/hybrid-it-cloud/public-...

And while they have a call-us price list (if you have to ask...) - they at least state:

"Public and Private high-capacity networking options up to 10Gbps. Note: there is no charge for internal data center traffic. Cost on a per-GB-out model"

I have no idea what they charge per GB for this cloud product, however.

When you do outbound networking, you're sending it out to the internet, not mangling it within AWS's network.

I get your point, but then why does AWS charge for inter-AZ traffic? That seems like an "egress but not really" kinda thing. If AWS/GCP stopped charging for this, customers would be incentivized to build HA systems and distribute their workloads across AZs, which is a win for both customers and you (since capacity is now spread instead of stuck in a zone).

The doctrine for HA is that each AZ should be fully independent, and if you do that, your inter-AZ traffic is relatively minimal.

And I think the charges for inter-AZ transfer are to incentivize customers to do that.

Of course, to make them fully independent, you have to replicate everything, so you wind up buying several redundant copies of your system...

> Of course, to make them fully independent, you have to replicate everything, so you wind up buying several redundant copies of your system...

Yeah, and keeping around warm systems ready to fail over in case of a zonal outage seems like a preposterous waste of resources.

The alternative... to keep around multiple replicas of your system in different zones, all ready to accept traffic and which do serve traffic, seems more practical and less wasteful.

This. Paying for n+2 capacity is really expensive when n=1, but pretty reasonable at n=5 or n=10. Until someone gouges you on data transfer …
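
The n+2 arithmetic can be sketched quickly. This is just an illustration of the ratio, assuming every unit of capacity costs the same; the function name is made up for the example.

```python
def redundancy_overhead(n, spares=2):
    """Extra capacity bought, as a fraction of the n units actually needed."""
    if n <= 0:
        raise ValueError("n must be positive")
    return spares / n


for n in (1, 5, 10):
    overhead = redundancy_overhead(n)
    print(f"n={n}: paying for {n + 2} units, {overhead:.0%} overhead")
```

At n=1 you pay for three units to use one (200% overhead); at n=5 the same two spares add 40%, and at n=10 only 20%. Data transfer charges are not modeled here.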

It's not a waste when it serves a purpose. Availability is a big concern.

Sure, from the perspective of availability, it makes sense to keep this around. The US military has redundancies in place to handle many kinds of adverse scenarios, which comes at a price, but is justifiable. The point I'm trying to make is that if availability is the _only_ value, it becomes hard to justify that if you're a scrappy for-profit corp looking at your bottom line.

If, instead of availability being the only value, there were more value provided from actually using such resources, more folks would adopt cross-AZ architectures, which would be a win-win for both the customer (get HA for lower or no cost, go down less often, and succeed in the market) and thus the cloud provider (keep raking in the steady cloud revenue as the customer grows).

> so you wind up buying several redundant copies of your system

This is one of those gotchas that companies hit. They see the public pricing page and think "wow, that is much cheaper than what my internal IT department charges for X", and then when they go to actually implement it they find that "best practice" says they basically have to more than double or even triple the cost to get a reliable system (more than double because not only do you have to duplicate all the infrastructure into a second AZ, you are also getting charged for the replication traffic between them).
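
A back-of-the-envelope sketch of why the multiplier ends up above 2x. Every number here is a hypothetical placeholder (including the per-GB inter-AZ price), and the model assumes each extra AZ receives one full copy of the replicated data.

```python
def ha_monthly_cost(single_az_cost, az_count, replicated_gb, inter_az_price_per_gb):
    """Duplicated infrastructure plus replication traffic to each extra AZ.

    All inputs are illustrative assumptions, not published prices.
    """
    infrastructure = single_az_cost * az_count
    replication = replicated_gb * inter_az_price_per_gb * (az_count - 1)
    return infrastructure + replication


# Hypothetical: $1000/month per AZ, 2 AZs, 5000 GB replicated at $0.02/GB.
single = 1000.0
total = ha_monthly_cost(single, az_count=2, replicated_gb=5000, inter_az_price_per_gb=0.02)
print(f"total: ${total:.2f}, multiplier: {total / single:.2f}x")
```

The infrastructure term alone gives exactly 2x for two AZs; the replication term is what pushes it past that, and it grows with data volume rather than with instance count.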

But in all fairness, if you actually implement that "best practice" HA infrastructure, you will also be miles ahead of almost all internal IT departments.

Inter-AZ traffic is Metro Area Network ("MAN") traffic. There is a cost for running network between locations.

This explains the cost.

Price probably should be based on value, not on cost. Why do they charge for it? Because they decided it's a good way to make money and profit.

Something has to pay for the fat pipe between colos (AZs).

But those are probably far below $1/TB. Otherwise other colocation providers couldn't offer that even with peering.

The Bandwidth Alliance has a response to this approach: https://www.theregister.co.uk/2018/09/26/cloudflare_bandwidt...

Outbound bandwidth also happens to be an excellent place to put any markup you can, as that also keeps people from migrating away so easily.