by develatio 881 days ago
I'll share my trick :)

Lightsail instances can be used to "proxy" data from other AWS resources (e.g. EC2 instances or S3 buckets). Each Lightsail instance has a certain amount of data transfer included in its price ($3.5 instance has 1TB, $5 instance has 2TB, $10 instance has 3TB, $20 instance has 4TB, $40 instance has 5TB). The best value (dollars per TB transferred) is the $10 instance, which gives you 3TB of traffic.

Using the data provided by the post:

3TB worth of traffic from an EC2 instance would cost $276.48 (us-east-1). 3TB worth of traffic from an S3 bucket would cost $69.
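
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The plan prices and quotas are the ones listed above. The $0.09/GB EC2 egress rate is an assumption backed out of the $276.48 figure (276.48 / 3072GB), and 1TB is treated as 1024GB for the same reason. The function and variable names are made up for illustration.

```python
# Lightsail plans as quoted in this comment: monthly price (USD) -> included transfer (TB).
LIGHTSAIL_PLANS = {3.5: 1, 5: 2, 10: 3, 20: 4, 40: 5}

# Assumption: flat EC2 egress rate implied by the $276.48 figure for 3TB (us-east-1).
EC2_EGRESS_USD_PER_GB = 0.09
# Assumption: binary terabytes, which is what makes 3TB come out to $276.48.
GB_PER_TB = 1024


def ec2_egress_cost(tb):
    """Cost in USD of sending `tb` terabytes out of EC2 at the assumed flat rate."""
    return round(tb * GB_PER_TB * EC2_EGRESS_USD_PER_GB, 2)


def lightsail_usd_per_tb():
    """Price per included TB for each Lightsail plan, keyed by plan price."""
    return {price: round(price / tb, 2) for price, tb in LIGHTSAIL_PLANS.items()}


def cheapest_plan():
    """Plan price with the lowest cost per included TB."""
    per_tb = lightsail_usd_per_tb()
    return min(per_tb, key=per_tb.get)


if __name__ == "__main__":
    print("EC2, 3TB:", ec2_egress_cost(3))
    for price, usd_per_tb in lightsail_usd_per_tb().items():
        print(f"${price} plan: ${usd_per_tb}/TB")
    print("Cheapest per TB: $", cheapest_plan())
```

By this measure the $5 plan (about $2.50/TB) edges out the $10 plan (about $3.33/TB), which matches the nitpick further down the thread.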

Note: one downside of using Lightsail instances is that both ingress and egress traffic counts as "traffic".


https://aws.amazon.com/service-terms/

> 51.3. You may not use Amazon Lightsail in a manner intended to avoid incurring data fees from other Services (e.g., proxying network traffic from Services to the public internet or other destinations or excessive data processing through load balancing or content delivery network (CDN) Services as described in the technical documentation), and if you do, we may throttle or suspend your data services or suspend your account.

I had a suspicion that this was against AWS's terms, but I never bothered to look if that was actually the case. Thank you for the heads up!
It’s mostly in there to scare people into not doing it. AFAIK they’ve never taken action on that.

Of course if you abuse it, you’re asking for trouble.

As someone who has dealt with users who use a system in an unintended way, you don't go looking for those people and you don't build something to enforce a policy like this. When you're running services for lots of customers, you often don't know a lot of what's going on in the system and how people are using it. Then something seems weird or something is causing a problem and you want to deal with it - and you want the language out there so that you can deal with it.

In Amazon's case, their bandwidth pricing isn't really defendable. It's just crap. However, sometimes you're trying to offer something reasonable, but need to make sure that a customer doesn't end up abusing something. For example, Chia is a cryptocurrency that will basically wear through SSDs (it's a proof-of-space system). There aren't explicit limits on how frequently you can write to a disk from most hosting providers, but Chia goes beyond what normal usage would do to a disk. Chia farmers would rather burn someone else's SSD that they're renting than their own. But no one at most hosting providers was probably looking at how frequently people were writing before noticing "hey, why are the disks failing faster than we'd expect?"

They probably haven't taken action on it because they probably haven't noticed it being a problem. But if you're a whale of a customer and suddenly your data transfer charges drop off a cliff, someone might end up looking into that and seeing what's going on.

At least AWS is fully aware how premium their normal data transfer is and that one might want to optimise those costs.
It's extreme enough that I never willingly serve data directly from AWS without a caching proxy elsewhere in front unless the egress is tiny.

It takes very low hitrates before it pays for itself several times over including management overheads.

Sometimes you can justify a complete replica outside AWS (one of the things I will gladly pay AWS for is durability)
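
To put rough numbers on "pays for itself", here is a back-of-the-envelope sketch in Python. Every figure in it is a placeholder: the $0.09/GB AWS rate is the one implied by the EC2 number upthread, and the proxy's fixed monthly cost and per-GB rate are made up. The model assumes cache hits are served entirely from the proxy and every miss is fetched from AWS once and relayed.

```python
def monthly_cost_direct(total_gb, aws_usd_per_gb=0.09):
    """Serve everything straight from AWS."""
    return total_gb * aws_usd_per_gb


def monthly_cost_with_cache(total_gb, hit_rate, aws_usd_per_gb=0.09,
                            proxy_fixed_usd=20.0, proxy_usd_per_gb=0.0):
    """Serve through an external caching proxy; only misses leave AWS."""
    miss_gb = total_gb * (1.0 - hit_rate)
    return proxy_fixed_usd + total_gb * proxy_usd_per_gb + miss_gb * aws_usd_per_gb


def break_even_hit_rate(total_gb, aws_usd_per_gb=0.09,
                        proxy_fixed_usd=20.0, proxy_usd_per_gb=0.0):
    """Hit rate at which the cached setup costs the same as serving directly.

    Solves: total * aws = fixed + total * proxy + total * (1 - h) * aws
    """
    return (proxy_fixed_usd + total_gb * proxy_usd_per_gb) / (total_gb * aws_usd_per_gb)


if __name__ == "__main__":
    # Hypothetical workload: 10,000GB/month, $20 proxy box, $0.001/GB proxy traffic.
    h = break_even_hit_rate(10_000, proxy_fixed_usd=20.0, proxy_usd_per_gb=0.001)
    print(f"break-even hit rate: {h:.1%}")
```

With these placeholder numbers, 10,000GB a month breaks even at a hit rate of about 3.3%, and at a 50% hit rate the bill drops from $900 to $480.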

Yeah, but "service terms" are just recommendations that should often be ignored.
> You may not use Amazon Lightsail in a manner intended to avoid incurring data fees from other Services

This requires proving the user's intent, which is not obvious except in the most blatant of cases (i.e. using Lightsail as a bent-pipe by writing the exact bytes you're reading). If it is a "CSV to Parquet translation layer", how would AWS possibly prove it's anything other than what it claims to be? You'd be paying a few more cents for compute, but that's the price of plausible deniability.

> This requires proving the user's intent

Companies are permitted to deny service to anyone at any time for any (non-protected) reason. They typically don't have to justify service terminations to a court of law. Who would they be required to prove user intent to, and why?

I don't think you parsed their message correctly. It's not about litigation.

Re-posting a bit of the service terms for easy reference:

> 51.3. You may not use Amazon Lightsail in a manner intended to avoid incurring data fees from other Services [...]

As you point out, they may terminate your service without any justification in a court of law. So how do they go about terminating the offenders? Well, one trivial way (from a technology and/or policy perspective): terminate everyone's service! If you blindly terminate everyone's service, that will certainly prevent anyone abusing LightSail.

But that's, uh, not good for business. So they probably want to terminate the service of only those people actually abusing it. But how do you do that?

You'd have to look at each account's usage and do something to determine if that traffic is or isn't a means of avoiding data fees from other services. In other words, you'd have to determine the intent of that traffic. Or, put yet another way: "this requires proving the user's intent".

If doing so was as trivial as detecting any traffic between LightSail and the other services, they'd just prevent such connections in the first place. So how can AWS tell if some traffic between services is legitimate or not? The unspoken premise of the person you're replying to is that it probably isn't feasible for AWS to catch any and all people abusing LightSail in this way, with the conclusion being that you can (in practice) probably get away with it unnoticed.

We disagree on the definition of "prove". I would not object to the claim if it had used "determine" or "detect" instead of "prove".

That said, detection is easy. Look for users who spin up a Lightsail instance and use close to 100% of its bandwidth quota before spinning it down. Sort by number of such instances, and tell all users above some cutoff that in your sole discretion you believe they have violated your TOS, and are terminating their service. Doing so is completely legally defensible.
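
A minimal sketch of that heuristic in Python, purely for illustration. The record fields, the 95% threshold, and the cutoff of 3 instances are all hypothetical; nothing here reflects what data AWS actually has or how it would query it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class InstanceRecord:
    # Hypothetical per-instance billing record.
    account_id: str
    quota_gb: float    # transfer allowance included with the plan
    used_gb: float     # transfer actually used over the instance's lifetime
    terminated: bool   # True if the instance has been spun down


def flag_accounts(records, usage_ratio=0.95, min_instances=3):
    """Return (account_id, count) pairs, most suspicious first.

    An instance counts as "burned" if it was spun down after using at
    least `usage_ratio` of its quota. Accounts with fewer than
    `min_instances` burned instances are left alone.
    """
    burned = Counter(
        r.account_id
        for r in records
        if r.terminated and r.quota_gb > 0 and r.used_gb / r.quota_gb >= usage_ratio
    )
    return [(acct, n) for acct, n in burned.most_common() if n >= min_instances]
```

Where the cutoff sits is a judgment call: raise `min_instances` and you catch fewer abusers, lower it and you start flagging people who simply had a busy month.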

I hear you.

> We disagree on the definition of "prove". I would not object to the claim if it had used "determine" or "detect" instead of "prove".

I do find that a bit odd, though. If I consult the Merriam Webster dictionary, I see precisely one entry under "prove" that says anything related to law and/or courts:

> to establish the existence, truth, or validity of (as by evidence or logic)

> "prove a theorem"

> "the charges were never proved in court"

Even there, the only mention of court is in the example sentence, rather than the definition itself -- naturally, we want our court system to be based in reasoning rather than whim.

Additionally, the meaning of "prove" given by this definition is exactly what the study of formal logic sets out to codify, and given that this is Hacker News (where many are interested/involved in computer science and/or formal logic itself), it seems counterproductive to ascribe some legal meaning to the word "prove" here, as it would (to my mind, at least) be quite unlikely for others to do so.

GP here - feel free to replace "prove" with determine because that's what I meant. My point was that it is really hard for Amazon to detect data exfiltration when it's disguised as some other run-of-the-mill service. Amazon can cancel anyone's service at any time, but they can't afford to piss off legitimate customers with capricious, undeserved bans due to false positives. Regardless of where AWS draws the line to separate abuse from legit usage, it will always be possible to skirt underneath it. The crux of my argument is that AWS will tolerate false negatives over false positives.
I always assumed that your free quota is proportional to the time you pay for. Even the price is not the advertised fixed $3.5: you pay less in months with 30 days than in months with 31 days.

I have not checked my cost and usage reports every time I have had some experimental instance running for a shorter time, so I am not sure. I am just going by the general knowledge that AWS is permanently counting every fraction of a peanut. But as the submission shows, exceptions to the rule can exist.

This is not a court. Amazon does not have to prove anything to anyone.

There is going to be a program that will have rules to detect patterns in customer traffic and automatically block when those patterns are tripped.

At best you could complain in the forums and maybe if you are lucky a sympathetic community manager may look into your use case.

> Amazon does not have to prove anything to anyone.

True. And no one has said that they must prove anything to anyone.

Amazon wants to make money, so they probably don't want to terminate the service of people who are acting in good faith. But that's just another way of saying that they probably want to determine with some certainty that someone is not acting in good faith before terminating their service.

So it's not that Amazon needs to prove anything to anyone. But they do want to prove something to themselves.

In this case they are actually losing money not gaining by allowing this kind of abuse, both because the bandwidth usage costs money and also because of potential lost billing from other services which now is not billed.

The Lightsail-style billing model works the same way shared vs. leased lines work: if everyone fully used their max allocation, it wouldn't be possible to offer the service at that price point. They can offer 2TB or 4TB for that price because the usage modelling of target users supports it.

No company wants a customer to bypass their usage and pricing ToS, even if they are not actively enforcing it; it is lost revenue and/or brings in customers who you don't really want.

> In this case they are actually losing money not gaining by allowing this kind of abuse, both because the bandwidth usage costs money and also because of potential lost billing from other services which now is not billed.

Your statement here is absolutely correct (we are in agreement); it is also absolutely orthogonal to what I (and others) have said.

Let me use an analogy.

Marijuana is illegal in most states of the US (and, federally, it is still a controlled substance). And yet a (relatively) recent survey[1] showed that around 7% of respondents grew marijuana at home.

How is this possible? Shouldn't that be 0%? It's almost like the DEA is slacking off or something.

... or maybe it's because they can't practicably round up each and every one of these people: the DEA isn't omniscient, and given the 4th amendment they can't ransack every home within the US to catch these people. If you don't do something that gives them sufficient evidence to acquire a search warrant, there's nothing they can do about you growing pot in your domicile.

Back to Amazon. Could you, at a high level, describe a process by which they could, for a given account, determine if that account's use of LightSail is legitimate, or is instead intended to avoid incurring data fees from other services? And you must satisfy some additional, absolutely crucial qualifications: this process must not negatively impact abiding users (because they would abandon AWS, resulting in financial harm to AWS), the cost to AWS of executing this process must not be prohibitive (in terms of compute, human resources, etc), and the process must be applied across all accounts within a reasonable time frame (if it takes 1 year for AWS to comb through 1% of accounts, that means you have a mere 1/100 odds of having your service terminated for abusing LightSail for an entire year).

Something being prohibited doesn't imply that it is practicably, fully enforceable.

[1]: https://pubmed.ncbi.nlm.nih.gov/36288408/

Btw, Hetzner charges 1.19 €/TB or so if you exceed the 20 TB you get even with the 4.5 €/month VM. So obviously, AWS is charging disproportionately compared to what it actually costs to deliver that bandwidth/traffic. These charges are probably great at preventing a good deal of piracy, but they also prevent some startups from offering competitive prices or implementing a simpler system.
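
A quick sketch of what that works out to, in Python. The default figures are the approximate ones quoted above ("or so"), not official pricing, and the function name is made up.

```python
def hetzner_monthly_eur(traffic_tb, vm_eur=4.5, included_tb=20, overage_eur_per_tb=1.19):
    """Monthly bill in EUR: VM price plus overage on traffic beyond the included quota."""
    overage_tb = max(0.0, traffic_tb - included_tb)
    return round(vm_eur + overage_tb * overage_eur_per_tb, 2)


if __name__ == "__main__":
    for tb in (3, 20, 30):
        print(f"{tb} TB -> {hetzner_monthly_eur(tb)} EUR")
```

So 3 TB costs just the 4.5 € for the VM, and even 30 TB only comes to about 16.40 €.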
Here's another one:

You can download 1TB of data for free from AWS each month, as Cloudfront has a free tier [1] with 1TB monthly egress included. Point it to S3 or whatever HTTP server you want and voila.

[1] It used to be 50GB per month for the first 12 months. It was changed to 1TB free forever shortly after Cloudflare posted https://blog.cloudflare.com/aws-egregious-egress

That "shortly" was 2 years, and that Cloudflare post had nothing to do with it. Amazon barely considers them a competitor to begin with.
Sigh, I should have quoted both sides. Here it is: https://aws.amazon.com/blogs/aws/aws-free-tier-data-transfer...

The Cloudflare rant was posted in July 2021, and the new Cloudfront free tier arrived in Nov 2021. I consider AWS changing pricing like this within 4 months pretty fast.

Where does your 2 years number come from?

Nice!

Nitpick: $5 for 2TB is better than $10 for 3TB.

Ooohhh!! It is, indeed!
Nice trick, but you are playing with fire due to AWS's terms.