Hacker News
by ahhhhnoooo 66 days ago
Unlimited strings are a problem. People will use it as storage.

No, I'm not joking. We used to allow arbitrary paths in a cloud API I owned. Within about a month someone had figured out that the cost to store a single-byte file was effectively zero, and that they could encode arbitrary files into the paths of those files. It wasn't too long before there was a library to do it on GitHub. We had to put limits on path length because otherwise people would store their data in the path, not the file.
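
A rough sketch of the trick in Python, for illustration only. The function names, the `blob/` prefix, the chunk size, and the 1024-byte cap are all made up here, and this is not the library mentioned above. It just shows how data can ride along in paths, and how a length check bounds how much fits per object:

```python
import base64

# Hypothetical cap on path length; the real limit was not stated in the thread.
MAX_PATH_BYTES = 1024


def encode_into_paths(data: bytes, chunk_size: int = 180) -> list[str]:
    """Split data into chunks and pack each chunk into a path string."""
    paths = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        # URL-safe base64 never emits '/', so the chunk stays one path segment.
        segment = base64.urlsafe_b64encode(chunk).decode("ascii")
        # Zero-padded index keeps the chunks sortable.
        paths.append(f"blob/{index:08d}/{segment}")
    return paths


def decode_from_paths(paths: list[str]) -> bytes:
    """Rebuild the original bytes from the paths alone."""
    ordered = sorted(paths)
    return b"".join(
        base64.urlsafe_b64decode(path.split("/", 2)[2]) for path in ordered
    )


def validate_path(path: str, max_bytes: int = MAX_PATH_BYTES) -> None:
    """Reject paths longer than the cap, measured in UTF-8 bytes."""
    size = len(path.encode("utf-8"))
    if size > max_bytes:
        raise ValueError(f"path is {size} bytes, limit is {max_bytes}")
```

Note that a cap like this doesn't make the trick impossible. It only limits how many bytes each object's path can carry.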

3 comments

I remember someone telling me that S3 used to be similarly abused - people were creating empty files and using S3 like a key-value store somehow, so AWS just jacked up the price of the S3 head-object API call to push people back to DynamoDB or whatever.
Just include filename size in file size for billing purposes?
Not sufficient, unfortunately. The strings for file paths are stored in wholly different infrastructure with wholly different optimizations. They probably live in your database. You really don't want people just stuffing gigabytes into that, payment or no payment. Odds are you didn't plan your control plane around "what if someone uses our strings as encoded data?"
They won't do it if it's not free
What about putting it in the fine print, only to be used against bad actors (with a guarantee that filenames under x chars would never be charged)? Or is that too problematic? It's building good faith into policy + "hiding" info...

Reason: to not overcomplicate or give the appearance of nickel-and-diming.

No, just charge for the amount of storage they use on your server. Not the amount of data you think you’re storing. In non-special cases these will be the same number.
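
Something like this, as a minimal sketch. `StoredObject` and `billable_bytes` are hypothetical names, and a real system would likely count other per-object metadata too:

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    path: str          # full object path / key
    content_size: int  # size of the file body in bytes


def billable_bytes(obj: StoredObject) -> int:
    """Bill for everything kept on the server: path bytes plus body bytes."""
    return len(obj.path.encode("utf-8")) + obj.content_size
```

For a normal file the path is a rounding error. For an empty file with a very long path, the path is the whole bill.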
Makes sense.

Would there be any engineering/management pushback on the customer side? "we have to write a tiny script", "this is non-standard" / "why are you the only ones who charge us for filenames?"

(have limited knowledge here)

Wow alright I have learnt something thank you