by maccard 1177 days ago
> I think this is an easy mistake to make even with very good intentions, and I can see myself doing it.

Hard disagree. You missed one very important part in your writeup: at no point did they communicate that they were imposing this limit, and the limit appeared, undocumented, overnight.

I was someone who was directly impacted by this change. We're a 40-person company who used (past tense) GDrive as a shared network drive, including for storing builds of our app. We pay $18/person, and as part of that, Google Workspace advertises 5TB per user, pooled [0], and nowhere in Google's documentation does it mention that this limit exists [1]. If I had been aware of a limit, we would have cleaned up our old files, but instead we started getting spurious 403s even though, as far as we could tell, we were well within our usage limits. It was only when this post (https://issuetracker.google.com/issues/268606830?pli=1) hit HN that I realised what was wrong.

[0] https://workspace.google.com/intl/en_us/pricing.html

[1] https://support.google.com/a/users/search?q=Drive%20limits

2 comments

> at no point did they communicate that they were imposing this limit, and the limit appeared, undocumented, overnight.

Not to defend Google, but I've seen plenty of engineers make such mistakes, and you probably have as well; it's just that it didn't then result in bad press.

When you are an engineer working on a living product, and you identify some performance-related issue, changes you make to the product can easily be classified as bugfixes. For example, you identified an endpoint that should have a rate limit and didn't; you fixed it, it was a potential security issue, it didn't need communication to end users... as far as you knew, even if you misjudged.

Strong XKCD 1172 vibes here. https://xkcd.com/1172/

> When you are an engineer working on a living product, and you identify some performance-related issue, changes you make to the product can easily be classified as bugfixes. For example, you identified an endpoint that should have a rate limit and didn't; you fixed it, it was a potential security issue, it didn't need communication to end users... as far as you knew, even if you misjudged.

Sure, and the vast majority of companies publicise changes that affect customers. In fact, Google does it quite regularly. If you roll out a customer-impacting change, even if it affects only a small number of customers, you communicate it. Docker's recent (mis)communication is a good example of what's required. If you can't take the heat, get out of the kitchen.

I've yet to see Google acknowledge that they've done this or rolled it back. Even this HN topic is about a tweet that says:

> We recently rolled out a system update to Drive item limits to preserve stability and optimize performance. While this impacted only a small number of people, we are rolling back this change as we explore alternate approaches to ensure a great experience for all.

i.e. not that they imposed an undocumented limit.

> was a potential security issue, it didn't need communication to end users...

If there's a security issue in a customer-facing part of a product, and you change that part to introduce a limit, you communicate that you've done that. Coming in one day to find out that the rate limits have changed and you've not been notified about it is a surefire way to piss off a whole bunch of people.

> For example, you identified an endpoint that should have a rate limit and didn't; you fixed it, it was a potential security issue

That sounds careless. Any such change would need to have an impact analysis (which should be part of the team/org/company's SDLC). In this case, communication should be sent out to the clients of that endpoint, with a reasonable deadline, before enforcing any rate limit.
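
A rough sketch of what "announce a deadline, then enforce" can look like in code. Everything here is hypothetical (the `decide` function, the limit of 100 requests per minute, the enforcement date); it only illustrates the idea of warning before rejecting and isn't taken from any real service.

```python
from datetime import date

# Hypothetical numbers, for illustration only.
RATE_LIMIT = 100                  # requests per minute per client
ENFORCE_FROM = date(2030, 1, 1)   # deadline announced to clients in advance


def decide(requests_last_minute: int, today: date) -> str:
    """Return "allow", "warn" or "reject" for one incoming request."""
    if requests_last_minute <= RATE_LIMIT:
        return "allow"
    # Over the limit: before the deadline we only warn, so that clients
    # (and our own logs) can see who would be affected.
    if today < ENFORCE_FROM:
        return "warn"
    return "reject"
```

The point of the "warn" state is that nothing breaks for clients until the date they were told about.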

You see this if you trawl through kernel logs or any enterprise piece of software; pages and pages of warnings like this:

BOBFLANGLE IS DEPRECATED AND MAY BE REMOVED, PLEASE REDUCE THE BOBFLANGLE USAGE BELOW 1.5 MILLIBOBS

Then if it actually becomes an issue, you can pull logs from thousands/millions of systems, determine the impact of actually removing the BOBFLANGLE, and begin mitigation.
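
As a sketch of that last step, here is one way you might count who is affected from collected warning lines. The log format, host names and account names below are made up for the example; a real system would have its own format and a proper log pipeline.

```python
from collections import Counter

# Hypothetical log format: "<host> <LEVEL> <event> key=value ..."
sample_logs = [
    "host1 WARN over_limit account=acme items=5200000",
    "host2 INFO upload_ok account=acme",
    "host2 WARN over_limit account=initech items=7100000",
    "host3 WARN over_limit account=acme items=5200400",
]


def affected_accounts(lines):
    """Count over-limit warnings per account."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Only look at lines of the form "<host> WARN over_limit ...".
        if len(parts) >= 4 and parts[1] == "WARN" and parts[2] == "over_limit":
            fields = dict(p.split("=", 1) for p in parts[3:] if "=" in p)
            if "account" in fields:
                counts[fields["account"]] += 1
    return counts


hits = affected_accounts(sample_logs)
print(len(hits), "accounts would be affected:", dict(hits))
```

With a count like this in hand before enforcement, you know whether "a small number of people" is actually small.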

To add to the point: 5M files on 5TB of storage would average 1MB per file.

So just storing 5TB of average web-sized images would hit the limit, let alone smaller documents.
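
The arithmetic, spelled out. The 5M and 5TB figures are the ones from this thread; the 200KB average file size is just an assumed example, and decimal units (1TB = 10^12 bytes) are assumed throughout.

```python
TB = 10**12
MB = 10**6
KB = 10**3

storage_bytes = 5 * TB
item_limit = 5_000_000

# Average file size at which storage and item count run out together.
break_even_bytes = storage_bytes / item_limit
print(break_even_bytes / MB)  # 1.0


def usable_bytes(avg_file_bytes: int) -> int:
    """Storage you can actually fill before hitting either cap."""
    return min(storage_bytes, item_limit * avg_file_bytes)


# Assumed example: files averaging 200KB fill only 1TB of the 5TB.
print(usable_bytes(200 * KB) / TB)  # 1.0
```

Any average file size below 1MB means the item limit bites before the storage quota does.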