by cj 1590 days ago
Off topic: for people with a "million billion" objects, does the S3 console just completely freeze up for you? I have some large buckets that I'm unable to even interact with via the GUI. I've always wondered if my account is in some weird state or if performance is that bad for everyone. (This is a bucket with maybe 500 million objects, under a hundred terabytes)

I suggest you raise a support ticket.

AFAIK the List* API operations implement server-side paging, which the Console UI should be using, so the number of objects in a bucket should not significantly impact the webpage's performance.
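
For illustration, here is a minimal sketch of what that paging looks like from a client's point of view. It assumes a boto3-style client exposing `list_objects_v2` with the `MaxKeys`, `ContinuationToken`, `IsTruncated` and `NextContinuationToken` fields. `iter_keys` and `FakeS3` are hypothetical names, and `FakeS3` is only an in-memory stand-in so the example runs without AWS credentials. It says nothing about how the console itself is implemented.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_keys(client: Any, bucket: str, prefix: str = "",
              page_size: int = 1000) -> Iterator[str]:
    """Yield every key under `prefix`, fetching one server-side page at a time."""
    token: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {"Bucket": bucket, "Prefix": prefix,
                                  "MaxKeys": page_size}
        if token is not None:
            kwargs["ContinuationToken"] = token
        page = client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            yield obj["Key"]
        # The service tells us whether another page exists.
        if not page.get("IsTruncated"):
            return
        token = page["NextContinuationToken"]


class FakeS3:
    """Hypothetical in-memory stand-in for an S3 client (illustration only)."""

    def __init__(self, keys: List[str]) -> None:
        self.keys = sorted(keys)
        self.calls = 0  # number of list requests made

    def list_objects_v2(self, Bucket: str, Prefix: str = "", MaxKeys: int = 1000,
                        ContinuationToken: Optional[str] = None) -> Dict[str, Any]:
        self.calls += 1
        matching = [k for k in self.keys if k.startswith(Prefix)]
        # Assumption: the token is just an offset here; real tokens are opaque.
        start = int(ContinuationToken) if ContinuationToken else 0
        chunk = matching[start:start + MaxKeys]
        truncated = start + MaxKeys < len(matching)
        page: Dict[str, Any] = {"Contents": [{"Key": k} for k in chunk],
                                "IsTruncated": truncated}
        if truncated:
            page["NextContinuationToken"] = str(start + MaxKeys)
        return page
```

With a real client, something like `iter_keys(boto3.client("s3"), "my-bucket", prefix="logs/")` should behave the same way. Each request returns a bounded page, so the cost of showing one page does not have to grow with the total size of the bucket.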

But who knows what design flaws lurk beneath the console.

Curious to know what you find.

Does it happen only when opening heavy buckets, or across the entire S3 console? Does a different browser, incognito mode, or a different machine make no difference?

Yes, and sometimes even listing can take days.

I worked somewhere where someone decided that storing the Twitter Firehose in S3 was a good idea, keyed as one tweet per file.

We ended up figuring out a way to fetch them in batches and condense them. It ended up costing about $800 per hour to fix, coupled with the lifecycle changes they mentioned.

> Yes, and sometimes even listing can take days.

You have a versioned bucket with a lot of delete markers in it. Make sure you've got a lifecycle policy to clean them up.
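
As a rough sketch of such a policy: the function below builds a lifecycle configuration that expires noncurrent versions and removes expired object delete markers. The rule ID, the 30-day default, and the function name `delete_marker_cleanup_rule` are assumptions made for illustration. The dictionary keys follow the shape accepted by boto3's `put_bucket_lifecycle_configuration`.

```python
from typing import Any, Dict


def delete_marker_cleanup_rule(noncurrent_days: int = 30,
                               prefix: str = "") -> Dict[str, Any]:
    """Build a lifecycle configuration for a versioned bucket.

    Noncurrent versions are expired after `noncurrent_days`; a delete marker
    with no remaining noncurrent versions behind it is then removed.
    """
    return {
        "Rules": [
            {
                "ID": "cleanup-old-versions-and-delete-markers",  # hypothetical name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            }
        ]
    }
```

It could then be applied with something like `client.put_bucket_lifecycle_configuration(Bucket="my-bucket", LifecycleConfiguration=delete_marker_cleanup_rule())`. The retention period is worth checking against your own recovery needs before enabling anything like this.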

Doing an S3 object inventory can be a lifesaver here!
I'm curious. If you have a bucket with perhaps half a billion objects, what is the use case that leads you to wanting to navigate through it with a GUI? Are you perhaps trying to go through folders with dates looking for a particular day or something?
I have millions of objects (about 16M PDF and text files) and the console completely freezes.
In my previous company we had around 15K instances in an EC2 region, and the EC2 GUI was unusable if it was set to the "new GUI experience", so we always had to use the classic one. The new one would try to fetch all the details for every instance, so once loaded it was fast. But getting there would take many minutes, or it would just time out. I don't know if they've fixed that.
Honestly this is when most folks move to using their own dashboards, metrics, and tooling. The AWS GUIs were designed for small to moderate use cases.

You don't peer into a bucket with a billion objects and ask for a complete listing, or accounting of bytes. There are tools and APIs for that.

That's what I do with my thousands of buckets and billions of files (dashboards).

It's also the reason why some AWS product teams have started acquiring IDE- or CLI-type of start-ups. They don't want to be boxed in by the constraints of the AWS Console - which is run by a central team. For example, the Redshift team bought DataRow.

Disclosure, co-founder here, we're building one of those CLIs. We started as an internal project at D2iQ (my co-founder Lukas commented further up), with tooling to collect an inventory of AWS resources and be able to search it easily.

Product teams don't do acquisitions. And that's not why that acquisition happened.
Just checked, out of curiosity. A bucket at $WORK with ~4B objects / ~100TB is completely usable through the console. Keys are hierarchical, and relatively deep, so no one page on the GUI is trying to show more than a few hundred keys. If your keys are flatter, I could see how the console could be unhappy.
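
To make the effect of key depth concrete, here is a small sketch of how a delimiter-based listing collapses keys into "folders". It mimics the `Prefix` / `Delimiter` / `CommonPrefixes` idea from the S3 list API, but on a plain Python list. `list_level` is a hypothetical helper, not an AWS function.

```python
from typing import Iterable, List, Tuple


def list_level(keys: Iterable[str], prefix: str = "",
               delimiter: str = "/") -> Tuple[List[str], List[str]]:
    """Return (common_prefixes, object_keys) found directly under `prefix`."""
    common = set()
    objects = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Anything deeper is rolled up into a single "folder" entry.
            common.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(common), sorted(objects)
```

With deep keys such as `2021/01/02/b.json`, each level yields only a handful of entries no matter how many objects sit underneath. With flat keys, every object lands in `object_keys` at the top level, which would be consistent with the slowdown described above.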
Sort of related: I faced a similar issue when I had a GUI table that triggered a count on a large object set via SQL so it could display the little "1 to 50 of 1000000". This is presumably why services like Google say "of many". I wonder if they have a similar issue.
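
One common way around the expensive count, sketched here with SQLite: ask for one row more than the page size and use the extra row only to decide whether to print an exact total or "of many". The table name `items`, its columns, and the helper names are assumptions for the example. This is not a claim about how Google or the S3 console do it.

```python
import sqlite3
from typing import List, Tuple


def fetch_page(conn: sqlite3.Connection, offset: int,
               page_size: int) -> Tuple[List[tuple], str]:
    """Return one page of rows plus a label, without running COUNT(*)."""
    rows = conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size + 1, offset),  # one extra row tells us if more exist
    ).fetchall()
    has_more = len(rows) > page_size
    rows = rows[:page_size]
    if not rows:
        return rows, "no results"
    first, last = offset + 1, offset + len(rows)
    # On the last page the total is known for free; otherwise stay vague.
    total = "many" if has_more else str(last)
    return rows, f"{first} to {last} of {total}"


def demo_connection(n: int) -> sqlite3.Connection:
    """Create an in-memory table with `n` rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO items (name) VALUES (?)",
                     ((f"item-{i}",) for i in range(n)))
    return conn
```

The query cost then depends on the page being fetched rather than on the size of the whole table, at the price of not knowing the exact total until the last page.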
The newer S3 console works a little better. It gives pagination with "< 1 2 3 ... >".