by makkesk8 703 days ago
We moved over to Garage after about 2 years of headaches running Minio in production with ~2PB. Minio does not deal with small files very well, understandably so, since it doesn't keep a separate index of the files and relies on what is stored straight on disk. While SSDs can mask this issue to some extent, spinning rust, not so much. And speaking of replication, with Garage this just works... Minio's approach, even with synchronous mode turned on, tends to fall behind, and again small files will pretty much break it altogether.

We saw roughly a 20-30x performance gain overall after moving to Garage for our specific use case.


Quick question for advice: we have been evaluating Minio for an in-house storage deployment for ML data. This is financial data, for which we have to comply with a crap ton of regulations.

So we wanted lots of compliance features, like access logs, access approvals, short-lived (time-bound) access, etc.

How would you compare Garage vs Minio on that front?

You will probably put a proxy in front of it, so do your audit logging there (nginx ingress mirror mode works pretty well for that).
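A rough sketch of what a proxy-side audit entry could look like, assuming path-style S3 URLs (`/bucket/key`). The `audit_record` helper, the `ACTIONS` mapping, and the `principal` argument are all hypothetical names for illustration; how the proxy actually identifies the caller is not shown here.

```python
from urllib.parse import urlsplit, unquote

# Assumed mapping from HTTP method to a coarse action name (illustrative only).
ACTIONS = {"GET": "read", "HEAD": "read", "PUT": "write", "POST": "write", "DELETE": "delete"}


def audit_record(method, url, principal):
    """Build an audit entry from what a proxy can see: method, URL, caller identity."""
    # Drop the query string and leading slash, then split "bucket/key..." once.
    path = urlsplit(url).path.lstrip("/")
    bucket, _, key = path.partition("/")
    return {
        "principal": principal,  # hypothetical: supplied by the proxy's auth layer
        "action": ACTIONS.get(method.upper(), "other"),
        "bucket": unquote(bucket),
        "key": unquote(key),
    }
```

Something like `audit_record("GET", "http://s3.internal/ml-data/2024/prices.parquet", "alice")` would give one entry per request, which only covers what is visible at the HTTP layer.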
As a competing theory: since both Minio and Garage are open source, if it were my stack I'd patch them to log with whatever granularity one wished, since in my mental model the system of record will always have more information than a simple HTTP proxy in front of them.

Plus, in the spirit of open source, it's very likely that if one person has this need then others have it too, and thus the whole ecosystem grows, versus everyone having one more point of failure in the HTTP traversal.

Hmm... maybe??? If you have a central audit log, what is the probability that whatever gets implemented in all the open (and closed) source projects will be compatible?
Log scrapers are decoupled from applications. Just log to disk and let the agent of your logging stack pick it up and send it to the central location.
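A minimal sketch of the "log to disk" half, assuming the application is in Python and the agent tails a file of JSON lines. The logger name, the field names, and the `make_access_logger` / `log_access` helpers are made up for illustration; the shipping agent itself is out of scope.

```python
import json
import logging
import time


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object per line, easy for an agent to parse."""

    def format(self, record):
        entry = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Hypothetical structured fields passed via extra={"access": {...}}.
        entry.update(getattr(record, "access", {}))
        return json.dumps(entry, sort_keys=True)


def make_access_logger(path, name="storage.access"):
    """Return a logger that appends JSON lines to `path` for an agent to pick up."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these lines out of the root logger's output
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(JsonLineFormatter())
        logger.addHandler(handler)
    return logger


def log_access(logger, user, action, bucket, key):
    """Write one access record; the application knows nothing about the central store."""
    logger.info("object access", extra={"access": {
        "user": user, "action": action, "bucket": bucket, "key": key,
    }})
```

The application only ever writes to a local file here, so swapping the logging stack does not touch application code.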
That isn't an audit log.
That's very cool; I didn't expect Garage to scale that well while being so young.

Are there other details you are willing/allowed to share, like the number of objects in the store and the number of servers you are balancing them on?