Wow, I'm surprised it took AWS this long to (mostly) catch up to Azure, which had this feature back in 2015: https://learn.microsoft.com/en-us/rest/api/storageservices/u... Azure supports 50,000 parts and zone-redundancy, and append blobs work in the normal "Hot" tier, which is their low-budget mechanical drive storage.
Note that both the 10K and 50K part limits mean that you can use a single blob to store a day's worth of logs and flush every minute (1,440 parts). Alternatively, hourly blobs can support flushing every second (3,600 parts). Neither limit supports daily blobs with per-second flushing for a whole day (86,400 parts).
Typical designs involve a per-server log, per hour. So the blob path looks like: "{account}/{path}/{year}/{month}/{day}/{hour}_{servername}.txt"
This seems insane, but it's not a file system! You don't need to create directories, and you're not supposed to read these using Vim, Notepad, or whatever. The typical workflow is to run a daily consolidation into an indexed columnstore format like Parquet, or send it off to Splunk, Log Analytics, or whatever...
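To make the part-count arithmetic and the path layout above concrete, here is a minimal Python sketch. It is only an illustration: the function names are made up for this example, the 10,000 figure is assumed to be the AWS limit the comment alludes to, and no cloud SDK is involved.

```python
from datetime import datetime, timezone

# Part limits as quoted in the comment; which provider owns the 10K
# figure is an assumption of this sketch.
PART_LIMITS = {"aws": 10_000, "azure": 50_000}


def parts_needed(blob_span_seconds: int, flush_interval_seconds: int) -> int:
    """Number of appends one blob receives when flushed at a fixed interval."""
    return blob_span_seconds // flush_interval_seconds


def fits(blob_span_seconds: int, flush_interval_seconds: int, limit: int) -> bool:
    """True if the blob stays within the given part limit."""
    return parts_needed(blob_span_seconds, flush_interval_seconds) <= limit


def log_blob_path(account: str, path: str, servername: str, when: datetime) -> str:
    """Build the per-server, per-hour blob path described above (hypothetical helper)."""
    return (
        f"{account}/{path}/{when:%Y}/{when:%m}/{when:%d}/"
        f"{when:%H}_{servername}.txt"
    )


if __name__ == "__main__":
    DAY, HOUR = 86_400, 3_600
    for name, limit in PART_LIMITS.items():
        print(name, "daily blob, flush per minute:", fits(DAY, 60, limit))
        print(name, "hourly blob, flush per second:", fits(HOUR, 1, limit))
        print(name, "daily blob, flush per second:", fits(DAY, 1, limit))
    stamp = datetime(2024, 11, 22, 7, tzinfo=timezone.utc)
    print(log_blob_path("acct", "logs", "web01", stamp))
```

The first two checks pass under both limits and the third fails under both, which is why the per-server, per-hour layout is the usual choice.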
Microsoft had the benefit of starting later and learning from Amazon's failures and successes. S3 dates from 2006.
That being said, both Microsoft and Google learned a lot, but each also failed to learn different lessons.
GCP has a lovely global network, which makes multi-region easy. But they spent way too much time on GCE and lost the early advantage they had with Google App Engine.
Azure is severely lacking in security (check out how many critical cross-tenant security vulnerabilities they've had in the past few years) and in reliability (how many outages have there been due to a single DC in Texas failing? Availability zones still aren't the default there).