Backblaze Drive Stats for Q3 2022 (backblaze.com)
115 points by caution 1329 days ago
11 comments

> For all three, it seems their spindles, actuators, and media are starting to wear out after seven years or so of constant spinning.

Seven years is indeed a lot of revolutions if they're constantly at 5,000 RPM or more.
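To put a rough number on that, here is a minimal back-of-the-envelope sketch. It assumes the drive spins non-stop at a constant 5,000 RPM for the whole seven years and ignores leap days; `revolutions` is just an illustrative helper name.

```python
def revolutions(rpm: float, years: float) -> float:
    """Total platter revolutions for a drive that never spins down."""
    minutes_per_year = 60 * 24 * 365  # leap days ignored
    return rpm * minutes_per_year * years


total = revolutions(rpm=5_000, years=7)
print(f"{total:,.0f} revolutions")  # roughly 18.4 billion
```

A 7,200 RPM drive would scale that figure up proportionally.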

Are those drives really live for all that time?

I'd expect that Backblaze's business model requires lots of storage and not a lot of access/transfer - lots of write once/read never data. Their storage pods have dual 10Gb network ports, but with 60 drives in a box, each capable of 6 Gb/s transfer speeds, you can only fully use about 4 drives at a time. Are the other 56 drives spinning at full speed all the time, or parked?

My understanding is that repeatedly stopping and starting the drives is far harder on them than keeping them running all the time.
Generally, it is. The initial spike of electricity does the most damage. In addition, a sitting drive can eventually see the lubricants settle and stick. Once the drive is spinning, there is virtually zero friction wear. That said, a drive spun up once every several months will likely have a very long life for storage as long as the lubricants were manufactured correctly. I still have old ~500 MB drives that still work. The main advantage of having them run 24/7 is you are likely to register warnings before the drive completely dies, giving you an opportunity to decide how to deal with it. A sitting drive can simply not start up again. Plan accordingly.
Do we know of any HDDs that are tuned for longevity, ignoring energy usage, speed, latency, etc.?
Couldn't you implement a strategy of replication and tiering to reduce wear across all drives? Some drive groups could be put to sleep and an infrequent event, like the failure of another node or disk group, would then cause these drives to spin up.

I know it’s a little different, but AWS Glacier doesn’t keep all data “hot”.

That's why I never turn my machine off: https://i.imgur.com/lHyscGS.png
I am not familiar at all with Backblaze's storage software, but I am fairly familiar with Ceph. You are not going to get 6Gbps out of a spinner. Maybe 1.5 or 2, if it is a fairly fast drive and the data is all sequential. Plus I believe they use SAS expanders, not multiple HBAs, so the 6Gbps bus is divided further. So they probably can saturate maybe half the drives at a time, if not more, given their bandwidth. They would probably keep them spinning: spin-up is pretty power intensive, and it is physically hard on the drives.
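For anyone who wants to play with the numbers in this subthread, here is a rough sketch of the arithmetic. It assumes both 10Gb ports carry data at once and ignores protocol overhead and SAS expander sharing; the per-drive rates are the figures quoted above, not measurements, and `drives_to_saturate` is an illustrative name.

```python
def drives_to_saturate(link_gbps: float, per_drive_gbps: float) -> float:
    """How many drives streaming flat out would fill the network uplink."""
    return link_gbps / per_drive_gbps


uplink_gbps = 2 * 10  # dual 10Gb ports, assumed usable together

# 6 Gb/s is the interface speed; 1.5 to 2 Gb/s is the sequential
# throughput guessed at for a fast spinning drive.
for rate in (6.0, 2.0, 1.5):
    n = drives_to_saturate(uplink_gbps, rate)
    print(f"{rate} Gb/s per drive -> about {n:.1f} drives fill the uplink")
```

The answer depends heavily on which per-drive rate you believe, which is most of the disagreement here.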
I notice in the pictures they're still using their Pod storage systems for imagery. Didn't they switch over to a Supermicro chassis recently? If so, I'm curious what they settled on for density: 45, 60, or 90 drives? Servers or JBODs, etc.?

Over the years, 45drives has put an extreme premium on their systems compared to Supermicro. We have a half dozen or so of the Storinator chassis and we've switched back to Supermicro high density storage systems, including some of their top load 90 bay JBODs. It's been more cost effective for us; I'm wondering if that's the same for Backblaze.

I've been a backblaze customer for around 10 years, it's been great for home systems. Always love this transparency in their reports.

It was Dell actually! https://www.backblaze.com/blog/next-backblaze-storage-pod/ (full disclosure I work for Dell)

They don't state exactly what they're using, only that they're "Dell Servers" which honestly aren't the densest things in the world. I would've thought they'd go for something like the PowerVaults which can fit 84 drives into 5U, which is a little bit denser than their current storage pod configuration (60 drives in 4U).

I did recently do a comparison with our low end servers compared to a JBOD array with the same amount of storage and the uplift for the server configuration was about 10%, give or take. That was limited to a total of 24 drives in 2U however, so not sure how that'd compare to the larger PowerVault arrays or JBODs.

I've been a customer of Backblaze as well for about 6 years now and agree, it's been brilliant for backing up my home network! The reports helped me choose the right drives numerous times over, love their work.

I love that bb is publishing this information but it may result in driving up the cost of the most reliable hdds which ends up costing them more with future drive purchases assuming they gravitate to buying more reliable drives in the future.
> it may result in driving up the cost of the most reliable hdds which ends up costing them more

I assumed publishing the stats would actually work to BackBlaze's advantage, because it means no HDD supplier is going to skimp on quality control on the order going to BackBlaze.

Unless the higher price commanded by more reliable disks causes other manufacturers to improve reliability, thus making the pie larger.

Since they're always trialing other disks on small scale, this can only really benefit both them and humanity as a whole!

Remember that cloud providers get drives in bulk at lower cost than the consumer.

Also I think they know this and still post it for the community benefit. What people store in the cloud is important and drive failure is their business because they are storing that data.

Also the cloud providers and hyperscalers are extremely well equipped to handle drive failures without data loss, so they can afford to buy the cheapest even if it's less reliable. If you just have one drive, or even a pair, the calculus changes.
BB has been doing this for many years. Nobody has complained about this affecting the prices yet.
Well, people knew about HGST even before BB's data was available.
226,697 drives. I wonder how many drives are used by AWS' S3 and the other big cloud storage vendors.
Yev here from Backblaze -> my guess? More :D
That's kind of obvious - I was wondering how many orders of magnitude more :-).
That I can't say. Part of why we started doing this was so others could chime in and we could compare notes but that hasn't really been the case and they continue to be locked silos for the most part. Maybe one day!
Looks like they have been moving away from Seagate to Toshiba in the last two years. I assume it's because of Seagate's quality issues.
In the past, they've said that availability and price are their biggest factors, as long as reliability is ok. Their architecture allows for a pretty high failure rate, especially if pre-failure indicators allow them to migrate data from the drive before it fails.

Moving to any particular drive probably indicates bulk purchasing availability at reasonable prices more than anything else.

They go over this in the article: below a certain failure threshold, which they don’t appear to have hit, it’s still cost effective to go with the less reliable drive if it’s cheaper.
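The trade-off can be sketched as an expected-cost comparison. Every number below is hypothetical and chosen only for illustration (none comes from the article), and the model assumes a constant annualized failure rate (AFR) and a flat cost per failure.

```python
def expected_cost(price: float, afr: float, failure_cost: float, years: float) -> float:
    """Purchase price plus expected failure handling cost over the service life.

    afr is the annualized failure rate as a fraction (0.02 means 2%).
    failure_cost is a flat, hypothetical cost per failure (replacement plus labor).
    """
    return price + afr * years * failure_cost


# Hypothetical drives: a cheaper one with double the failure rate.
cheap = expected_cost(price=250, afr=0.02, failure_cost=300, years=5)
reliable = expected_cost(price=300, afr=0.01, failure_cost=300, years=5)
print(f"cheap: {cheap:.2f}  reliable: {reliable:.2f}")

# The cheaper drive wins while its extra AFR stays under this gap.
break_even_gap = (300 - 250) / (5 * 300)
print(f"break-even AFR gap: {break_even_gap:.2%}")
```

With these made-up inputs the cheaper drive still comes out ahead; a high enough failure cost or AFR gap flips the result.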
Thanks for this. I just bought new 14tb drives for my synology nas, and I looked at your data when I did so.

Separately, my nas running RAID 10 worked as it was supposed to: I hot swapped the drives one at a time over several weeks, and went from 7tb storage to something like 24tb with no downtime or hassle. The device has been sitting in my basement for years, and it was cool to see the raid technology work as intended after all this time.

Hi @ttcbj I am on the market for a dedicated NAS device, and have heard good things here and there about the Synology brand. Would you mind sharing which model you have been using? Thanks!
You buy Synology for their Software. Otherwise their Hardware are comparatively weak for the same price.

One thing to note about Synology is that they don't update the Linux kernel over the lifetime of a product. I wonder whether they are changing this, considering that products sold not long ago shipped with a very outdated kernel.

And considering the amount of polish and fixes inside BTRFS, you might want to look into it if it could be a problem for your use cases.

Generally speaking it is good enough for most consumers.

I am the OP, and I have a DS 918+ and I have been very happy with it. But I am not a hardware expert, so I don't claim to know what hardware you should buy.

What I like about synology is the operating system/software. You can connect to it with a web browser, and it has something akin to a traditional operating system interface that lets you control all its functions.

I have found it very good as a file server for business/home office use, and as a backup target for multiple mac computers using time-machine. It also has very good online support, both from synology itself, and lots of freelance youtube/web help.

The other thing I personally really like is its "hyper backup" feature, which replicates all your NAS files (or those you choose) to the cloud, so they can be restored later.

I have also found it to be totally trouble free. I just installed the drives (they have a list of compatible ones), choose my RAID, and it has worked seamlessly for many years (I have a DS218+ as well that has worked well). It sends me emails reporting drive health, confirming backup completion, etc.

Not OP but the biggest thing to look at amongst the models is Network connectivity speed requirements, Storage requirements (Size and Number of drives), and if you plan on running containers / applications on the nas.

While pricier, the DS18xx range is the most common I've seen (depending on your budget). If you plan on getting one used off eBay, look out for the 1815+ and others in the xx15 generation; they had faulty Intel Atom chips that ended up hosing the systems, sadly.

I've got a couple of DS1817s with the DX517 expansion chassis added on (total of 18 drives). They've been great for storing movies / media but unfortunately they're a bit low powered for transcoding.

Thanks very much!
Also not OP, but another yes to Synology - I have also found them to be worth it. I've been running mine (DS1517+) for 5.5 years continuously now.

I wirelessly back up my Macs and then use Backblaze B2 for offsite backups of that. I've found by combining this with the snapshot features of btrfs that this has been reliable.

My Synology is wired to my home cat 6, and the only thing stopping me from getting 10GbE to the devices that need it most is the stubbornly high prices of 10GBASE-T switches (I keep hearing it has something to do with heat (?)).

If you choose to mirror your drives, you can achieve some really good read speeds, which is nice.

This is great to hear; thanks for sharing!
Hi, I would just like to point out that a PC/storage server that you build yourself, running TrueNAS or something is definitely an option and might be cheaper and better.
I sort of do this now...well, not trueNas...but a regular (though solidly running) PC...which has been on xubuntu for years, and i simply have a neat set of samba shares. I guess i was looking for something smaller, quieter, but mostly less power-sucking. Sure, i guess i can try and cobble something together out of NUCs, or somesuch small form factor PC...but was thinking about a dedicated NAS...but yeah, hardware concerns are a thing for me. But, right now, i still am leaning towards researching a dedicated, purpose-built NAS device.
Any predictions on when SSDs will take over from HDDs for storage? From what I understand, <$50/TB has been reached for SSDs. With higher reliability and lower electricity consumption, I am wondering at what price it will be more economical to go to SSD completely.
In some market segments this has already happened - laptops, desktops, fast servers. SSD prices came down substantially. They are predicted to go down further, but I don't think they will surpass HDDs at gigabytes/dollar. Flash manufacturers won't go as low with prices; they will rather restrict supply.

HDDs will continue to be used for big storage. They are cheaper per GB and power usage is not that much worse on average (only spinups are energy demanding). Looking forward to 100TB HDD drives.

They must be getting Seagate drives at a good price, since they keep buying them despite the highest failure rates every quarter.
It seems that Seagate's reputation still has not changed.
I recently got a Seagate SSD, I hope it'll be fine
The only part of a Seagate consumer SSD that is designed by Seagate is the sticker.
Still weird to me that they don't have any "escape hatch". WD has sandisk, but Seagate is basically pure HDD as far as manufacturing goes. I guess I'm too bearish on the long term future of hard drives but it still doesn't look promising.
TLDR: Only buy HGST.
I got curious how much storage this actually is... 2.6 Yottabytes, which is 2.6 million terabytes!
A million terabytes is an exabyte, not a yottabyte.
Sorry, you’re right, it’s 2.6 exabytes; I was going by 10s, not 1000s. Can’t edit the above comment anymore. :/
terabyte -> petabyte -> exabyte -> zettabyte -> yottabyte in factors of 1k
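A quick way to sanity-check the ladder above is a small sketch like this, assuming decimal (SI) prefixes where each step is a factor of 1,000. `PREFIXES` and `convert` are illustrative names.

```python
# Decimal (SI) prefixes, each 1,000 times the previous one.
PREFIXES = ["tera", "peta", "exa", "zetta", "yotta"]


def convert(value: float, src: str, dst: str) -> float:
    """Convert a byte count between two prefixes on the ladder."""
    steps = PREFIXES.index(dst) - PREFIXES.index(src)
    return value / (1000.0 ** steps)


# 2.6 million terabytes is two steps up the ladder from "tera".
print(convert(2.6e6, "tera", "exa"), "exabytes")
```

Going all the way to yottabytes would take two more factors of 1,000, so 2.6 million terabytes is only a few millionths of a yottabyte.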