by shiftpgdn 3439 days ago
My day job is working as an HPC Sysadmin on a decent sized supercomputer with petabyte scale storage for a private consulting company. I spend a lot of time dealing with and thinking about storage and honestly I don't think mechanical disks are long for this world.

For our next storage expansion it's ALMOST worth ditching storage tiering and going to an all flash/SSD configuration. There is so much hassle involved with mechanical disks relative to SSD. SSDs are by no means perfect but I don't have a steady stream of SSDs being pulled out of production due to mechanical failures.

5 comments

You can buy HDDs for $20 / TB. There are companies with millions of TB. Unless SSDs can meet that price point, HDDs have a very comfortable place in society.
It's not that simple. How much do you pay to keep those HDDs powered per TB per year? How much does maintenance cost (replacing drives etc.)? How does the low IOPS of those drives affect your workload?

SSDs may not win in every area yet, but if you only look at purchase price, you're not getting the right picture.

> How much does maintenance cost (replacing drives etc.)?

The maintenance on enterprise storage is generally a percentage of purchase price. So it's actually cheaper.

>How much do you pay to keep those HDDs powered per TB per year?

4.5 watts idle/8 watts max for a spinning drive vs. 4.5 watts idle/11 watts max for a large capacity SSD (15TB Samsung). The power consumption thing was a much better story when comparing against 3.5" 15k RPM drives. With 7200 RPM drives it's basically a wash unless you're talking about relatively small capacity SSDs.

>How does the low IOPS of those drives affect your workload?

That's really the crux of the issue. SPINNING drives are not dead. FAST spinning drives are dead. 10k/15k drives are going to see the end of their useful life in the modern datacenter far faster than anyone predicted 2 years ago. Outside of legacy systems I would expect sales of 10k RPM drives to fall off a cliff if not completely disappear before the end of 2020.

Thanks. To add another layer to this, presumably SSD read/write is faster so there would be less time at max usage?
On paper, yes. But it really depends on the workload. At the end of the day, if you're architecting appropriately it's apples and oranges. SATA/NL-SAS drives excel at large streaming workloads - think video rendering, storing large ISOs, video surveillance, database dumps.

SSDs are for highly transactional workloads like databases or most back-end systems for applications or virtual machines.

You RARELY see the two used for the same type of workload unless someone has money to burn and wants to standardize on SSD and doesn't care about cost. SATA/NL-SAS being used for a workload that should be on SSD generally results in someone getting fired and the original system being forklift replaced.

Yes, but when you are looking at a petabyte-scale system, unless the data is hot and IOPS are a concern, HDD still wins given it is 10x cheaper.

And NAND has already hit the point on the curve where it isn't going to get cheaper every year. NAND prices are actually on the rise. Smaller nodes are now actually more expensive, and multiple layers are hard to yield.

So relatively speaking, the 10x gap between HDD and SSD won't change in the next 5 years or so.

It is not 10x cheaper, it is only cheaper to purchase initially.

It is cheaper to power (NAND is more energy efficient than an electric motor), it is cheaper to maintain (solid state media does not suffer mechanical failures), and it is cheaper to use (each query on an SSD takes slightly less time than on spinning media).

It's only 10x cheaper if you ignore those facts. Now, how you value those factors may vary. Also, I have been involved in enough purchasing decisions to know that while capex is easy to approve and opex is hard, the initial number is surprisingly important.

I think you overestimate the cost of electricity over the expected life of a drive. These aren't space heaters. Pennies per day, adding up to perhaps tens of dollars difference between the two options.
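To put rough numbers on the "pennies per day" claim, here is a minimal sketch that uses the max-draw figures quoted upthread (8 W for a 7200 RPM HDD, 11 W for a 15TB SSD). The electricity rate of $0.10/kWh is an assumption for illustration, not a figure from this thread, and it ignores cooling and PSU overhead.

```python
# Rough electricity cost of one always-on drive.
# ASSUMPTION: 0.10 USD per kWh is a placeholder rate, not a figure from the thread.
HOURS_PER_YEAR = 24 * 365


def annual_energy_cost(watts, usd_per_kwh=0.10):
    """Return the yearly electricity cost in USD for a constant power draw."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * usd_per_kwh


# Max-draw figures quoted upthread: 8 W (7200 RPM HDD), 11 W (15TB SSD).
hdd_cost = annual_energy_cost(8)
ssd_cost = annual_energy_cost(11)

print(f"HDD: ${hdd_cost:.2f}/year (${hdd_cost / 365:.3f}/day)")
print(f"SSD: ${ssd_cost:.2f}/year (${ssd_cost / 365:.3f}/day)")
print(f"5-year difference: ${abs(ssd_cost - hdd_cost) * 5:.2f}")
```

At that assumed rate, either drive costs a couple of cents per day, and the gap over five years is on the order of ten dollars or so per drive.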
I see, maybe you should tell Backblaze to switch over to SSD for their business model?
Backblaze is in the backup industry, where fetch times don't really matter. When you have customers sensitive to app response time buying stuff, or engineers limited by how many times they can go through the edit-test cycle in 8 hours and "test" relies on how fast your media responds, then it matters.
> the 10x gap between HDD and SSD

It's a 4 to 6x gap [1].

[1] http://www.pcmag.com/article2/0,2817,2404258,00.asp

1TB 2.5" HDD? You need to compare the sweet spot for SSD and HDD (3.5"). HDDs easily outpace SSDs there, and 10x is only a generalization; in fact, given that the price of SSDs is rising and the price of HDDs is slowly dropping, the gap between the two at the sweet spot is actually edging close to 20x soon.
Funny, the 8TB drives I was looking at for a new NAS in a few months are around $300... I'm still debating between a 4- or 5-drive NAS... but let's say, including the cost of the NAS device, I'm paying around $2k for a 24TB redundant storage solution. 24TB of SSD storage alone will cost me over $6k, let alone the cost of a more expensive base NAS box, meaning $7-8k. That's upwards of 4x the cost.

That's the difference between something I can cover from my tax return, to something I won't really even consider. I won't save $6k in power in a year, or 5 to make up the difference, and I don't need the extra speed, to feed media to my htpc.
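The arithmetic in this comment can be sketched as follows. The dollar totals and the 24TB capacity are the commenter's own estimates; the annual power saving used for the payback calculation is a made-up placeholder, labeled as such in the code.

```python
def cost_per_tb(total_usd, capacity_tb):
    """Return the cost in USD per TB of usable capacity."""
    return total_usd / capacity_tb


def payback_years(premium_usd, annual_savings_usd):
    """Years of running-cost savings needed to recoup an upfront premium."""
    return premium_usd / annual_savings_usd


# Figures from the comment above.
capacity_tb = 24
hdd_total = 2000                              # NAS box plus 8TB HDDs
ssd_total_low, ssd_total_high = 7000, 8000    # "7-8K" for the SSD build

ratio_low = ssd_total_low / hdd_total
ratio_high = ssd_total_high / hdd_total

print(f"HDD build: ${cost_per_tb(hdd_total, capacity_tb):.2f}/TB")
print(f"SSD build: ${cost_per_tb(ssd_total_low, capacity_tb):.2f}"
      f" to ${cost_per_tb(ssd_total_high, capacity_tb):.2f}/TB")
print(f"SSD/HDD cost ratio: {ratio_low:.1f}x to {ratio_high:.1f}x")

# ASSUMPTION: 50 USD/year in power savings is hypothetical, for illustration only.
hypothetical_annual_savings = 50
premium = ssd_total_low - hdd_total
print(f"Payback: {payback_years(premium, hypothetical_annual_savings):.0f} years")
```

Even with a generous guess for the power saving, the payback period is far longer than the life of the drives, which is the commenter's point.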

I wouldn't use those 8TB shingle drives for a NAS. In fact, that use is specifically mentioned as being not under warranty if I remember correctly.

Also, if (when) a drive fails and you have to resilver 8TB of data on one of those drives, the slow write speed will kill you.

The WD Red drives are expressly for NAS usage. Also, I'm aware of the slow redistribution of data. Generally there's only 0-2 connections on the NAS I have now, I just want more room. It's mostly BD/DVD rips, so if it dies, I can recover, it just takes a while. Reduced performance for a few days or a week isn't a huge deal for me.
> The WD Red drives are expressly for NAS usage.

Oh, I was thinking about the cheaper Seagate "Archive" drives which have really slow write speed because of that shingled technology.

> Reduced performance for a few days or a week isn't a huge deal for me.

It could be a huge deal when another drive fails during that week while the array is rebuilding. But if you have more hot spares it's no big deal.
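For a feel of how long the array stays exposed during a rebuild, here is a back-of-the-envelope sketch. The two sustained write speeds are hypothetical placeholders, not measured figures for any particular drive, and a real resilver also depends on array load and layout.

```python
def rebuild_hours(capacity_tb, write_mb_per_s):
    """Hours to write a full drive at a sustained sequential rate (decimal units)."""
    capacity_bytes = capacity_tb * 1e12
    seconds = capacity_bytes / (write_mb_per_s * 1e6)
    return seconds / 3600


# ASSUMPTION: both write speeds below are hypothetical, chosen only for illustration.
scenarios = [
    ("hypothetical slow shingled drive", 30),
    ("hypothetical conventional drive", 150),
]

for label, speed_mb_s in scenarios:
    hours = rebuild_hours(8, speed_mb_s)
    print(f"{label} at {speed_mb_s} MB/s: {hours:.1f} h ({hours / 24:.1f} days)")
```

The window during which a second failure can take out the array scales inversely with the sustained write speed, so a drive that writes five times slower leaves the array degraded five times longer.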

So if we're at only 3x the cost for SSD, how many years until HDD cost parity is reached? 5 years? This seems like a sensible guess to me.
Process shrinks are increasingly expensive, so cost-wise, I wouldn't expect it for another 10+ years. Also, what if you need storage today?
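Both guesses in this exchange follow from the same compounding formula with different rates. A minimal sketch, assuming the SSD price per TB falls by a fixed fraction each year relative to HDD: the rates below are hypothetical inputs, not forecasts.

```python
import math


def years_to_parity(price_ratio, relative_annual_decline):
    """Years until an SSD/HDD price ratio reaches 1.0.

    Assumes the ratio shrinks by a constant fraction every year, i.e.
    ratio_after_n_years = price_ratio * (1 - relative_annual_decline) ** n.
    """
    if not 0 < relative_annual_decline < 1:
        raise ValueError("relative_annual_decline must be between 0 and 1")
    if price_ratio <= 1:
        return 0.0
    return math.log(price_ratio) / -math.log(1 - relative_annual_decline)


# ASSUMPTION: the decline rates are hypothetical; only the 3x ratio is from the thread.
for rate in (0.10, 0.20):
    print(f"{rate:.0%} relative decline per year: "
          f"{years_to_parity(3, rate):.1f} years to parity")
```

Under this model, a 3x gap closes in about five years only if SSDs gain roughly 20% on HDDs every year; at 10% per year it takes a bit over ten.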
A spinning disk iirc draws something like 15w. That's essentially true whether you've got a 1TB drive or an 8 TB drive. And it's only that high while the drive is spun up.

They are bulkier, but ultimately an SSD is going to need space and servers too.

It's not a competition at this point. SSDs are good if you need low latency or high IOPS, HDDs are good if small delays and low IOPS is acceptable, and tape is good if you don't mind waiting several minutes to get the data.

TCO drops significantly from SSD to HDD, and significantly again from HDD to tape. And a large, modern datacenter will frequently have massive amounts of all three.

We're talking about a ~10x difference right now. Do your factors sizeably reduce that gap? By how much do you estimate?
Are SSDs better per watt? I saw no difference for laptop models in past years. I haven't checked recently though (non-SATA devices etc.).
No spinning parts. They do use less energy, but not enough to make HDDs more expensive than SSDs on TCO.
Is that TCO? Because unless you're comparing with warranty, power, and human costs, it's not accurate.
You are correct in that hard drives are still an order of magnitude cheaper per unit of storage than solid state drives. Then again, tape drives are still an order of magnitude cheaper per unit storage than hard drives. When was the last time you saw a tape drive?
The tape industry is still billions of dollars per year. Enterprises still buy tape all the time, and tape technology is still iterating.

And tape is not even that competitive. It's like 1/3 the cost of HDDs, and has an 80 second seek time, making it completely unusable for most real time applications (still usable for theatrical movies).

Tapes are still used for enterprise backup, and for hierarchical storage systems (e.g. hot data on SSD, warm data on HDD, cold data migrates to tape eventually).
>tape drives are still an order of magnitude cheaper

Are they though? I just searched on Amazon and the cheapest LTO6 drive was $1,619.56.

Even just the tapes are $25 or so for "(2.5TB) native to (6.25TB) compressed" whatever that means which I guess is cheaper per TB if you don't worry about the drive.

https://www.amazon.com/Quantum-Ultrium-6-Drive-Height-Intern...

https://www.amazon.com/Sony-Linear-0-85-Inch-Internal-LTX250...
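Whether tape is cheaper depends on how much you store, since the drive is a fixed cost. A rough break-even sketch using the prices quoted in this comment and the $20/TB HDD figure from the top of the thread. It mixes numbers from different commenters and ignores libraries, labor, and compression, so treat it as illustrative only.

```python
def break_even_tb(drive_usd, tape_usd_per_tb, hdd_usd_per_tb):
    """Capacity at which tape (drive plus cartridges) costs the same as HDDs."""
    if hdd_usd_per_tb <= tape_usd_per_tb:
        raise ValueError("tape media must be cheaper per TB than HDD to break even")
    return drive_usd / (hdd_usd_per_tb - tape_usd_per_tb)


# Prices quoted in the thread; native (uncompressed) capacity only.
lto6_drive_usd = 1619.56
tape_usd_per_tb = 25 / 2.5      # about $25 per 2.5TB native cartridge
hdd_usd_per_tb = 20.0           # "$20 / TB" figure from an earlier comment

break_even = break_even_tb(lto6_drive_usd, tape_usd_per_tb, hdd_usd_per_tb)
print(f"Tape media: ${tape_usd_per_tb:.2f}/TB")
print(f"Tape beats HDD above roughly {break_even:.0f} TB")
```

Below that capacity the drive cost dominates and HDDs win; above it, the cheaper cartridges start to pay off.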

Where do you think Amazon Glacier data are stored?

There's no point storing that on a high-IOPS, always-on medium, like an SSD or even HDD.

Tapes are big, and still make complete sense for backup purposes, especially if you want to maintain latest 10-15 backup copies to choose from.

Not likely tape. The evidence points to BDXL.

https://storagemojo.com/2014/04/25/amazons-glacier-secret-bd...

The film industry commonly uses tapes for digital storage of their media. I've been told (but don't know this myself) that government uses tape for long term data storage as well.
I'm in film industry. Everything is baked onto LTO tapes. Everything. Three copies, three locations for backup. Always.
... under my desk.
Two days ago, when I toured a datacenter.
I think back to when HDD's were "getting cheaper" at $1/MB (and every power of 10 since at that same price point) and shrug. While they'll certainly still have their place (similar to tape drives and other storage technology that used to be much more commonplace), there's every reason to expect the PPU of SSD's to continue trending downward.
> there's every reason to expect the PPU of SSD's to continue trending downward

Not really. The big driver for cost reduction in SSDs was process shrinks, but the latest processes are actually more expensive per-transistor than older ones so that ship has more-or-less sailed. You've got 3D NAND but that's even more expensive to manufacture so the cost savings are marginal at best. It remains to be seen if XPoint will lead to big savings.

In the short term SSD prices have actually been trending up due to a shortage of NAND flash.

To my point, this is still the same argument that has been made time and time again with regards to storage. Short term upticks are easy to see in historical SSD pricing, even when looking at the longterm trend.

Source: http://www.jcmit.com/mem2015.htm

If you think this is the same you haven't been paying attention to what's happening in the industry. Moore's Law as we know it is dead, shrinking the node size isn't cutting costs like it used to. A lot of people are still in denial over this, but it's going to have major repercussions in the IC industry.
I'm not referencing Moore's Law at all, nor am I hinging my argument on a single piece of input. I'm simply using decades of historical data to claim that it's more likely to continue downward than upward. Would you mind providing some data points that explicitly or implicitly predict an upward trend over the next 5-10 years?
That is not true. If your content has a really long tail then you likely also have tiered storage, where hot content is stored in the page cache and NAND (near storage) and much less frequently accessed content is on HDD (far storage).
All the high end and mid-range laptops have already dropped the spinning drive (unless they have a secondary drive for bulk storage); the user experience is just so much better. It's only the low-end laptops that still use spinning drives, because even a 1TB drive is technically cheaper than a reasonably sized SSD.

I took a look at the NAND chip trend lines, and I predict that by mid 2018, 256GB of NAND (which is enough space for an average consumer) will cost less than a 1TB drive. At that point, all the cheap laptops will drop spinning drives too.

Actually the very low end is all 32gb emmc now, I assume those are super cheap and Windows 10 is okayish with that space so the manufacturers went for it.

But then you jump to mechanical disks to get above 32 until like you said mid-range where it's all proper SSDs.

I bought one of those for Christmas that only had 16gb. 14 of those were taken up by win10! Win10 ran reasonably well actually, but I switched to lubuntu for the storage.
This sounds like we could go back to the workflow of pre hard drive systems. Windows on the internal drive, and all user file storage on SD cards.
I think until SSDs can hit the price point of an HDD at the same capacity, HDDs will probably stay around, at least in the consumer space.

A 2TB hard drive is just way too cheap compared to a 2TB SSD, even if they fail and need more power, at least for archiving or large storage in consumer terms.

Primary storage may switch to SSD though.

I'm with you, but I believe we will see more and more huge user data sets being stored on clouds. So maybe the future will be a hybrid model between SSD local drives (for day to day files) and cloud for mass user storage (like photos, videos, backups, and so on).
People keep suggesting that the cloud will take over, but for most people the time required to upload all their 4k christmas videos to the cloud will be the limiting factor.

If everyone had fiber, sure...

Recently had a small cluster (12 nodes) where close to half the SSDs used for the root filesystems failed in a span of a few months.

Turned out to be a defective batch from manufacturer. SSDs are normally reliable but in my experience are not immune to problems.

Is correlated risk an area where HDDs have a fundamental advantage over SSDs? Un-diversified RAID arrays were notorious even when spinning rust was the only contender.
Perhaps this is old information, but do SSD's from the same batch/model not tend to fail all within the same timeframe? A steady stream of failures is probably preferable to a single mass failure.
Always mix batches almost regardless of what you buy. HDDs can and do fail like that too. I learned that the almost-hard way back in the day of the IBM Death Star [1], where we had a RAID that started failing, one drive at a time, roughly a week apart. We had reasonably up to date backups, but couldn't afford to take everything offline, so we were biting nails for weeks of continuous RAID rebuilds and reduced performance, and thankfully everything stayed up and we didn't get any additional failures during any of the rebuilds.

[1] https://en.wikipedia.org/wiki/Deskstar

Finding different batches can be difficult, especially if like most companies you buy your kit from one or two approved vendors.

You often hear the trope "don't mix different disk brands in RAID". Does anyone know if that's true?

They can. I had a set of Crucial SSDs which all contained the same firmware defect which took them offline after X hours of power-on time.

I also had a RAID 1 array where both SSDs failed within a couple days of each other (due to wear). That was a rude surprise. They were only six months old.

I used to write SSD firmware (not Crucial though!) and new code can always be buggy despite our best effort to test it thoroughly. However, many of the SSD companies have carried their firmware through multiple generations now and the code has matured during that time, so I expect this to be less of an issue. The bigger issue now will be a process shrink resulting in a NAND issue that has not been identified before and so is not properly mitigated by the controller/firmware, so I personally wait a year for a product to mature before buying it.
Ah yes, the 5200 hours bug. That was fun to read about only a few days after plugging in my Crucial M4.
I went through 3 different Crucial M4s, all replaced under warranty due to unrecoverable read errors and subsequent data loss.

That model really had some serious issues at the time.

The benefit of SSDs is that you can almost exactly predict when an SSD will fail.

You know the read and write IOPS. You know when they fail.

I don't know about that. Every SSD that has failed for me did so quite unexpectedly and prematurely. I haven't yet taken an SSD to its write limit, and none of them have survived more than a few years except the very first Intel 80GB gen 1.
I had my first Intel 32/40gb fail just outside warranty (originally 1 year iirc). But I was so addicted to the speed difference I kept with them... The price difference isn't bad for laptop/desktop use now. But man, can't even consider it for my nas.
I've never had any storage medium fail on me, and I feel like I do a lot of read/writes.

Exception is a flash drive I snapped in half once. Oops.

Never had a failure on any of my personal laptops or PCs. I see it all the time though at work on our servers. I think what an individual considers a lot of read/writes is a drop in the bucket compared to what servers do.
Unless it's the controller or firmware that goes bad... I once had three SSDs from the same batch (in different RAID sets, thankfully - we mix and match) go within a week and all had completely garbled SMART data.

But overall I prefer SSDs - just mix and match different models/manufacturers/batches in different RAID sets as for HDDs.