by ChuckNorris89 1496 days ago
I did not see any temperature measurements of the controller though in this article so the test doesn't feel very scientific to me.

It's not just thermal throttling of the controller that causes slowdown, it's also the filling of the DRAM/SLC cache.

Also, we should talk more about some vendors screwing over customers by replacing controllers or NAND chips with slower parts to cut down cost while keeping the same SSD SKU, after seeding the original SKUs to reviewers to lock in good benchmark scores in online tests. This is why I recommend only buying SSDs from reputable vendors/OEMs who are more vertically integrated: Samsung, WD, Sandisk, Micron.

7 comments

The tested SSD indeed has an SLC cache which likely filled and the author misinterpreted the cause of the throttling as thermal.
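One rough way to check the cache-fill explanation is to look at how much data had been written when throughput first dropped, and compare that against the drive's advertised SLC cache size. A minimal sketch, assuming throughput was logged once per fixed interval during a sequential write; the function name, the 50% drop ratio, and the sample trace are illustrative assumptions, not values from the article:

```python
def written_before_drop(mb_per_s, interval_s=1.0, drop_ratio=0.5, baseline_n=5):
    """Return MB written before throughput first falls below
    drop_ratio * baseline, or None if it never does.

    mb_per_s: throughput samples in MB/s, one per interval_s seconds.
    The baseline is the mean of the first baseline_n samples.
    """
    if len(mb_per_s) < baseline_n:
        raise ValueError("need at least baseline_n samples")
    baseline = sum(mb_per_s[:baseline_n]) / baseline_n
    written_mb = 0.0
    for rate in mb_per_s:
        if rate < drop_ratio * baseline:
            return written_mb
        written_mb += rate * interval_s
    return None


# Hypothetical trace: 20 s at 2000 MB/s, then 10 s at 500 MB/s.
trace = [2000.0] * 20 + [500.0] * 10
print(written_before_drop(trace))
```

If the drop lands at about the same written volume no matter how hot the drive is, a full cache is the more likely cause.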
The only meaningful way to prove thermal throttling (in any part) is to cool it down and see if that corrects the problem. This author did not do that. Correlation is not causation.
And cooling down in this case has to be accomplished by something other than letting the drive sit idle for a long time, because that same idle time can be used by the drive to empty the cache.
Letting it cool down and cooling it down are different things. You have to run it at load, let it get hot, then cool it down without removing the load. Do nothing else but spray some liquid N2 at it and watch to see if it speeds up.
> You have to run it at load, let it get hot, then cool it down without removing the load

Sustained load testing can reveal multiple phase changes in performance unrelated to temperature, which can complicate results from a single run (e.g., what if you run out of spare blocks before the drive has cooled below the hysteresis threshold to disable throttling?). So multiple independent runs, each starting from the drive in the same state and varying only the cooling method, are the most controlled and reliable methodology.
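The comparison between runs can be sketched as follows. This assumes two throughput traces recorded with the same workload and the same starting drive state, differing only in cooling; the function names, the 50% drop ratio, and the tolerance are hypothetical choices for illustration:

```python
def first_drop_index(samples, drop_ratio=0.5, baseline_n=5):
    """Index of the first sample below drop_ratio * baseline, else None.
    The baseline is the mean of the first baseline_n samples."""
    if not samples:
        return None
    head = samples[:baseline_n]
    baseline = sum(head) / len(head)
    for i, rate in enumerate(samples):
        if rate < drop_ratio * baseline:
            return i
    return None


def compare_runs(uncooled, cooled, tolerance=2):
    """Give a rough interpretation of two traces that differ only in cooling.
    tolerance is how many samples apart two drops may be and still count
    as happening at the same point."""
    a = first_drop_index(uncooled)
    b = first_drop_index(cooled)
    if a is None and b is None:
        return "no drop in either run"
    if b is None:
        return "drop only without cooling: consistent with thermal throttling"
    if a is None:
        return "drop only with cooling: not explained by temperature"
    if abs(a - b) <= tolerance:
        return "drop at the same point in both runs: cooling made no difference"
    if b > a:
        return "drop delayed by cooling: temperature likely contributes"
    return "drop earlier with cooling: not explained by temperature"
```

A single pair of runs is still weak evidence; repeating each condition several times shows how much the drop point varies on its own.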

That's it I would say.
Besides what you've mentioned, overprovisioning/garbage collection may also cause slowdowns after sustained writes.

The more I've worked with SSDs to try to get them to perform well, the more I've come to realize benchmarking SSD performance is virtually impossible to do in any meaningful sense, because they themselves are stateful and your performance is highly dependent not only on what you are doing now, but what you did a few moments ago. There's also a bunch of protocol-level behavior adding to this, as well as OS-level behavior that may be difficult to isolate, and even if you succeed, yeah, you're benchmarking the device in an unrealistic fashion unlike any actual real-world use cases, so congrats I guess.

Typically there are a few primary modes of the state. It’s useful to understand the performance under each mode. Even if you can’t predict exactly what mixture of modes you’ll be in, you can make a good guess at whether you’ll hit some pathological behavior based on your workload.
Are there any programs that can track temperature for different vendors? I'd love to run a few ssds through a benchmark and have hard data for thermal data.
Temperature data is part of SMART so there's a plethora of utilities on any OS. For Windows the best is HWInfo, IMHO.

Ideally, for a proper test, I'd expect thermocouples placed on the controller rather than relying only on the sensor data provided by the SSD, as that could be very misleading depending on where the sensor is and how the raw sensor data is processed.

arg, you are right, SMART would be the easiest way to go. And the thermocouples are a great idea. I'll see what I can whip up.
sudo nvme smart-log -H /dev/nvme0

This will show all temperature sensors on the device; there are usually several. It will also show how often the composite temperature crossed the warning and critical thresholds in the lifetime of the device, and how long the device spent above those thresholds.
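For logging during a benchmark, the same data can be read from a script. A minimal sketch, assuming nvme-cli is installed and that its JSON output reports temperatures in whole kelvin under keys named `temperature` and `temperature_sensor_N`; those key names and units are assumptions that may differ between nvme-cli versions, so check the raw JSON on your system first:

```python
import json
import subprocess


def read_smart_log(device="/dev/nvme0"):
    """Run nvme-cli and return the SMART log as a dict (usually needs root)."""
    out = subprocess.run(
        ["nvme", "smart-log", device, "--output-format=json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def temperatures_celsius(smart_log):
    """Pick out the temperature fields and convert them to Celsius.

    Assumes whole-kelvin values, so the offset used is 273 (approximate).
    """
    temps = {}
    for key, value in smart_log.items():
        if key == "temperature" or key.startswith("temperature_sensor_"):
            temps[key] = value - 273
    return temps
```

Calling `temperatures_celsius(read_smart_log())` once a second alongside a throughput log gives the two series needed to line up slowdowns against temperature.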

You can read the temperature from the SMART interface.
Only if the interface exposes it. USB ones in my experience outright don't support it or mess it up somewhere in the chain; the only place I got SMART reliably to work was with a direct SATA attachment to a controller.
If you have a desktop PC and want to make the process of connecting external drives to SATA for this purpose a bit easier, you can add an eSATAp bracket to your PC.
You can also slap a thermal sensor onto the SSD, or use a contactless thermometer.
Open Hardware Monitor (https://openhardwaremonitor.org/) on Windows.
Thermal cameras are surprisingly cheap now
Samsung was one of the vendors doing this... there was some controversy about them changing the specs on the datasheet without changing the SKU. This was for Samsung-branded drives, one of the 9xx models iirc.
Samsung did update their SKUs. Link: https://www.tomshardware.com/news/samsung-is-swapping-ssd-pa...

I'd advise against throwing accusations without proof. Looking for reliable hardware vendors is hard enough, there is no need for people unintentionally muddying the waters.

The marketing name still seems to be the same according to the article, so it would still fail any reasonable consumer confusion test.
Have you checked the performance difference or are you just venting based on click-bait articles?

According to benchmarks, the difference in performance is negligible, in fact the new SKU is sometimes better than the old one in most benchmarks, so there's really nothing to worry about.

You're grasping at straws here.

Doesn't sound negligible to me:

> In longer tests, both drives decrease sharply in performance as cache fills, which is expected. But while the older drive retains nearly two-thirds of its original performance, the newer version craters to less than a third. We can see this effect not only in artificial benchmarks, but also in large file copies [1]

While changing the SKU is one step better than what other manufacturers did, keeping the old name is still shady behavior.

[1] https://arstechnica.com/gadgets/2021/08/samsung-seemingly-ca...

This was the case with ADATA.

They had a lot of SKUs (6) for one part, and despite on paper some of the numbers being lower, and in synthetic benchmarks the drives performing differently, some of the “slower” parts actually performed better than the original reviewed/released samples.

This doesn’t excuse what some people consider bait and switch. But it’s more complicated than many outraged gamers made it seem.

Then they shouldn't have anything to fear from being honest about the situation, releasing it as the "second edition" or something, and letting it go through the regular review pipeline like any other new product or revision.
Considering how WD did this with switching CMR drives to SMR I wouldn't trust them with SSDs either.
ADATA caused some of this by having over 6 builds for the same SKU with different performance metrics.

Their latest SSD (I think), S70, has a marketing bullet point that all S70 parts are built with the same components. Should be unnecessary, but here we are.

I’ve basically never bought anything but the Samsung Pro drives and never once been disappointed.

I have observed thermal throttling on a plastic Sandisk SSD poorly located in my case. It has nothing to do with usage and is 100% a thermal issue.