by barrkel 3183 days ago
ECC is a market segmentation tool. Keeping ECC away from consumer hardware means you can charge more to make it available for industrial, professional etc. use.
3 comments

Many AMD CPUs support ECC. It will take a bit longer to find a motherboard that supports it, but they are out there. Consumer ECC is an option, but few seem to know about it or opt for it.

I want to say I had ECC in my old AMD K6-350, but that was a lot of years ago. Either way, I'm pretty sure they have supported it for a long time and in many of their CPUs, even the CPUs aimed at consumers.

Here's a comment I made last night, if you want to dig in and find some current offerings:

https://news.ycombinator.com/item?id=15406565

Interesting thought, but kind of sounds like a conspiracy theory. Anything else to back it up?
Do you see every market segmentation as a conspiracy? If so, a conspiracy between whom?
Obviously ECC requires more resources and is therefore intrinsically more expensive. That does not preclude it from being used for market segmentation to get higher margins.

Same reason ECC is not supported on consumer processors from Intel. Here it would be almost trivial to add, but they don't.

Simple business strategy.

You're probably aware, but the desktop i3 does support ECC. Well, at least through the 7th generation. It looks like now that it's added 2 cores in the 8th generation, ECC has been dropped (NB: I didn't look at all models).

So, they're okay with it on consumer parts, as long as you're not looking to do anything resembling professional work I guess.

In most previous generations all Core iX dice have had support for ECC, selectively disabled through fusing. Obviously, all internal busses and caches use error correction/detection independent of what the memory controller does.
> Obviously ECC requires more resources and is therefore intrinsically more expensive.

Technically. A fraction of a percent increase in die size for the memory transceivers, basically nothing for making and verifying checksums.

It's all business strategy.

> ECC requires more resources and is therefore intrinsically more expensive

Manufacturing cost is non-linear. "Obviously" and "intrinsically" are word gunk.

Can you expand on this? Memory already includes controller logic, and adding ECC logic does not need a separate process. All you need is more silicon real estate because of the extra bits needed for the ECC. Silicon is cheap.
Not really. Most people don't care about ECC, and gamers actually prefer the faster non-ECC RAM. Since ECC is more expensive to make, the industry is obviously not going to sell ECC to people not willing to pay more for it.

And in a lucky coincidence it turns out that those who want ECC are willing to pay the moon for it, which of course gatekeepers like Intel exploit.

Well, I don't think there's anything about ECC, or at least unbuffered ECC, that should make memory slower than non-ECC memory. "Fast" memory, it seems to me, is also just a marketing thing. If ECC were considered mainstream, as it should be, then I think we'd have overpriced '4000MHz Gaming X Raptor' ECC modules all the same. I'm also not convinced that unbuffered ECC makes memory modules significantly more expensive.

Basically there's no evidence that ECC memory is significantly slower or more expensive than non-ECC memory. Obviously it's slightly more complex than non-ECC memory, but it's not a particularly high-tech addition. For all its supposed complexity/slowness, Samsung is using it in the caches on their Exynos SoCs.

Everything can be perfectly explained in terms of market segmentation.

Well, ECC memory uses eight extra bits on the data bus (72 lines instead of 64), backed by an extra chip or chips (depending on the module's organisation); ECC memory modules effectively store one extra eighth of redundant data.
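
To put a number on the "one extra eighth": here is a rough sketch, assuming the module uses a textbook Hamming-style SECDED code (single error correction, double error detection). Real memory controllers may use different codes, and `secded_check_bits` is just a name made up for this illustration.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed by a Hamming SECDED code over `data_bits` data bits."""
    r = 0
    # Single-error correction needs 2**r >= data_bits + r + 1, so that the
    # syndrome can name every bit position plus the "no error" case.
    while 2 ** r < data_bits + r + 1:
        r += 1
    # One extra overall parity bit upgrades the code to double-error detection.
    return r + 1


for width in (8, 16, 32, 64, 128):
    k = secded_check_bits(width)
    print(f"{width:3d} data bits -> {k} check bits ({k / width:.1%} overhead)")
```

For a 64-bit word this comes out to 8 check bits, which matches the 72-bit-wide ECC module. The relative overhead shrinks as the protected word gets wider.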

However, in many applications we find that using (forward) error correction almost always increases data density (for storage) or bandwidth (for transmission), simply because a FEC stream does not require a nearly-perfect channel any more. This is the way hard disks, SSDs, WiFi, LTE, DSL, ..., satellite communications, ...[, ...][, ...] are able to cram incredible amounts of data into very noisy channels. Thus, ECC significantly lowers cost in many dimensions (be it frequency spectra, storage prices, not having to re-cable entire countries...).

(And if you don't use the extra noise margin to increase density/bandwidth, then you can use it to increase reliability, like we usually do with ECC memory)

Thinking about it for a few minutes, the memory bus will most likely be the only bus in your computer that has no error correction/detection. USB, SATA, PCIe, all of them require it. The main memory will also most likely be the only storage that doesn't use it (apart from firmware flash chips and the like, but these often use a checksum at least).
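
For comparison, detection-only protection of the kind those links rely on can be sketched in a few lines. This is only an illustration: it uses the CRC-32 from Python's `zlib` as a stand-in (each bus defines its own CRC polynomial and width), and `flip_bit` and the payload are made up for the example.

```python
import zlib


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Return a copy of `payload` with one bit inverted (a simulated link error)."""
    buf = bytearray(payload)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


payload = b"some packet travelling over a noisy link"
sent_crc = zlib.crc32(payload)          # sender appends this to the packet

received = flip_bit(payload, 13)        # one bit gets corrupted in transit
crc_matches = zlib.crc32(received) == sent_crc

# A CRC catches every single-bit error, so the receiver sees a mismatch
# and can ask for a retransmission. It cannot tell which bit was wrong.
print("CRC matches after bit flip:", crc_matches)
```

The difference from ECC memory is that a CRC only detects: the fix is to resend. DRAM has nothing to resend from, which is why memory ECC has to correct in place.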

> However, in many applications we find that using (forward) error correction almost always increases data density (for storage) or bandwidth (for transmission), simply because a FEC stream does not require a nearly-perfect channel any more.

Do you mean that ECC has those benefits, or that other applications of error correcting codes have them?

The reliability boost that ECC DRAM gives you could be reinterpreted as extra headroom for overclocking the DRAM before it becomes too unstable. Since the parity bits are carried on extra data lines, they aren't subtracting from your usable memory bandwidth so the net effect may be a substantial performance advantage when operating at equivalent reliability levels. The main concern is whether the memory controller can correct errors without a severe latency penalty. The ECC used for DRAM is far simpler than the LDPC used for things like SSDs, so it's probably not an issue. (However, systems halting on the detection of a double bit uncorrectable error would be an inconvenience.)
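
To show how simple the scheme can be, here is a minimal sketch of an extended Hamming (SECDED) code in Python. It assumes the textbook construction with check bits at power-of-two positions; actual memory controllers work on 64-bit words in hardware and may use a different code. The names `encode`, `decode` and `extract` are invented for this example.

```python
def encode(data_bits):
    """Encode a list of 0/1 data bits into an extended Hamming codeword."""
    data = list(data_bits)
    code = [0]            # index 0 holds the overall parity bit
    pos = 1
    while data:
        if pos & (pos - 1) == 0:
            code.append(0)            # power-of-two position: check bit
        else:
            code.append(data.pop(0))  # any other position: data bit
        pos += 1
    # Choose check bits so the XOR of the indices of all set bits is zero.
    syndrome = 0
    for i in range(1, len(code)):
        if code[i]:
            syndrome ^= i
    p = 1
    while p < len(code):
        if syndrome & p:
            code[p] = 1
        p <<= 1
    code[0] = sum(code) % 2           # make the total number of set bits even
    return code


def decode(code):
    """Return (status, codeword); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for i in range(1, len(code)):
        if code[i]:
            syndrome ^= i
    parity_ok = sum(code) % 2 == 0
    if syndrome == 0 and parity_ok:
        return "ok", code
    if not parity_ok and syndrome < len(code):
        # Odd number of flips: treat as a single error at index `syndrome`
        # (index 0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        return "corrected", code
    # Even parity with a non-zero syndrome means at least two bits flipped.
    return "uncorrectable", code


def extract(code):
    """Pull the data bits back out of a codeword."""
    return [code[i] for i in range(1, len(code)) if i & (i - 1) != 0]


word = [1, 0, 1, 1, 0, 0, 1, 0]
stored = encode(word)

one_flip = list(stored)
one_flip[6] ^= 1
status_one, fixed = decode(one_flip)
print(status_one, extract(fixed) == word)

two_flips = list(stored)
two_flips[3] ^= 1
two_flips[10] ^= 1
status_two, _ = decode(two_flips)
print(status_two)
```

The single flip is repaired with a handful of XORs, and the double flip is flagged rather than silently miscorrected. That flagged case is where a system would halt or raise a machine check.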
That it’s common in the other peripherals really says something.
You would figure that overclockers would like to know when they have memory bit errors. There is a large overlap between gamers and overclockers.
Your point seems to make sense in GPU land as well. Nvidia offers ECC on their workstation-class GPUs (Quadro) but not on their gaming GPUs (GTX).
Gaming GPUs have different qualification targets anyway. They will error fairly frequently compared to e.g. your CPU; something that doesn't matter when most errors become invisible after about 16 ms, but not really something you want to see e.g. in a simulation.
I'd take a slight slowdown in RAM access if I knew for sure that my game won't bluescreen.