by ecksii
2442 days ago
While Linux would have certainly killed Coherent eventually, that's
not quite the case. First, Coherent was out long before Linux. Coh was
around long enough for AT&T to have sent Dennis Ritchie to Chicago to
inspect the code and evaluate it for copyright claims. Coherent ran on
the PDP-11 and on the 80286. Linux became a real force in the Unix
market around 1998. MWC went out of business in Feb 1995. The first
round of layoffs at MWC happened in Oct 1994.

A perfect storm of several things killed Coherent. The two biggest
problems were:

1. The customers dinging Mark Williams Company in the newsgroup mainly
complained about the lack of TCP/IP networking. This happened because
MWC had done a customer poll to see which big feature should come next:
TCP/IP or X11. X11 won.

2. The real or perceived drop in quality of the product. This one is hard
to explain. Coherent 3.10 and 4.0 had been solid V7 Unix clones with
V7 sensibilities. When 4.2.05 shipped it included a really nasty disk
driver bug that basically destroyed your file system beyond the
ability of fsck to fix. The bug was triggered when your drive went
into a very common thermal recalibration mode. This mode was rare or
hadn't existed during the days of MFM/RLL/ESDI drives but became
common with ATA drives especially as the market got flooded with cheap
504MB drives. While the bug was fixed somewhere between 4.2.10 and
4.2.14, the damage to Coherent's reputation was done.

As the person responsible (alas), my specific recollection of this particular bug was that the root cause wasn't thermal recalibration, but rather UDMA signalling errors.
Prior versions of Coherent using PIO mode had excruciatingly slow access, and when adding support for UDMA I also added support for the disk driver to recognise sequential access and issue multisector transfer requests; this boosted performance fairly massively, something like 3-4 times for some common things, and it was run for a fairly long time in-house and by beta testers with no trouble before it shipped.
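To make the sequential-access merging concrete, here is a minimal sketch of the general idea, written as a Python model and not taken from the Coherent driver. The `Request` type, the `coalesce` function, the 256-sector cap, and the requirement that merged buffers be contiguous in memory are all assumptions made for the example.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # bytes per sector (conventional ATA sector size)


@dataclass
class Request:
    lba: int       # first sector of the request
    count: int     # number of sectors to transfer
    buf_addr: int  # address of the first buffer byte, modelled as an integer


def coalesce(requests, max_sectors=256):
    """Merge back-to-back requests into larger multisector transfers.

    Two requests are merged only when the second starts at the sector
    right after the first ends AND its buffer starts right after the
    first buffer ends (an assumption of this sketch).
    """
    merged = []
    for req in requests:
        last = merged[-1] if merged else None
        if (last is not None
                and last.lba + last.count == req.lba
                and last.buf_addr + last.count * SECTOR_SIZE == req.buf_addr
                and last.count + req.count <= max_sectors):
            # Sequential on disk and in memory: extend the previous transfer.
            last.count += req.count
        else:
            # Not sequential (or too large): start a new transfer.
            merged.append(Request(req.lba, req.count, req.buf_addr))
    return merged
```

In this model, three single-sector reads at sectors 100, 101 and 102 with adjacent buffers collapse into one three-sector transfer, which is where the throughput gain over one command per sector would come from.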
The problem, though, was a small - literally one line - arithmetic error that showed up when the drive end of things reported that a UDMA transfer error had occurred in the middle of a multisector operation; the error-handling code that set up a retry of the operation didn't compute the start kernel address correctly when a whole bunch of transfers had been merged (and some subset had worked).
The primary problem with the UDMA modes was sensitivity to correct cable termination - see https://en.wikipedia.org/wiki/Parallel_ATA#Cable_select for some of that; basically, signal reflections from parallel ATA cable runs that didn't have terminating resistors made things electrically marginal and some systems would have really excessive numbers of UDMA CRC faults as a consequence, and given sufficiently high error rates and really bad timing that could end up polluting the buffer cache with stuff that was skewed by a sector :-(
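The retry arithmetic at issue can be sketched as follows. This is an illustrative Python model only, not the original driver code: the function names, parameters, and the particular off-by-one shown in `skewed_retry_params` are hypothetical, chosen to show how a one-line address error leaves data skewed by a sector.

```python
SECTOR_SIZE = 512  # bytes per sector (conventional ATA sector size)


def retry_params(start_lba, total_sectors, buf_addr, sectors_done):
    """Return (lba, count, addr) for retrying the unfinished tail.

    sectors_done is how many sectors of the merged transfer completed
    successfully before the error was reported.
    """
    if not 0 <= sectors_done < total_sectors:
        raise ValueError("sectors_done must be in [0, total_sectors)")
    retry_lba = start_lba + sectors_done
    retry_count = total_sectors - sectors_done
    # The buffer address must advance by exactly the bytes already moved.
    retry_addr = buf_addr + sectors_done * SECTOR_SIZE
    return retry_lba, retry_count, retry_addr


def skewed_retry_params(start_lba, total_sectors, buf_addr, sectors_done):
    """Hypothetical faulty variant: the address is one sector too far.

    The disk position and count are right, so the retry "succeeds",
    but every retried sector lands one sector late in memory.
    """
    lba, count, addr = retry_params(start_lba, total_sectors,
                                    buf_addr, sectors_done)
    return lba, count, addr + SECTOR_SIZE  # the one-line mistake
```

Because the faulty variant still reads the right sectors, nothing fails at the time of the retry; the damage only shows up later, when the misplaced data is used or written back.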
The big thing (on top of not having any in-house hardware that triggered this specific bug) was the sheer volume of work required for those releases, since getting from what was basically a fairly vanilla Seventh Edition UNIX to where it needed to be to start running large pieces of third-party code expecting POSIX was a big lift. Since there weren't many people, everyone was having to wear lots of hats; for instance, aside from kernel work, I did a huge amount of work for POSIX.1 and .2 compatibility. On top of the underlying code changes (which ranged all over the system, particularly for some of the things we found Autotools scripts relying on), all of those changes needed documenting, too.
[ Fred Butzen did amazing work writing the actual manpage text and making it really easy to understand - he justly deserved the credit for the quality of the manual in terms of its readability. But the scale of the changes needed to bring so many parts and pieces from V7 to POSIX meant lots and lots and lots of work trying to iterate over docs for technical accuracy at the same time as having to redesign all the affected parts and pieces. It was, in a word, exhausting. ]