Hacker News
by rolenthedeep 1288 days ago
I used to work for Sherwin-Williams. The in-store computers run some custom *nix OS. The software the company runs on is a text-based UI that hasn't changed since it was introduced in the 90s.

They released a major update in 2020 that allowed you to move windows around the screen. It was groundbreaking.

But let me tell you, this system was absolutely terrible. All the machines were full x86 desktops with no hard drive; they netbooted from the manager's computer. Why not a thin client? A mystery.

The system stores a local cache of the database, which is only superficially useful. The cache is always several days, weeks, or months out of date, depending on what data you need. Most functions require querying the database hosted at corporate HQ in Cleveland. That link is up about 90% of the time, and when it's down, every store in the country is crippled.

It crashes frequently and is fundamentally incapable of concurrent access: if an order is open on the mixing station, you cannot access that order to bill the customer, and you can't access their account at all. Frequently, the system loses track of which records are open, requiring the manager to manually override the DB lock just to bill an order.
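For illustration only: one common way to keep a stuck record lock from needing a manual override is to give every lock a lease that expires by itself. This is a minimal sketch under that assumption, not a description of how the actual system works. `LeaseLockTable`, the record IDs, and the 60-second lease are all hypothetical.

```python
import time


class LeaseLockTable:
    """Hypothetical in-memory record-lock table where every lock expires on its own."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._locks = {}    # record_id -> (owner, expires_at)

    def acquire(self, record_id, owner):
        """Return True if `owner` now holds the lock on `record_id`."""
        now = self.clock()
        holder = self._locks.get(record_id)
        if holder is not None:
            current_owner, expires_at = holder
            # Refuse only while someone else holds an unexpired lease.
            if current_owner != owner and expires_at > now:
                return False
        # Free, expired, or already ours: take it or renew it.
        self._locks[record_id] = (owner, now + self.lease_seconds)
        return True

    def release(self, record_id, owner):
        """Drop the lock, but only if `owner` is the one holding it."""
        holder = self._locks.get(record_id)
        if holder is not None and holder[0] == owner:
            del self._locks[record_id]


# Usage with a fake clock: the mixing station "forgets" to release the order.
fake_now = [0.0]
locks = LeaseLockTable(lease_seconds=60, clock=lambda: fake_now[0])
locks.acquire("order-1", "mixing-station")
blocked = locks.acquire("order-1", "register")    # lease still live
fake_now[0] = 61.0
recovered = locks.acquire("order-1", "register")  # lease has expired
```

The tradeoff is that a terminal holding a record open for a long time has to renew its lease periodically, or it can lose the lock mid-edit.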

If a store has been operating for more than a couple of years, the DB gets bloated or fragmented or something, and the entire system slows to a crawl. It takes minutes to open an order.

Which is all to say it's a bad system that cannot support their current scale of business.

4 comments

I know of a small shop that made a good living buying software packages like that, rewriting/modernizing the technology, and selling the new version back to all the existing customers. It was kind of a win for everyone -- the customers got updated, secure software, the owners of the old tech who were getting nothing out of it (this stuff is not SaaS) got some money, and some developers got work.
That sounds like a great gig tbh
That does sound like an absolute clusterfuck of software. But just in case it wasn't clear, I don't think that "custom *nix OS" is to blame at all. And as for text based UIs, they're still fantastic for things like data entry and lookup.

It just seems that the actual software running on the OS, with the text UI, is profoundly terrible in your case.

Yeah, the OS itself was passable. It had the absolute bare minimum required to work, so not much could go wrong.

The ancient software also wasn't bad. After a few months learning the hotkeys and menu structure, the speed with which you can enter and process data was absolutely incredible. It had problems, but usually minor and patched in a reasonable time for corporate IT.

The real problem was their database management. I don't have any information, so I'm assuming here, but my impression is that they're using some positively ancient database software. Doing a backup of the local cache took multiple days, though it didn't lock the DB. Requests to HQ were incredibly slow, about 30 seconds to pull an account record. Larger queries like neighboring store inventory took a minute or two. Running a report on local inventory would regularly take tens of minutes, and it only had to read the local cache.

The database was a few tens of GB on disk. Granted, I don't know much about databases, but if running something like "SELECT * FROM inventory WHERE sales < 100 ORDER BY lastSaleDate" on a 30 GB database takes 15 minutes, something is wrong.

There were a lot of problems we ran into on a daily basis, and almost all of them related to database functions. Particularly when a record failed to unlock, sometimes we'd have to reboot the local server, which caused all terminals in the store to reboot. That usually took a good 15 minutes.

Personally, I rather enjoyed not having Windows at work. For the most part, everything Just Works, and given the hardware, it ran ten times faster than Windows would have.

My current job is a Windows development shop, and I don't have enough curses to describe the pure rage I feel every time Windows does something stupid (which is approximately every three hours).

Ugly. This makes me seriously wonder whether they just never created an index, and whether unfathomably many hours have been wasted on useless full table scans that a handful of CREATE INDEX statements would have fixed. Though that's a lot of conjecture, and the real answer is probably more complex. But your examples do make me wonder...
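To make the index conjecture concrete, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in, since nobody in this thread knows what database SW actually runs. The table layout, the row data, and the index name `idx_inventory_sales` are made up. It prints the query plan for the example query before and after a single CREATE INDEX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (sku INTEGER PRIMARY KEY, sales INTEGER, lastSaleDate TEXT)"
)
# Made-up rows: 10,000 items with sales counts spread over 0..999.
conn.executemany(
    "INSERT INTO inventory (sales, lastSaleDate) VALUES (?, ?)",
    [(i % 1000, f"2020-01-{(i % 28) + 1:02d}") for i in range(10000)],
)

query = "SELECT * FROM inventory WHERE sales < 100 ORDER BY lastSaleDate"


def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


before = plan(query)  # no usable index: the whole table has to be scanned
conn.execute("CREATE INDEX idx_inventory_sales ON inventory (sales)")
after = plan(query)   # the planner can now search the index on sales

print("before:", before)
print("after: ", after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should describe a scan of the whole table and the second a search using the new index. The ORDER BY still needs a sort step either way, just over far fewer rows.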
I remember running some very large reports over multiple years of inventory movement. Disk access on the server was totally saturated for a good 20-30 minutes.

Thinking about it now, it had to have read out the entire database multiple times.

Oh yeah, these reports weren't processed on the server, either. The network link on the terminal I used would be pegged at the max rate the server could read from the disk. I never really figured out what that's about.

I guess it's trying to stream large chunks of the DB to the terminal and running the query locally? No clue.

Some genius probably thought processing queries locally would be faster than on the database server. Fail.
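A rough sketch of why that hurts, again with sqlite3 as a stand-in and a made-up `movements` table. In the first version every row has to reach the terminal before it can be filtered. In the second, the database does the work and only the answer travels. Whether the real system worked like either of these is a guess.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movements (id INTEGER PRIMARY KEY, sku INTEGER, qty INTEGER)")
# Made-up inventory movements: 5,000 rows across 50 SKUs.
conn.executemany(
    "INSERT INTO movements (sku, qty) VALUES (?, ?)",
    [(i % 50, i % 7) for i in range(5000)],
)

# "Terminal-side" processing: pull every row, then filter and sum locally.
all_rows = conn.execute("SELECT sku, qty FROM movements").fetchall()
client_total = sum(qty for sku, qty in all_rows if sku == 7)

# "Server-side" processing: the database filters and sums, one row comes back.
server_rows = conn.execute(
    "SELECT SUM(qty) FROM movements WHERE sku = ?", (7,)
).fetchall()
server_total = server_rows[0][0]

print("rows sent to the terminal, client-side:", len(all_rows))
print("rows sent to the terminal, server-side:", len(server_rows))
print("same answer:", client_total == server_total)
```

Both versions get the same total. The difference is how much data has to cross the network link, which would fit the pegged link on the terminal during large reports.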
Likely wasn't an RDBMS at all, but flat files with maybe one key index (or perhaps none).
I really doubt that a schema like that would survive at the kind of scale SW operates at. I'm about 80% sure I saw mention of database operations in the startup/shutdown logs, but I could be misremembering.

My guess is that they bought whatever database software was popular in the early 90s and never changed.

I do know they've been slowly changing the schema over the years, increasing the number of digits in the account number, adding email fields, that kind of thing. But I doubt there have been any major upgrades.

Well, early 1990s database software wasn't awful. Talking about stuff like Sybase 4.x, roughly equivalent to early MS SQL Server, also Oracle, Informix, DB2, etc. Indexes, query planning (perhaps with hints), cursors, and concurrency were all adequately solved problems by then.
I worked for a vendor that had SW as a customer, and we were asked about integrating with some SW systems... what you are describing resonates.
Was it EDI? I had a problem with a vendor where my store ordered something in 2015, marked it as not received, then received it later and sold it without correcting inventory.

The vendor went into a cycle of refunding and re-billing my store for that part every few months for years.

Fortunately both our books came out even in the end, but Jesus what a stupid thing to happen.

It was ultimately all of SW. Before the Valspar purchase, SW used SugarCRM, and my team there (TAMs) were with the account and regularly out in CLE. I really like the team there from SW. One of the best customers I ever dealt with overall.
That explains why they had me drive across town only to find out another store very close had the color I needed.
Oh no, that's an entirely different problem. Because of supply chain issues, stores are allocated product based on how much of that product they sell. That sounds like it makes sense, but really what it means is that smaller stores slowly get less and less product until they can no longer meet local demand. There is no way to break the cycle.

This is half of the reason everyone who worked at my store walked out on the same day. The other half is that the only people who worked there were me and the manager, and it had been that way for six months.

My advice is to avoid SW these days and go to Lowe's. SW is contractually obligated to ensure that Lowe's always has inventory. But do spend a little extra money for their mid-tier product. The cheapest stuff is trash and you will regret it.