Hacker News new | ask | show | jobs
by monocasa 347 days ago
There were single image systems with hundreds of cores in the late 90s and thousands of cores in the early 2000s.

I absolutely stand by the fact that Intel and AMD didn't pursue high core count systems until that point because they were so focused on single core perf, in part because Windows didn't support high core counts. The end of Denmark scing forced their hand and Microsoft's processor group hack.

2 comments

AMD and Intel were focused on single core performance, because personal desktop computing was the bigger business until around mid to late 2000s.

Single core performance is really important for client computing.

They were absolutely interested in the server market as well.
Do you have anything to say regarding NUMA for the 90s core counts though? As I said, it's not enough that there were a lot of cores - they have to be monolithically scheduled to matter. The largest UMA design I can recall was the CS6400 in 1993, to go past that they started to introduce NUMA designs.
Windows didn't handle numa either until they created processor groups, and there's all sorts reasons why you'd want to run a process (particularly on Windows which encourages single process high thread count software archs) that spans numa nodes. It's really not that big if a deal for a lot of workloads where your working set fits just fine in cache, or you take the high hatdware thread count approach of just having enough contexts in flight that you can absorb the extra memory latency in exchange for higher throughput.
3.1 (1993) - KAFFINITY bitmask

5.0 (1999) - NUMA scheduling

6.1 (2009) - Processor Groups to have the KAFFINITY limit be per NUMA node

Xeon E7-8800 (2011) - An x86 system exceeding 64 total cores is possible (10x8 -> requires Processor Groups)

Epyc 9004 (2022) - KAFFINITY has created an artificial limit for x86 where you need to split groups more granular than NUMA

If x86 had actually hit a KAFFINITY wall then the E7-8800 even would have occured years before processor groups because >8 core CPUs are desirable regardless if you can stick 8 in a single box.

The story is really a bit reverse from the claim: NT in the 90s supported architectures which could scale past the KAFFINITY limit. NT in the late 2000s supported scaling x86 but it wouldn't have mattered until the 2010s. Ultimately KAFFINITY wasn't an annoyance until the 2020s.