| HN Mirror



	by lend000 2178 days ago
	You can get an idea of how popular different processors are in the server space by looking at the AWS EC2 spot market. Top-end Xeon server processors (C5 and Z1d) typically have much lower spot discounts than AMD EPYC-based processors (r5ad), although ARM c6g instances have been pushed up in price significantly over the last few months, perhaps as people switch over to them for the per-computational-unit cost savings. Of course, this is all a function of Amazon's supply of instances and their chosen on-demand pricing level, but the trends are certainly interesting, and show steady demand for fast Xeons and increasing demand for ARM. I have run some compute-heavy workloads on the best AMD instances I could find on AWS, and the per-core speed difference for my particular workload was nearly 50%, which got worse as it scaled up to bigger instances because my workload uses a lot of L3 cache. I hear about EPYCs with 256MB of L3 cache but I can't seem to find those on AWS -- only ones with 8MB of cache.


_msw_ 2178 days ago

Disclosure: I work at AWS, building cloud infrastructure.

C6g instances only launched on June 11. I'm not sure what information can be gleaned from the spot prices regarding Arm demand at this time.

The C5a instances powered by AMD Rome processors have 192 MiB of L3 cache per socket total (16 MiB L3 slice per compute complex, 12 CCX per socket). You can observe this from the cpuid(1) output:

   L3 cache information (0x80000006/edx):
      line size (bytes)     = 0x40 (64)
      lines per tag         = 0x1 (1)
      associativity         = 0x9 (9)
      size (in 512KB units) = 0x180 (384)

384 * 512 KiB = 192 MiB

(you can download cpuid from http://www.etallen.com/cpuid.html)
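To make the arithmetic above concrete, here is a minimal Python sketch that unpacks the raw EDX value of CPUID leaf 0x80000006 into the same fields cpuid prints. The bit positions follow AMD's documented layout for this leaf. The function names and the sample register value are assumptions for illustration: the value is rebuilt from the fields shown above, not captured from a real C5a instance.

```python
def decode_l3_edx(edx):
    """Split the EDX value of CPUID leaf 0x80000006 into its L3 fields."""
    return {
        "line_size_bytes": edx & 0xFF,            # bits 7:0
        "lines_per_tag": (edx >> 8) & 0xF,        # bits 11:8
        "associativity_code": (edx >> 12) & 0xF,  # bits 15:12, raw encoded field
        "size_512k_units": (edx >> 18) & 0x3FFF,  # bits 31:18
    }


def l3_size_mib(edx):
    """Convert the size field (512 KiB units) to MiB."""
    return decode_l3_edx(edx)["size_512k_units"] * 512 // 1024


# Hypothetical register value rebuilt from the fields in the output above.
SAMPLE_EDX = (0x180 << 18) | (0x9 << 12) | (0x1 << 8) | 0x40

if __name__ == "__main__":
    print(decode_l3_edx(SAMPLE_EDX))
    print(l3_size_mib(SAMPLE_EDX), "MiB")
```

The result can be cross-checked against the per-CCX figure quoted above: 16 MiB per CCX times 12 CCX per socket gives the same 192 MiB.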


lend000 2177 days ago

Thanks for the info -- I must have misinterpreted the spot pricing history chart for c6g. While you're here, does the AWS hypervisor have any means to dedicate a portion of the L3 cache to each virtualized core, or is the whole cache a free-for-all (such that a noisy neighbor could potentially be evicting data held in your L2 cache or even L1 cache by thrashing the L3 cache)?


_msw_ 2176 days ago

For instance families like C, M, and R, processor cores are dedicated to one instance, and the virtual processor is pinned 1:1 to the underlying logical processor. Therefore there is no neighbor that is able to use the L1 and L2 caches.

For L3 cache, we try to optimize for the best overall performance for the majority of the time. Smaller instance sizes share L3 cache with other instances. I wouldn't call it a "free for all", given how the cache hierarchy has been shifting over time (e.g., on Skylake-SP the L2 cache per core was increased, and the L3 cache is now 'non-inclusive').
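For anyone who wants to see what their own instance reports, here is a minimal Linux-only sketch that reads the standard sysfs cache topology files and lists which logical CPUs share each cache level. One caveat, stated as an assumption: this shows only the topology exposed to the guest, so it cannot tell you whether another instance shares the same L3. The function name and the `sysfs_root` parameter are hypothetical, added for illustration and testing.

```python
from pathlib import Path


def _read(path):
    """Return the stripped contents of a sysfs file, or None if it is absent."""
    return path.read_text().strip() if path.is_file() else None


def cache_sharing(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """List each cache visible to one logical CPU and the CPUs sharing it."""
    cache_dir = Path(sysfs_root) / f"cpu{cpu}" / "cache"
    if not cache_dir.is_dir():
        return []  # not Linux, or the cache topology is not exposed
    entries = []
    for index_dir in sorted(cache_dir.glob("index*")):
        entries.append({
            "index": index_dir.name,
            "level": _read(index_dir / "level"),
            "type": _read(index_dir / "type"),
            "size": _read(index_dir / "size"),
            "shared_cpu_list": _read(index_dir / "shared_cpu_list"),
        })
    return entries


if __name__ == "__main__":
    for entry in cache_sharing(cpu=0):
        print(entry)
```

On an instance where the L1 and L2 caches are private to a core, their `shared_cpu_list` should contain only that core's logical CPUs, while the L3 entry typically spans a wider set.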
