| CXL and its coherency mechanisms will be interesting to watch as the requirements of LLMs and related applications that need large memory pools continue to grow. This includes some HPC-related workloads as well.

One use case I have seen recently is driving down the total cost of DRAM in larger-scale deployments at AWS, Azure, Meta, etc. Pond [1], a memory pooling system, is one example: it claims to achieve the desired performance while lowering costs.

I think it is important to look at the bigger picture. For example, when considering how a system will combine multiple GPUs, memory systems, and other accelerators to meet the demands of applications, interconnects like NVLink [3] are worth considering too.

For those interested, I left a previous comment about experimenting with CXL on a local setup [0].

[0] https://news.ycombinator.com/item?id=37944691#37948761
[1] Pond: CXL-Based Memory Pooling Systems for Cloud Platforms
https://arxiv.org/abs/2203.00241
[2] Intel Reveals the "What" and "Why" of CXL Interconnect, its Answer to NVLink
https://www.techpowerup.com/254462/intel-reveals-the-what-an...
[3] NVLink and NVSwitch; The building blocks of advanced multi-GPU communication, within and between servers.
https://www.nvidia.com/en-us/data-center/nvlink/ |