World’s first hybrid CXL device combines flash memory and DRAM — storage tiering comes to remote memory over PCIe

by Pelican Press

Samsung has unveiled a new Compute Express Link (CXL) add-in card called the CXL Memory Module-Hybrid for Tiered Memory (CMM-H TM), which combines DRAM and NAND flash that can be accessed remotely by CPUs and accelerators. The card pairs high-speed DRAM with flash capacity and is intended to provide a cost-effective way to boost memory capacity for servers without adding locally installed DDR5 memory, which often isn't an option in oversubscribed servers.

Samsung's solution builds on Compute Express Link (CXL), an open industry standard that provides a cache-coherent interconnect between CPUs and accelerators, allowing CPUs to use the same memory regions as connected CXL devices. The remote memory, in this case a hybrid DRAM/flash device, is accessible over the PCIe bus at the cost of roughly 170-250 ns of added latency, comparable to a NUMA hop. CXL was introduced in 2019 and is now in its third revision, which adds PCIe 6.0 support.
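
CXL memory of this sort typically appears to the operating system as a memory-only NUMA node. As a rough illustration, and not Samsung's own software stack, the C sketch below uses Linux's libnuma to place an allocation on an assumed CXL-backed node; the node number is a placeholder and would need to be checked with numactl --hardware on the actual machine.

```c
/* Minimal sketch: allocating on a CXL-attached memory node with libnuma.
 * Assumes the expander is exposed as CPU-less NUMA node 2; check
 * `numactl --hardware` on the target system for the real node number.
 * Build with: gcc cxl_alloc.c -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int cxl_node = 2;               /* assumed node ID for the CXL expander */
    size_t len = 64UL << 20;        /* 64 MiB */

    /* Pages come from the far (CXL) node; the extra ~170-250 ns shows up
     * on accesses that miss the CPU caches. */
    void *buf = numa_alloc_onnode(len, cxl_node);
    if (!buf) {
        perror("numa_alloc_onnode");
        return 1;
    }

    memset(buf, 0, len);            /* touch the pages so they are actually placed */
    printf("allocated %zu bytes on node %d\n", len, cxl_node);

    numa_free(buf, len);
    return 0;
}
```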

The CXL spec supports three types of devices: Type 1 devices are accelerators that lack local memory, Type 2 devices are accelerators with their own memory (like GPUs, FPGAs, and ASICs with DDR or HBM), and Type 3 devices are memory expanders. The Samsung device falls into the Type 3 category.
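
For readers poking at such hardware on Linux, recent kernels expose Type 3 devices through the CXL subsystem's sysfs tree. The snippet below is a minimal sketch that simply lists whatever is registered under /sys/bus/cxl/devices; the path and device naming depend on the kernel version and are not specific to Samsung's card.

```c
/* Sketch: listing devices registered with the Linux CXL subsystem.
 * The /sys/bus/cxl/devices path is provided by recent kernels with the
 * CXL bus driver enabled; entries such as "mem0" correspond to Type 3
 * memory expanders. Path and naming are kernel-version dependent.
 */
#include <dirent.h>
#include <stdio.h>

int main(void) {
    const char *path = "/sys/bus/cxl/devices";
    DIR *dir = opendir(path);
    if (!dir) {
        perror("opendir");   /* no CXL subsystem or no devices present */
        return 1;
    }

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;        /* skip "." and ".." */
        printf("%s/%s\n", path, entry->d_name);
    }

    closedir(dir);
    return 0;
}
```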

CMM-H TM is an offshoot of Samsung’s CMM-H CXL memory solution. Samsung says it is the world’s first FPGA-based tiered CXL memory solution and is designed to “tackle memory management challenges, reduce downtime, optimize scheduling for tiered memory, and maximize performance, all while significantly reducing the total cost of ownership.”

The new CMM-H isn't as fast as plain DRAM; however, the flash adds a beefy slab of capacity, and a memory caching feature built into the expansion card hides much of the latency. Hot data is moved to the card's DRAM chips to improve speed, while less-used data is kept in NAND flash. Samsung says this happens automatically, but some applications and workloads can also give the device hints through an API to improve placement. Naturally, accesses that miss the DRAM cache and have to be served from flash incur extra latency, which isn't ideal for all use cases, particularly those that rely on tight 99th-percentile performance.
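
Samsung hasn't published details of that hint API, so the sketch below stands in with the standard Linux madvise() calls, which express the same basic idea: the application tells the memory system which ranges are hot and which are cold, so they can be steered toward the fast tier or the capacity tier. The buffer size and split points are illustrative only.

```c
/* Sketch of application-level tiering hints. Samsung's CMM-H API is not
 * public; this uses standard Linux madvise() calls as a stand-in for the
 * general idea of marking ranges hot or cold. MADV_COLD requires Linux
 * 5.4+ and a reasonably recent glibc.
 */
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>

int main(void) {
    size_t len = 256UL << 20;   /* 256 MiB working set */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* Hint that the first 16 MiB will be accessed soon: a good candidate
     * for the fast (DRAM) tier. */
    madvise(buf, 16UL << 20, MADV_WILLNEED);

    /* Hint that the tail of the buffer is cold: fine to demote toward
     * the slower (flash-backed) tier. */
    madvise((char *)buf + (128UL << 20), 128UL << 20, MADV_COLD);

    puts("hints issued");
    munmap(buf, len);
    return 0;
}
```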

Samsung's new expansion card will provide its customers with new ways to expand their servers' memory capacity. This design paradigm is becoming more important as ever-larger language models continue to demand more memory from their host machines and accelerators.




