CXL Gains Traction at FMS 2024

The CXL consortium has been a regular presence at FMS, which rebranded itself from ‘Flash Memory Summit’ to the ‘Future of Memory and Storage’ this year. At FMS 2022, the consortium announced v3.0 of the CXL specification, and CXL 3.1 followed at Supercomputing 2023. CXL was initially developed as a host-to-device interconnect standard, and it has gradually subsumed competing standards such as OpenCAPI and Gen-Z. As a result, the specifications, which build a protocol on top of the ubiquitous PCIe expansion bus, have grown to cover a wide range of use-cases. The consortium includes heavyweights like AMD and Intel, as well as numerous startups aiming to carve out niches in various device-side segments. At FMS 2024, CXL featured prominently in the booth demonstrations of many vendors.

CXL Memory Hierarchy

The transition from DDR4 to DDR5 server platforms, coupled with the rise of workloads requiring substantial RAM capacity (yet not particularly sensitive to memory bandwidth or latency), has catalyzed the emergence of memory expansion modules as the first widely available CXL devices. Over the past few years, we’ve seen product announcements from Samsung and Micron in this domain.

SK hynix CMM-DDR5 CXL Memory Module and HMSDK

At FMS 2024, SK hynix showcased their DDR5-based CMM-DDR5 CXL memory module with a 128 GB capacity. Additionally, the company detailed their Heterogeneous Memory Software Development Kit (HMSDK) – a collection of libraries and tools at both the kernel and user levels designed to facilitate the use of CXL memory. This is partly achieved by examining the memory hierarchy and dynamically relocating data between the server’s main memory (DRAM) and the CXL device based on usage frequency.
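The frequency-based relocation described above can be illustrated with a minimal sketch. This is not HMSDK's API or algorithm; the function names, the page-level granularity, and the simple "hottest pages go to DRAM" policy are all assumptions made for illustration.

```python
from collections import Counter

def plan_placement(access_counts, dram_slots):
    """Split pages into (dram_pages, cxl_pages); the hottest pages go to DRAM.

    Hypothetical policy for illustration only, not HMSDK's actual logic.
    """
    # Rank by descending access count; break ties by page id for determinism.
    ranked = sorted(access_counts, key=lambda p: (-access_counts[p], p))
    return set(ranked[:dram_slots]), set(ranked[dram_slots:])

def migrations(current_tier, access_counts, dram_slots):
    """List the (page, source_tier, target_tier) moves needed to reach the plan."""
    dram_pages, _ = plan_placement(access_counts, dram_slots)
    moves = []
    for page, tier in sorted(current_tier.items()):
        target = "dram" if page in dram_pages else "cxl"
        if tier != target:
            moves.append((page, tier, target))
    return moves

# Four pages, two DRAM slots: page 0 is hot but sits on the CXL device,
# page 1 is cold but occupies DRAM.
accesses = Counter({0: 120, 1: 3, 2: 57, 3: 1})
current = {0: "cxl", 1: "dram", 2: "dram", 3: "cxl"}
moves = migrations(current, accesses, dram_slots=2)
print(moves)
```

In this toy run, the hot page is promoted to DRAM and the cold page is demoted to the CXL device. A real tiering implementation would also have to weigh migration cost and sampling overhead, which the sketch ignores.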

The CMM-DDR5 CXL memory module comes in the SDFF form-factor (E3.S 2T) with a PCIe 5.0 x8 host interface. The module uses 1α technology DRAM and promises DDR5-class bandwidth and latency within a single NUMA hop. Since it targets datacenter and enterprise applications, the firmware includes features for RAS (reliability, availability, and serviceability), secure boot, and other management capabilities.

SK hynix also demonstrated Niagara 2.0, a hardware solution (currently FPGA-based) for memory pooling and sharing. The system connects multiple CXL memory devices so that different hosts (CPUs and GPUs) can share their capacity. While the previous version enabled capacity sharing, the latest version supports data sharing as well. SK hynix introduced these solutions at CXL DevCon 2024 earlier this year, and the finalized CMM-DDR5 specifications shown at FMS 2024 point to further progress.

Microchip and Micron Demonstrate CZ120 CXL Memory Expansion Module

Micron unveiled the CZ120 CXL Memory Expansion Module last year, built around the Microchip SMC 2000 series CXL memory controller. At FMS 2024, Micron and Microchip demonstrated the module on a Granite Rapids server.

Additional insights into the SMC 2000 controller were also provided.

This CXL memory controller incorporates DRAM die failure handling, along with diagnostics and debug tools to analyze failed modules. The controller also supports ECC, part of the enterprise-class RAS features of the SMC 2000 series. Its flexibility ensures that SMC 2000-based CXL memory modules utilizing DDR4 can complement the primary DDR5 DRAM in servers configured to support only the latter.

Marvell Announces Structera CXL Product Line

A few days before FMS 2024, Marvell had announced a new CXL product line under the Structera tag. At FMS 2024, we had an opportunity to discuss this new line with Marvell and gather additional insights.

Unlike other CXL device solutions focusing solely on memory pooling and expansion, the Structera product line integrates a compute accelerator alongside a memory-expansion controller. These components are built on TSMC’s 5nm technology.

The compute accelerator, the Structera A 2504 (A for Accelerator), is a PCIe 5.0 x16 CXL 2.0 device featuring 16 integrated Arm Neoverse V2 (Demeter) cores at 3.2 GHz. It includes four DDR5-6400 channels, supporting up to two DIMMs per channel, with in-line compression and decompression capabilities. Because the server-class Arm cores sit next to the device's own memory channels, the part can scale the memory bandwidth available per core while also adding compute capability.
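To put these figures in perspective, the sketch below works out theoretical peak bandwidth from the quoted specifications. It assumes a 64-bit (8-byte) data path per DDR5 channel and the standard PCIe 5.0 rate of 32 GT/s per lane with 128b/130b encoding. It ignores CXL protocol overhead and real-world efficiency, so these are upper bounds rather than Marvell's numbers. The helper names are hypothetical.

```python
def dram_peak_gbs(mega_transfers_per_s, channels, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (assumes 8 bytes per transfer per channel)."""
    return mega_transfers_per_s * bus_bytes * channels / 1000

def pcie_peak_gbs(giga_transfers_per_s, lanes, encoding=128 / 130):
    """Theoretical peak PCIe bandwidth per direction in GB/s, after line encoding."""
    return giga_transfers_per_s * lanes * encoding / 8

cores = 16
local_dram = dram_peak_gbs(6400, channels=4)   # four DDR5-6400 channels
per_core = local_dram / cores                  # share per Neoverse V2 core
host_link = pcie_peak_gbs(32, lanes=16)        # PCIe 5.0 x16, one direction

print(f"Local DRAM peak:   {local_dram:.1f} GB/s")
print(f"Per-core share:    {per_core:.1f} GB/s")
print(f"Host link (x16):   {host_link:.1f} GB/s per direction")
```

Under these assumptions, the DRAM behind the device can deliver roughly three times what the host link can carry in one direction, which is one way to see why running compute next to the memory is attractive.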

Applications like Deep-Learning Recommendation Models (DLRM) can leverage the compute capabilities available within the CXL device. The increase in bandwidth availability also comes with reduced energy consumption for workloads. This approach furthers disaggregation within servers, contributing to a more efficient thermal design overall.

The Structera X 2404 (X for eXpander) is available as a PCIe 5.0 device (single x16 or two x8) with four DDR4-3200 channels (up to three DIMMs per channel). Features like in-line (de)compression, encryption/decryption, and secure boot with hardware support are also present in the Structera X 2404. Compared to the 100 W TDP of the Structera A 2504, Marvell expects this part to consume approximately 30 W. Its primary purpose is to allow hyperscalers to repurpose DDR4 DIMMs (up to 6 TB per expander) while enhancing server memory capacity.

Marvell also offers the Structera X 2504, which supports four DDR5-6400 channels (with two DIMMs per channel for up to 4 TB per expander). Other features remain identical to those of the DDR4-recycling part.
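The capacity figures for the two expanders follow from the channel and DIMM counts. The short calculation below derives the number of DIMM slots and the per-DIMM capacity implied by the stated maximums. It assumes the quoted totals are raw DRAM capacity (the source does not say whether in-line compression is counted) and uses 1 TB = 1024 GB. The helper names are hypothetical.

```python
def dimm_slots(channels, dimms_per_channel):
    """Total DIMM slots behind one expander."""
    return channels * dimms_per_channel

def implied_dimm_gb(total_tb, slots):
    """Per-DIMM capacity in GB needed to reach the stated total (1 TB = 1024 GB)."""
    return total_tb * 1024 / slots

# Structera X 2404: four DDR4-3200 channels, three DIMMs per channel, up to 6 TB
x2404_slots = dimm_slots(4, 3)
x2404_dimm_gb = implied_dimm_gb(6, x2404_slots)

# Structera X 2504: four DDR5-6400 channels, two DIMMs per channel, up to 4 TB
x2504_slots = dimm_slots(4, 2)
x2504_dimm_gb = implied_dimm_gb(4, x2504_slots)

print(f"X 2404: {x2404_slots} DIMMs, {x2404_dimm_gb:.0f} GB each")
print(f"X 2504: {x2504_slots} DIMMs, {x2504_dimm_gb:.0f} GB each")
```

Both maximums work out to the same per-DIMM capacity, so the extra 2 TB on the DDR4 part comes entirely from the third DIMM per channel.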

Marvell highlighted some unique aspects of the Structera product line – the inline compression optimizes available DRAM capacity, and the 3 DIMMs per channel support for the DDR4 expander maximizes the DRAM per expander (compared to competing solutions). The 5nm process reduces power consumption, and the parts support access from multiple hosts. The inclusion of Arm Neoverse V2 cores is a first for a CXL accelerator, allowing the delegation of compute tasks to improve overall system performance.

While Marvell has announced specifications for the Structera parts, it seems that sampling is still a few quarters away. Notably, Marvell’s recent roadmaps and announcements have focused on creating products tailored to the needs of high-volume customers. The Structera product line is no exception: hyperscalers want to recycle their DDR4 memory modules and are eagerly awaiting the expander parts.

CXL is beginning its gradual ramp-up, with the steep growth segment still some time away. Nonetheless, as more host systems with CXL support get deployed, products like the Structera accelerator line become valuable for enhancing server efficiency.
