Presentation
CXL-ECC: an Efficient LRC-based on-CXL-Memory-eXpander-Controller ECC to Enhance Reliability and Performance of DRAM Error Correction
Description
Compute eXpress Link (CXL) offers an effective interface for connecting CPUs with external computing and memory devices. The CXL Memory eXpander Controller (CXL-MXC) is gaining attention for its ability to boost memory capacity and bandwidth more efficiently than traditional DDR DIMMs. Despite extensive research on MXC performance and adaptation, DRAM reliability in the CXL architecture remains underexplored. Traditional fault-tolerance mechanisms such as replication or RAID-based systems would significantly increase bandwidth overhead in the CXL fabric, adversely affecting system performance. To address this, we propose the on-CXL-Memory-eXpander-Controller ECC (CXL-ECC). By using Locally Recoverable Codes (LRC) as the Inter-Channel-ECC (IC-ECC) and offloading its processing to the expander, we eliminate extra memory access requests in the CXL fabric.
Our experiments demonstrate that this approach enhances DRAM reliability by a factor of more than $10^9$ compared to state-of-the-art ECC methods. Relative to a RAID-enabled CXL switch, it reduces the additional bandwidth overhead from 63.5\% to 3.4\% and improves system performance by 12\%.
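As a rough illustration of why a locally recoverable code keeps repair traffic small, the sketch below protects each small group of channel blocks with a single XOR parity, so one lost block can be rebuilt from its own group alone. This is an assumption-laden toy, not the construction used in the paper: the group size, block layout, and the function names `encode_local_groups` and `recover_in_group` are all hypothetical, and the global parities that an LRC typically adds for multi-block failures are omitted.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_local_groups(data_blocks, group_size):
    """Split blocks into local groups and append one XOR parity per group.

    Hypothetical layout: each block stands in for the data held by one channel.
    """
    groups = []
    for start in range(0, len(data_blocks), group_size):
        group = list(data_blocks[start:start + group_size])
        groups.append(group + [xor_blocks(group)])
    return groups


def recover_in_group(group, lost_index):
    """Rebuild one lost block from the surviving blocks of the same group only."""
    survivors = [blk for i, blk in enumerate(group) if i != lost_index]
    return xor_blocks(survivors)


# Six hypothetical channel blocks, protected in two local groups of three.
channels = [bytes([i] * 8) for i in range(1, 7)]
groups = encode_local_groups(channels, group_size=3)
rebuilt = recover_in_group(groups[0], lost_index=1)
assert rebuilt == channels[1]
```

In this toy layout, repairing one block reads only the three surviving blocks of its group rather than every channel. Running such a repair on the expander itself, as the abstract describes for CXL-ECC, would keep those reads off the CXL fabric.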
Event Type
Research Manuscript
Time
Wednesday, June 25, 3:45pm - 4:00pm PDT
Location
3008, Level 3
Systems
SYS6: Time-Critical and Fault-Tolerant System Design