Presentation
PIMDup: An Optimized Deduplication Design on a Real Processing-in-Memory System
Description
Data deduplication enhances storage efficiency through non-destructive compression but is often hindered by the chunking process, which requires scanning the entire dataset. While traditional methods leveraging conventional architectures and hardware accelerators (e.g., GPUs and FPGAs) have been developed to address this issue, they continue to face challenges related to excessive data movement and associated performance degradation. These limitations stem from the von Neumann architecture, where computation and storage are separated in a processor-centric design, necessitating multiple memory hierarchy traversals and causing inefficiencies. To overcome these challenges, we explore UPMEM's DPU, a processing-in-memory (PIM) technology that reduces data movement by performing computations directly within memory. However, designing a deduplication system for DPUs presents unique obstacles, including restricted inter-DPU data sharing, the absence of native multiplication support, and significant DPU-CPU communication overhead. In response, we propose PIMDup, a DPU-optimized deduplication system that addresses these constraints through efficient parallelization, DPU-friendly chunking techniques, and reduced data transfer volumes. Experimental results demonstrate that PIMDup improves chunking performance without compromising deduplication accuracy, achieving a 1.67× speedup over CPU-based systems while maintaining 100% result consistency.
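The abstract does not say which chunking algorithm PIMDup uses, so the following is only a minimal sketch of the general idea of multiplication-free content-defined chunking. It assumes a Gear-style rolling hash (shift, add, and table lookup only) and runs on an ordinary CPU in Python; it does not use the UPMEM SDK. The names `chunk_boundaries`, `deduplicate`, `restore`, and the size parameters are hypothetical and chosen for illustration.

```python
import hashlib
import random

# Hypothetical parameters for illustration; not taken from PIMDup.
MIN_SIZE = 64         # never cut a chunk shorter than this
MAX_SIZE = 1024       # always cut once a chunk reaches this size
BOUNDARY_MASK = 0xFF  # cut when the low 8 bits of the hash are zero

# Fixed lookup table of 256 pseudo-random 32-bit values (seeded for repeatability).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk_boundaries(data):
    """Return the end offset of every chunk in data."""
    ends = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Rolling hash update uses only shift, add, and mask: no multiplication.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if size >= MAX_SIZE or (size >= MIN_SIZE and (h & BOUNDARY_MASK) == 0):
            ends.append(i + 1)
            start = i + 1
            h = 0  # restart the hash at each chunk boundary
    if start < len(data):
        ends.append(len(data))  # trailing partial chunk
    return ends


def deduplicate(data):
    """Split data into chunks and keep one copy of each distinct chunk."""
    store = {}   # fingerprint -> chunk bytes
    recipe = []  # ordered fingerprints needed to rebuild data
    start = 0
    for end in chunk_boundaries(data):
        chunk = data[start:end]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)
        recipe.append(digest)
        start = end
    return store, recipe


def restore(store, recipe):
    """Rebuild the original bytes from the chunk store and recipe."""
    return b"".join(store[d] for d in recipe)
```

In this sketch the chunking pass is the part that scans every byte, which is why the abstract singles it out as the bottleneck; the fingerprint lookup only touches each chunk once.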
Event Type
Research Manuscript
Time
Tuesday, June 24, 4:30pm - 4:45pm PDT
Location
3001, Level 3
Design
DES2B: In-memory and Near-memory Computing Architectures, Applications and Systems