
Discovering and Exploiting Untapped Buffer Resources in Many-Core DNN Accelerators
Description
In large-scale DNN inference accelerators, the many-core architecture has emerged as a predominant design, and layer-pipeline (LP) mapping is a mainstream mapping approach. However, our experimental findings and theoretical analysis uncover a hardware-independent and prevalent flaw in applying LP mapping to many-core accelerators: buffer space is significantly underutilized across many cores, leaving substantial room for optimization. Building on this observation, we develop BufferProspector, a universal and efficient buffer allocation strategy comprising a Buffer Requirement Calculator and a Buffer Allocator, which exploits these unused buffers to address the timing mismatch inherent in LP mapping. Compared with Tangram, the state-of-the-art (SOTA) open-source LP mapping framework, BufferProspector simultaneously improves energy efficiency and performance by an average of 1.44x and 2.26x, respectively. BufferProspector will be open-sourced.
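To make the idea concrete, below is a minimal, hypothetical sketch of how spare buffer capacity could be redistributed across cores; the abstract does not describe BufferProspector's actual algorithm, so the `Stage` type, the `allocate_spare_buffers` function, and the greedy borrowing policy are illustrative assumptions only.

```python
# Hypothetical illustration, NOT the paper's algorithm: given per-core buffer
# capacity and the buffer each pipeline stage needs, let stages whose home
# core falls short borrow slack capacity from underutilized cores.
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    core: int            # core this pipeline stage is mapped to
    required_kib: float   # buffer the stage needs to avoid stalling


def allocate_spare_buffers(stages, core_capacity_kib):
    """Greedy reallocation of unused buffer space (illustrative only)."""
    # Slack remaining on each core after serving its own stages.
    slack = dict(core_capacity_kib)
    shortfall = {}
    for s in stages:
        grant = min(s.required_kib, slack[s.core])
        slack[s.core] -= grant
        if grant < s.required_kib:
            shortfall[s.name] = s.required_kib - grant

    # Stages still short borrow from the cores with the most slack first.
    borrowed = {name: [] for name in shortfall}
    for name, need in shortfall.items():
        for core in sorted(slack, key=slack.get, reverse=True):
            if need <= 0:
                break
            take = min(need, slack[core])
            if take > 0:
                slack[core] -= take
                need -= take
                borrowed[name].append((core, take))
        shortfall[name] = need  # any remainder is unmet demand
    return borrowed, shortfall


if __name__ == "__main__":
    stages = [Stage("conv1", core=0, required_kib=96.0),
              Stage("conv2", core=1, required_kib=160.0)]
    capacity = {0: 128.0, 1: 128.0, 2: 128.0, 3: 128.0}
    borrowed, unmet = allocate_spare_buffers(stages, capacity)
    print(borrowed)  # conv2 borrows 32 KiB from an otherwise idle core
    print(unmet)     # 0 remaining shortfall in this toy example
```

The sketch only conveys the general premise reported in the abstract: buffers left idle by LP mapping on some cores can be put to work for stages that need more capacity.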
Event Type
Research Manuscript
Time
Wednesday, June 25, 2:00pm - 2:15pm PDT
Location
3001, Level 3
Topics
AI
Tracks
AI3: AI/ML Architecture Design