Presentation

CaMDN: Enhancing Cache Efficiency for Multi-tenant DNNs on Integrated NPUs
Description
As DNN applications proliferate, multi-tenant execution on a single SoC is becoming a prevailing trend. Although methods have been proposed to improve multi-tenant performance, the impact of the shared cache has not been well studied. This paper proposes CaMDN, an architecture-scheduling co-design that enhances cache efficiency. Specifically, a lightweight architecture supports NPU-controlled regions inside the shared cache to eliminate unexpected cache contention, and a cache scheduling method, comprising cache-aware mapping and dynamic cache allocation, improves shared cache utilization. CaMDN reduces memory access by 33.4% and achieves a model speedup of up to 2.56x (1.88x on average).
Event Type
Research Manuscript
Time
Monday, June 23, 3:30pm - 3:45pm PDT
Location
3002, Level 3
Topics
Design
Tracks
DES1: SoC, Heterogeneous, and Reconfigurable Architectures