Presentation
Flexible Integration of a Neural Network Library in TensorFlow Lite Framework for Efficient Programmable Near-Memory Computing Architectures
Description
This paper introduces a new exploration methodology for Near-Memory Computing (NMC) architectures to address the need for accurate performance evaluation. We propose a programming and compilation tool chain dedicated to flexibly programming and generating binary code for an NMC-based system. We develop a new NMC-based library in the TensorFlow Lite (TFLite) framework that groups a subset of Neural Network (NN) kernels optimized to respect the specifications of the target NMC architecture. We use the proposed tool chain to compile this NMC-based library in TFLite and to run CNN applications such as image classification, person detection, and natural language processing. Our methodology uses a QEMU-based plugin model of the NMC architecture. This modeling enables accurate design space exploration to support hardware and software co-design for NMC systems. Consequently, we can determine the optimal memory size for the NMC block to maximize performance gains compared to purely scalar architectures.
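The abstract's final claim, sizing the NMC block's memory to maximize gains over a purely scalar core, can be illustrated with a toy design-space sweep. The kernel names, working-set sizes, time shares, and speedup factors below are invented placeholders, not figures from the paper, and the simple Amdahl-style cost model stands in for the QEMU-based plugin model the authors actually use.

```python
# Toy design-space sweep: for each candidate NMC memory size, estimate
# overall speedup assuming a kernel whose working set fits in NMC memory
# runs at its accelerated rate, while the rest fall back to the scalar core.
# All numbers are illustrative placeholders, not results from the paper.

KERNELS = [
    # (name, working-set size in KiB, share of scalar runtime, NMC speedup)
    ("conv2d",      96, 0.55, 4.0),
    ("depthwise",   48, 0.25, 3.0),
    ("fully_conn", 160, 0.15, 5.0),
    ("softmax",      8, 0.05, 1.5),
]

def overall_speedup(mem_kib):
    """Amdahl-style estimate: only kernels that fit in NMC memory speed up."""
    total = 0.0
    for _name, ws, share, factor in KERNELS:
        total += share / factor if ws <= mem_kib else share
    return 1.0 / total

def best_memory_size(candidates):
    """Smallest candidate size that achieves the maximum overall speedup."""
    best = max(candidates, key=lambda m: (overall_speedup(m), -m))
    return best, overall_speedup(best)

if __name__ == "__main__":
    for m in (32, 64, 128, 256):
        print(f"{m:4d} KiB -> {overall_speedup(m):.2f}x")
```

Under this model the speedup curve is a step function of memory size, so the exploration reduces to finding the knee where adding more NMC memory no longer admits a significant kernel, which is the kind of trade-off the paper's hardware/software co-design flow is meant to expose.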
Event Type
Engineering Presentation
Time
Tuesday, June 24, 4:45pm - 5:00pm PDT
Location
2008, Level 2
AI
Systems and Software