Presentation
LLM-Driven TPU Design: Crafting Custom Tensor Processing Accelerators with APTPU-Gen
Description
The increasing complexity and scale of Deep Neural Networks (DNNs) necessitate specialized tensor accelerators, such as Tensor Processing Units (TPUs), to meet various computational and energy efficiency requirements. Nevertheless, designing an optimal TPU remains challenging due to the high level of domain expertise required, the considerable manual design time, and the lack of high-quality, domain-specific datasets. This paper introduces APTPU-Gen, the first Large Language Model (LLM) based framework designed to automate the approximate TPU generation process, focusing on systolic array architectures. APTPU-Gen is supported by a meticulously curated, comprehensive, and open-source dataset that covers a wide range of spatial array designs and approximate multiply-and-accumulate units, enabling design reuse, adaptation, and customization for different DNN workloads. The proposed framework leverages Retrieval-Augmented Generation (RAG) as an effective solution for building LLMs in a data-scarce hardware domain, addressing one of their most persistent issues: hallucination. APTPU-Gen transforms high-level architectural specifications into optimized low-level implementations through an effective hardware generation pipeline. Our extensive experimental evaluations demonstrate the superior performance, power, and area efficiency of the generated designs, with minimal deviation from the user's reference values, setting a new benchmark to drive advancements in next-generation design automation tools powered by LLMs.
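To make the RAG step described above concrete, here is a minimal Python sketch of how retrieval over a curated design dataset could ground an LLM prompt before RTL generation. This is an illustration only: the dataset entries, keyword-overlap scoring, and prompt format are assumptions of the example, not details taken from the APTPU-Gen paper.

```python
# Hypothetical sketch of a RAG pipeline for hardware generation: retrieve
# reference systolic-array / approximate-MAC entries from a curated dataset,
# then assemble a grounded prompt for the LLM. All names and contents below
# are invented for illustration; they are not from the paper's dataset.

from dataclasses import dataclass


@dataclass
class DesignEntry:
    name: str       # human-readable design name
    tags: set[str]  # architectural keywords used for retrieval
    rtl: str        # reference Verilog snippet from the curated dataset


def retrieve(spec: str, corpus: list[DesignEntry], k: int = 2) -> list[DesignEntry]:
    """Toy retrieval: rank dataset entries by keyword overlap with the spec."""
    words = set(spec.lower().split())
    return sorted(corpus, key=lambda e: -len(e.tags & words))[:k]


def build_prompt(spec: str, retrieved: list[DesignEntry]) -> str:
    """Ground the generation request in retrieved reference designs so the
    LLM completes against known-good RTL instead of hallucinating."""
    context = "\n\n".join(f"// Reference: {e.name}\n{e.rtl}" for e in retrieved)
    return (
        "You are a hardware design assistant.\n"
        f"Reference designs:\n{context}\n\n"
        f"Task: generate synthesizable Verilog for: {spec}\n"
    )


corpus = [
    DesignEntry("4x4 output-stationary systolic array",
                {"systolic", "array", "output-stationary", "4x4"},
                "module sa4x4(...); /* ... */ endmodule"),
    DesignEntry("approximate 8-bit MAC (truncated partial products)",
                {"approximate", "mac", "8-bit", "multiplier"},
                "module approx_mac8(...); /* ... */ endmodule"),
]

spec = "8x8 systolic array with approximate 8-bit MAC units"
prompt = build_prompt(spec, retrieve(spec, corpus))
print(prompt)  # would be sent to whatever LLM backend the framework uses
```

In a real pipeline, the keyword overlap would presumably be replaced by embedding-based similarity search over the open-source dataset, but the grounding principle is the same: the model completes against retrieved reference designs rather than generating RTL unconstrained.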
Event Type
Networking
Work-in-Progress Poster
Time
Monday, June 23, 6:00pm - 7:00pm PDT
Location
Level 2 Lobby