Presentation

Adora: An Arithmetic and Dynamic Operation Reconfigurable Accelerator using In-Memory Look-Up Tables
Description
In- and near-memory computing has been shown to offer significant performance benefits to data-centric applications because it alleviates the memory-wall problem. Existing processing-in-memory (PIM) works aim to accelerate the transformers within Large Language Models (LLMs), which are composed of purely arithmetic sequences. However, the majority of in-memory accelerators lack the ability to support branching operations. Here, an enhancement to a previous Look-Up Table (LUT) based PIM architecture that adds support for branching instructions is demonstrated. The enhanced design allows more complex branching kernels to be implemented. While "tokens per second" has become the de facto metric for the performance of language-model accelerators, many accelerator designs are limited to implementing the arithmetic component of the network. Existing accelerators retain an inherent bottleneck in the general-purpose computing unit, which first translates the input data into the symbolic representation used in the network body and then transfers that data to the accelerator. The presented design addresses these restrictions, and performance comparisons for power and throughput against conventional processors are provided.
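
The general idea behind LUT-based PIM with branch support can be illustrated with a minimal sketch: arithmetic results are precomputed into tables that are read instead of computed, and a branch is realized as a predicated selection between candidate table outputs. This is an illustrative assumption about the concept only, not the Adora design; all names, the 4-bit operand width, and the absolute-difference kernel are invented for this example.

# Illustrative sketch only -- not the Adora architecture. It models the generic
# idea of LUT-based in-memory compute: arithmetic is replaced by table reads,
# and a data-dependent branch becomes a predicated selection between results.
# The 4-bit operand width and all function names are assumptions for this demo.

BITS = 4
MASK = (1 << BITS) - 1

# Precompute LUTs for two arithmetic ops over all 4-bit operand pairs.
ADD_LUT = [[(a + b) & MASK for b in range(1 << BITS)] for a in range(1 << BITS)]
SUB_LUT = [[(a - b) & MASK for b in range(1 << BITS)] for a in range(1 << BITS)]

def lut_op(lut, a, b):
    """Emulate an in-memory table read: the 'computation' is just an index."""
    return lut[a & MASK][b & MASK]

def predicated_select(pred, if_true, if_false):
    """Emulate a branch as a data-dependent selection between two LUT results."""
    return if_true if pred else if_false

def abs_diff(a, b):
    """Toy branching kernel |a - b| built from LUT reads plus one selection."""
    diff = lut_op(SUB_LUT, a, b)
    neg = lut_op(SUB_LUT, b, a)
    return predicated_select(a < b, neg, diff)

if __name__ == "__main__":
    print("3 + 9 (mod 16) =", lut_op(ADD_LUT, 3, 9))
    for a, b in [(3, 9), (9, 3), (7, 7)]:
        print(f"|{a} - {b}| = {abs_diff(a, b)}")

In this style of execution, the "branch" never redirects control flow; both candidate results are produced by table lookups and the condition only selects between them, which is one common way accelerators without a program counter handle data-dependent kernels.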
Event Type
Networking
Work-in-Progress Poster
Time
Sunday, June 22, 6:00pm - 7:00pm PDT
Location
Level 3 Lobby