Presentation
Emerging Technology Applications on Personalized Edge LLMs
Description
Edge-based large language models (edge LLMs) can preserve the promising abilities of LLMs while ensuring user data privacy. Additionally, edge LLMs can be used in various fields without internet connectivity constraints. However, edge LLMs face significant challenges in training, deployment, and inference: limited memory, computational power, and data I/O bandwidth can hinder the deployment of advanced LLMs on edge devices. These constraints often result in poor performance in customization, real-time user interaction, and adaptation to novel situations. Traditional acceleration methods, designed primarily for advanced computing platforms, may not be optimal for all types of edge devices.

As a complementary solution, Compute-in-Memory (CiM) architectures based on emerging non-volatile memory (NVM) devices offer promising opportunities. Having already demonstrated numerous advantages in traditional neural networks, these architectures can help overcome the computational memory bottleneck of edge devices and reduce competition for core computational resources. Through software-hardware co-design and co-optimization, NVCiM can significantly enhance edge LLM performance in resource-limited environments. Moreover, NVCiM-based edge LLM systems are more cost-effective than LLMs running on high-performance computing devices, making them suitable for various personalized applications, particularly in healthcare and medical fields.
Section 1: Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices (Jinjun Xiong)
Section 2: Enabling On-Device Large Language Model Personalization with Self-Supervised Data Selection and Synthesis (Yiyu Shi)
Section 3: Robust Implementation of Retrieval-Augmented Generation on Edge-based Computing-in-Memory Architectures (Yiyu Shi)
Section 4: NVCiM-PT: An NVCiM-assisted Prompt Tuning Framework for Edge LLMs (Yiyu Shi)
Section 5: Tiny-Alignment: Bridging Automatic Speech Recognition with LLM on Edge (Jinjun Xiong)
Section 6: Analysis: Do Edge Large Language Models Allow Fair Access to Healthcare for Language-Impaired Users? (Yiyu Shi)
Event Type
Tutorial
Time
Sunday, June 22, 3:30pm - 5:00pm PDT
Location
3008, Level 3