BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
X-LIC-LOCATION:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260402T024507Z
LOCATION:3001\, Level 3
DTSTART;TZID=America/Los_Angeles:20250625T103000
DTEND;TZID=America/Los_Angeles:20250625T120000
UID:dac_DAC 2025_sess114@linklings.com
SUMMARY:Smarter Compute, Faster Inference: Optimizing AI Systems on Edge
DESCRIPTION:As AI continues to push toward real-time and resource-efficien
 t processing, optimizing both compute and memory across diverse hardware p
 latforms becomes crucial. This session introduces techniques that aim to i
 mprove AI efficiency at the edge, including federated learning frameworks 
 that account for real-world constraints, proactive strategies for mitigati
 ng inference cold starts, and novel data streaming techniques that decoupl
 e memory access. Additionally, new approaches in cross-layer simulation, t
 ask-oriented detection, and low-latency graph processing on FPGAs showcase
  how hardware-software co-design can unlock smarter, faster, and more effi
 cient AI systems.\n\nDataMaestro: A Versatile and Efficient Data Streaming
  Engine Bringing Decoupled Memory Access To Dataflow Accelerators\n\nDeep 
 Neural Networks (DNNs) have achieved remarkable success across various int
 elligent tasks but encounter performance and energy challenges in inferenc
 e execution due to data movement bottlenecks. We introduce DataMaestro, a 
 versatile and efficient data streaming unit that brings the decoupled acc.
 ..\n\n\nXiaoling Yi, Yunhao Deng, Ryan Antonio, and Fanchen Kong (KU Leuve
 n); Guilherme Paim (INESC-ID); and Marian Verhelst (KU Leuven)\n----------
 -----------\nSimPhony: A Device-Circuit-Architecture Cross-Layer Modeling 
 and Simulation Framework for Heterogeneous Electronic-Photonic AI System\n
 \nElectronic-photonic integrated circuits (EPICs) present a transformative
  solution for next-generation high-performance artificial intelligence (AI
 ). The advancement of EPIC AI systems, however, requires extensive interdi
 sciplinary research across devices, circuits, architecture, and design aut
 omatio...\n\n\nZiang Yin (Arizona State University); Meng Zhang, Amir Bego
 vic, and Zhaoran Huang (Rensselaer Polytechnic Institute); and Jeff Zhang 
 and Jiaqi Gu (Arizona State University)\n---------------------\nPaSK: Cold
  Start Mitigation for Inference with Proactive and Selective Kernel Loadin
 g on GPUs\n\nToday, DNN inference is widely adopted, with numerous inferen
 ce services being spawned from scratch across instances in scenarios such
  as spot serving, serverless scaling and edge computing, where frequent
  start-stops are required. In this work, we first delve into the
  inference workfl
 ow and uncover th...\n\n\nXuanteng Huang, Jiangsu Du, Nong Xiao, and Xianw
 ei Zhang (Sun Yat-sen University)\n---------------------\nFLAG: An FPGA-Ba
 sed System for Low-Latency GNN Inference Service Using Vector Quantization
 \n\nEnabling real-time GNN inference services requires low end-to-end late
 ncy to meet service level agreements. However, intensive preparation steps
  and the neighborhood explosion problem pose significant challenges to eff
 icient GNN inference serving. In this paper, we propose FLAG, an FPGA-base
 d GNN in...\n\n\nYunki Han, Taehwan Kim, Jiwan Kim, Seohye Ha, and Lee-Sup
  Kim (Korea Advanced Institute of Science and Technology (KAIST))\n-------
 --------------\niTaskSense: Task-Oriented Object Detection in Resource-Con
 strained Environments\n\nTask-oriented object detection is increasingly es
 sential for intelligent sensing applications, enabling AI systems to opera
 te autonomously in complex, real-world environments such as autonomous dri
 ving, healthcare, and industrial automation. Conventional models often str
 uggle with generalization, re...\n\n\nSungHeon Jeong, Hamza Errahmouni Bar
 kam, Hyunwoo Oh, Hanning Chen, Tamoghno Das, Zhen Ye, and Mohsen Imani (Un
 iversity of California, Irvine)\n---------------------\nPracMHBench: Re-ev
 aluating Model-Heterogeneous Federated Learning Based on Practical Edge De
 vice Constraints\n\nFederating heterogeneous models on edge devices with d
 iverse resource constraints has been a notable trend in recent years. Comp
 ared to traditional federated learning (FL) that assumes an identical mode
 l architecture to cooperate, model-heterogeneous FL is more practical and 
 flexible since the model...\n\n\nYuanchun Guo, Bingyan Liu, Yulong Sha, an
 d Zhensheng Xian (Beijing University of Posts and Telecommunications)\n\nT
 opics: AI\n\nTracks: AI4: AI/ML System and Platform Design\n\nSession Chai
 rs: Xiaoxuan Yang (University of Virginia, Stanford University) and Shihao
  Song (Nvidia)
END:VEVENT
END:VCALENDAR
