Presentation

MIRACLE: Multimodal Information Retrieval via a Combined In-Memory Processing and Content Addressable Memory Approach
Description

The rapid advancement of information technology has brought multimodal information retrieval into the research spotlight. Neural networks, particularly Transformers, have emerged as the dominant solution for extracting multimodal feature vectors. While neural network acceleration has been extensively explored, the subsequent retrieval stage in multimodal scenarios remains under-optimized. Conventional retrieval approaches, such as cosine similarity sorting on von Neumann architectures, suffer from significant data movement and computational inefficiency. Hashing methods improve storage and computation efficiency, but they are difficult to implement energy-efficiently and lose accuracy because of modal heterogeneity. This paper presents a hybrid architecture that integrates in-memory processing (PIM) and content-addressable memory (CAM) to address these challenges. Transformer-extracted features are processed via in-memory random hashing that leverages device-intrinsic properties, and CAM then reduces the search space in parallel. A final cosine similarity reranking stage refines the results, balancing accuracy against energy efficiency. Experimental evaluations show that, compared with a traditional CPU-based cosine similarity retrieval baseline, the proposed method 1) achieves an almost identical level of accuracy, dramatically outperforming pure CAM-based Hamming distance retrieval approaches; and 2) reduces latency by 9.45× and energy consumption by 30.20×.
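The three-stage flow described above (random hashing, Hamming-distance search space reduction, cosine similarity reranking) can be illustrated with a small software sketch. This is not the authors' implementation: it assumes sign random projection as a stand-in for the device-based in-memory hashing, a sorted Hamming-distance scan as a stand-in for the parallel CAM search, and all function names and parameters (`make_hyperplanes`, `hash_code`, `retrieve`, `n_candidates`, `top_k`) are hypothetical.

```python
import math
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes (assumed stand-in for device randomness)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, planes):
    """Sign random projection: one bit per hyperplane, packed into an int."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0.0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, database, planes, n_candidates, top_k):
    """Return indices of the top_k database vectors for the query."""
    q_code = hash_code(query, planes)
    # In a real system the database codes would be computed once and stored.
    codes = [hash_code(v, planes) for v in database]
    # Stage 1: shrink the search space by Hamming distance (software stand-in for CAM).
    by_hamming = sorted(range(len(database)), key=lambda i: hamming(q_code, codes[i]))
    candidates = by_hamming[:n_candidates]
    # Stage 2: rerank only the surviving candidates by exact cosine similarity.
    reranked = sorted(candidates, key=lambda i: cosine(query, database[i]), reverse=True)
    return reranked[:top_k]
```

The design point the sketch tries to capture is that the expensive cosine computation runs only on `n_candidates` vectors rather than on the whole database, while the cheap binary comparison handles the bulk of the filtering.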