Presentation

Energy-Efficient Large-Scale Vector Similarity Search in NAND-Flash via Hybrid Matching
Description
Vector similarity search (VSS) is crucial in many AI applications, such as few-shot learning (FSL) and approximate nearest-neighbor search (ANNS), but it demands significant memory capacity and incurs substantial energy costs for data transfers during large-scale comparisons. Various in-memory search technologies have been developed to improve energy efficiency, with NAND-based multi-bit content-addressable memory (MCAM) standing out as a promising solution for its high density and large capacity. MCAM can operate in exact-search (ES) mode, supporting only perfect matches with low energy cost, or in approximate-search (AS) mode, enabling flexible VSS. However, AS mode incurs significant energy waste when comparing queries with non-target stored vectors.
To address this issue, we propose Hybrid-M, a 3D NAND-based in-memory VSS architecture that integrates both modes into a single hybrid matching process, using ES mode as a filter to reduce redundant searches for AS mode. We apply three techniques to optimize this integration: range encoding for multi-level cells (MLC) to enhance filtering, search voltage shifts to mitigate the impact on AS accuracy and reduce matching currents, and a filtering-aware training method to further improve reliability and energy efficiency. Results show that Hybrid-M achieves comparable accuracy while reducing energy consumption by 67% to 83% compared to MCAM-based VSS using only AS mode, across various many-class FSL and ANNS workloads.
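The filter-then-search idea can be illustrated with a small software analogy. The sketch below is not the Hybrid-M circuit or its encoding scheme: the constants `LEVELS`, `RANGE_WIDTH`, and `FILTER_DIMS`, the use of L1 distance for the approximate step, and all function names are assumptions chosen only for illustration. It shows how an exact match on coarse range codes could shrink the set of vectors that the more expensive approximate comparison has to visit.

```python
# Software analogy only: every constant and name here is a hypothetical choice,
# not taken from the Hybrid-M design.
LEVELS = 8        # assumed number of levels a stored value can take (0..7)
RANGE_WIDTH = 4   # assumed width of one coarse range used by the exact filter
FILTER_DIMS = 2   # assumed number of leading dimensions the filter inspects


def coarse_code(vec):
    """Map the filter dimensions to coarse range indices (a stand-in for range encoding)."""
    return tuple(v // RANGE_WIDTH for v in vec[:FILTER_DIMS])


def l1_distance(a, b):
    """Sum of absolute differences, used here as the approximate-match metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def approximate_search(query, stored, candidates):
    """Return (index of the closest candidate, number of comparisons performed)."""
    candidates = list(candidates)
    best = min(candidates, key=lambda i: l1_distance(query, stored[i]))
    return best, len(candidates)


def hybrid_search(query, stored):
    """Exact-match filter on coarse codes, then approximate search on the survivors."""
    code = coarse_code(query)
    survivors = [i for i, v in enumerate(stored) if coarse_code(v) == code]
    if not survivors:
        # Assumed fallback: if the filter rejects everything, search all vectors.
        survivors = list(range(len(stored)))
    return approximate_search(query, stored, survivors)


stored = [
    [0, 1, 5, 7],
    [1, 2, 6, 6],
    [5, 6, 0, 1],
    [7, 4, 2, 3],
    [2, 7, 3, 3],
]
query = [1, 2, 6, 7]

print(hybrid_search(query, stored))                           # filter + approximate
print(approximate_search(query, stored, range(len(stored))))  # approximate only
```

In this toy data set both calls select the same stored vector, but the hybrid path compares the query against two vectors instead of five. A filter of this kind can also discard the true nearest neighbor when it falls into a different coarse range, which is one way filtering can affect accuracy.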