Close

Presentation

LUT-MM: An Efficient Lookup Table-Based Approach for Modular Multiplication
DescriptionDue to their merits that guarantee the available but invisible property of data during processing, privacy-preserving computing techniques have attracted wide attentions from both academic and industrial fields. However, performance bottleneck has been restricting its large-scale application. Although the protocols and algorithms of privacy-preserving computing may vary, some underlying fundamental operations such as modular multiplication and modular addition are widely and frequently used. The efficiency of modular multiplication has a significant impact on the overall performance of privacy-preserving computations. Existing implementation approaches for modular multiplication are either based on Montgomery's or Barrett's algorithm. However, both of these methods require at least three times of full-word multiplications to compute the modular result, making it difficult to balance throughput and chip area for hardware accelerator design. In this paper, we present a novel architecture for modular multiplication called LUT-MM to seek for optimal tradeoff between throughput and area. LUT-MM can also achieve better performance if large area is acceptable. LUT-MM integrates a full-word multiplier based on Karatsuba's algorithm, along with a 3-stage reduction module based on multiple look-up tables. Experimental results show that the proposed LUT-MM achieves a throughput of 28700 Mbps while consuming 34938 LUTs, 13479 FFs, and 105 BRAMs on the Xilinx Virtex-7 FPGA, demonstrating superior tradeoff between throughput and area.