Presentation
Extending RISC-V based GPGPU for fast execution of regular data-intensive kernels
Description
Vortex, a newly proposed open-source GPGPU platform based on the RISC-V ISA, offers a valid alternative for GPGPU research to the widely used modeling platforms based on commercial GPUs. Much like the push that the RISC-V movement gave to CPUs, Vortex can enable a myriad of fresh research directions for GPUs. However, as a young hardware platform, it lacks the performance competitiveness necessary for wide adoption. In particular, it underperforms on regular, memory-intensive kernels such as linear algebra routines, which form the basis of many applications, including Machine Learning. For such kernels, we identified control flow (CF) management overhead and memory orchestration as the main causes of performance degradation in the open-source Vortex GPGPU. To overcome these problems, this paper proposes (1) a hardware CF manager to accelerate branching and predication in regular loop execution and (2) decoupled memory streaming lanes to further hide memory latency with useful computation. The evaluation results for different kernels show 8x faster execution, a 10x reduction in dynamic instruction count, and a performance improvement from 0.35 to 1.63 GFLOP/s/mm². Thanks to the proposed enhancements, Vortex can become an ideal playground to enable GPGPU research for the next generation of Machine Learning.
Event Type
Networking
Work-in-Progress Poster
Time
Sunday, June 22
6:00pm - 7:00pm PDT
Location
Level 3 Lobby