CRAN: A Computational Redundancy-aware Accelerator for Convolutional Neural Networks
Description: Recent work on reducing the memory requirements of CNN models has left many redundant values in those models. This paper presents an architecture that lowers the computational demands of CNN inference by eliminating the redundant computations that these values cause. Analyzing the computation pattern makes it possible to predict where redundant computations will occur. We introduce a new mapping strategy that eliminates these redundant computations and reuses earlier computation results. By reducing the number of computations required, the proposed technique improves performance and saves energy. Experimental results show that the proposed architecture achieves an overall speedup of up to 1.66x and reduces energy consumption by 54%.
Time: Tuesday, July 11th, 6:00pm - 7:00pm PDT
Location: Level 2 Lobby
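
The description does not spell out the mapping strategy, so the following is only a minimal sketch of one generic way repeated values can make computations redundant, not the architecture's actual method. It assumes, purely for illustration, that the redundancy takes the form of repeated weight values in a dot product: activations that share a weight value are summed first, so each distinct value is multiplied only once. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def dot_naive(weights, activations):
    """Baseline dot product: one multiplication per weight."""
    return sum(w * a for w, a in zip(weights, activations))


def dot_grouped(weights, activations):
    """Dot product that reuses work across repeated weight values.

    Activations sharing the same weight value are accumulated first,
    so only one multiplication is needed per distinct weight value.
    """
    sums_by_weight = defaultdict(int)
    for w, a in zip(weights, activations):
        sums_by_weight[w] += a
    return sum(w * s for w, s in sums_by_weight.items())


def multiplication_counts(weights):
    """Return (naive multiplications, grouped multiplications)."""
    return len(weights), len(set(weights))


# Hypothetical example data: six weights but only three distinct values.
weights = [2, -1, 2, 0, -1, 2]
activations = [3, 5, 7, 11, 13, 1]

naive_result = dot_naive(weights, activations)
grouped_result = dot_grouped(weights, activations)
naive_mults, grouped_mults = multiplication_counts(weights)
```

Both functions return the same result for integer inputs, while the grouped version performs one multiplication per distinct weight value instead of one per weight. How much this saves in practice depends on how often values repeat in a given model.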