SpMMPlu: A Compiler Plug-in with Sparse IR for Efficient Sparse Matrix Multiplication
Description: Sparsity is arguably becoming the most critical dimension to explore for efficiency and scalability as deep learning models grow significantly larger and more complex. In this paper, we propose a compiler plug-in named SpMMPlu, which extends the representation and optimization capabilities for sparse matrix multiplication (SpMM) in current deep learning compiler frameworks. The key to SpMMPlu is a flexible intermediate representation, Sparse IR, which represents SpMM with various sparsity patterns by splitting the whole computing space into small, dense computational blocks (Meta ops). SpMMPlu delivers a 1.7x-8.4x average speedup in inference latency compared to the corresponding dense kernel.
Time: Thursday, July 13th, 10:40am - 10:55am PDT
Location: 3003, 3rd Floor
Track: AI/ML Application and Infrastructure
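
The description's idea of splitting an SpMM computing space into small dense blocks can be illustrated with a minimal sketch. This is not SpMMPlu's Sparse IR or its API, which the abstract does not specify; the function names (`find_nonzero_blocks`, `block_spmm`), the fixed rectangular tile shape, and the toy matrices are all assumptions made for illustration. The sketch tiles the sparse operand, keeps only the tiles that contain a nonzero, and runs a small dense multiply per kept tile.

```python
# Hypothetical illustration of block-wise SpMM; names and tiling scheme are
# assumptions, not SpMMPlu's actual Sparse IR.

def dense_matmul(a, b):
    """Reference dense product of two list-of-lists matrices."""
    inner, n_cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(n_cols)]
            for row in a]


def find_nonzero_blocks(a, bh, bw):
    """Return (row_start, col_start) for every bh x bw tile of `a` with a nonzero."""
    rows, cols = len(a), len(a[0])
    blocks = []
    for r0 in range(0, rows, bh):
        for c0 in range(0, cols, bw):
            tile_has_nonzero = any(
                a[r][c] != 0
                for r in range(r0, min(r0 + bh, rows))
                for c in range(c0, min(c0 + bw, cols))
            )
            if tile_has_nonzero:
                blocks.append((r0, c0))
    return blocks


def block_spmm(a, b, bh, bw):
    """Compute a @ b by visiting only the nonzero tiles of the sparse operand `a`."""
    rows, cols, n_cols = len(a), len(a[0]), len(b[0])
    out = [[0] * n_cols for _ in range(rows)]
    for r0, c0 in find_nonzero_blocks(a, bh, bw):
        # Each kept tile is treated as a small dense multiply-accumulate.
        for r in range(r0, min(r0 + bh, rows)):
            for c in range(c0, min(c0 + bw, cols)):
                a_rc = a[r][c]
                for j in range(n_cols):
                    out[r][j] += a_rc * b[c][j]
    return out


# Toy sparse operand: only two of its four 2x2 tiles contain nonzeros.
A = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 5, 6],
]
B = [
    [1, 0],
    [0, 1],
    [2, 2],
    [1, 3],
]

print(find_nonzero_blocks(A, 2, 2))  # [(0, 0), (2, 2)]
print(block_spmm(A, B, 2, 2))        # [[1, 2], [3, 4], [0, 0], [16, 28]]
```

In this toy case half of the tiles are skipped entirely, which is the source of the savings over a dense kernel; a real compiler would additionally choose tile shapes and schedules per sparsity pattern.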