Parallelizing Dense Matrix Factorizations with Malleable Thread-Level BLAS

Friday, November 9, 2018
Location: Hamerschlag Hall D210
Time: 11:00AM-12:00PM

Abstract

This talk will elaborate on a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multi-threaded version of BLAS. This approach also differs from the more sophisticated runtime-assisted implementations, which decompose the operation into tasks and identify dependencies via directives and runtime support. Instead, our strategy attains high performance by explicitly embedding a static look-ahead technique into the DMF code, in order to overcome the performance bottleneck of the panel factorization, and by realizing the trailing update via a cache-aware multi-threaded implementation of the BLAS. The parallel algorithms will be described at a high level of abstraction, which allows a direct derivation of the actual implementation and paves the road to a high-performance implementation of a considerable fraction of LAPACK functionality on any multicore platform with an OpenMP-like runtime.

Bio

Enrique S. Quintana-Ortí received the bachelor's and Ph.D. degrees in computer science from the Universidad Politécnica de Valencia, Spain, in 1992 and 1996, respectively. Currently, he is a Professor of Computer Architecture at the Universidad Jaume I, Castellón, Spain. He has published more than 200 papers in international conferences and journals, and has contributed to software libraries like PLiC/SLICOT, MAGMA, FLARE, BLIS and libflame for control theory and parallel linear algebra. He has also been a member of the programme committee for around 100 international conferences. In 2008 Enrique received an NVIDIA professor partnership award for his contributions to the acceleration of dense linear algebra kernels on graphics processors, and he also received two awards from NASA for his contributions to fault-tolerant dense linear algebra libraries for space vehicles. He has recently participated, or is participating, in EU projects on parallel programming, such as TEXT and INTERTWinE, and on energy efficiency, such as EXA2GREEN and OPRECOMP. His current research interests include parallel programming, linear algebra, energy consumption, transprecision computing and bioinformatics, as well as advanced architectures and hardware accelerators.