Session – Artificial Intelligence and Data

A General-purpose Parallel and Heterogeneous Task Programming System at Scale

Speaker: Dr. Tsung-Wei Huang (黃琮蔚)

Assistant Professor

Department of Electrical and Computer Engineering

University of Utah

Salt Lake City, UT 84112




Dr. Tsung-Wei Huang is an assistant professor in the Department of Electrical and Computer Engineering (ECE) at the University of Utah. His research focuses on making parallel and heterogeneous computing easier to handle. Dr. Huang received his PhD from the University of Illinois at Urbana-Champaign (UIUC) in 2017. Throughout his career, he has built software from the ground up, with research interests spanning parallel processing, computer-aided design, and machine learning. He is a stakeholder in several award-winning software projects and the recipient of the 2019 ACM/SIGDA Outstanding Ph.D. Dissertation Award.


Modern scientific computing applications rely on a heterogeneous mix of computational resources, comprising manycore CPUs and GPUs, to achieve transformational performance milestones. These milestones are not possible without high-level programming systems and runtimes that help manage implementation complexity. Decades of research in high-productivity computing have yielded methodologies and languages that offer either programmer productivity or performance scalability, but rarely both at once. The primary goal of this talk is thus to address the long-standing question: “How can we make it easier for developers to quickly write parallel and heterogeneous programs with both high performance and high productivity?” I will talk about a general-purpose task programming system we are developing to streamline the creation of parallel applications on CPUs and GPUs. Compared with existing frameworks, our system is very cost-efficient in exploiting high degrees of parallelism, including dynamic control flow and irregular computational patterns. On a particular circuit simulation workload, we achieved up to a 5× speed-up over industrial-strength systems and boosted programming productivity by 100×. I will also present our recent effort on accelerating a large-scale machine learning problem, which won the champion award in the 2020 IEEE HPEC Graph Challenge.