Session Speaker: Trusted AI
Why Do Simple Algorithms Work So Well for Machine Learning and Signal Processing Problems with Nested Structures?
Dr. Tianyi Chen (陈天翼)
Department of Electrical, Computer, and Systems Engineering
Rensselaer Polytechnic Institute
Troy, NY 12180
Tianyi Chen has been an assistant professor at Rensselaer Polytechnic Institute (RPI) since August 2019. Prior to joining RPI, he received his doctoral degree from the University of Minnesota (UMN). He has also held visiting positions at Harvard University, the University of California, Los Angeles, and the University of Illinois Urbana-Champaign. Dr. Chen is the inaugural recipient of the IEEE Signal Processing Society Best PhD Dissertation Award in 2020 and a recipient of the NSF CAREER Award in 2021. He is also a co-author of papers that received the Best Student Paper Award at the NeurIPS Federated Learning Workshop in 2020 and at IEEE ICASSP in 2021. Dr. Chen's current research focuses on the theory and application of optimization, machine learning, and statistical signal processing to problems emerging in data science and wireless communication networks.
Optimization has become an enabling factor in solving machine learning (ML) and signal processing (SP) problems. So far, the majority of efforts have been devoted to scaling up problems with relatively simple structures to the regimes of big data and large models. Stochastic, block-coordinate, decentralized, and federated algorithms are all effective means toward this end. However, many modern problems in ML and SP, such as meta learning, hyperparameter optimization, and reinforcement learning, inherently have nested structures, in which one problem builds upon the solution of another. To tackle problems with nested structures, we will introduce a unified alternating SGD algorithm (potentially on multiple variables) and present a tighter analysis of its use in nested learning problems. Under certain regularity conditions, the sample complexity of alternating SGD matches that of SGD for stochastic non-nested learning problems. Our results explain why simple SGD-type algorithms for stochastic nested problems work very well in practice without the need for further modification.
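To make the alternating-update idea concrete, the following is a minimal sketch (not the speaker's actual method or analysis) on a hypothetical one-dimensional stochastic bilevel problem: the inner problem tracks y*(x) = x, and the outer problem drives y*(x) toward 1, so both variables should approach 1. Each iteration takes one noisy gradient step on the inner variable, then one on the outer variable; the problem, noise model, and step sizes are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy bilevel problem (illustrative only):
#   inner:  y*(x) = argmin_y E[0.5 * (y - x)^2]  =>  y*(x) = x
#   outer:  min_x  0.5 * (y*(x) - 1)^2           =>  x* = 1

def inner_grad(x, y):
    # Stochastic gradient of the inner objective with respect to y.
    return (y - x) + 0.1 * rng.standard_normal()

def outer_grad(x, y):
    # Stochastic hypergradient; dy*/dx = 1 for this quadratic inner problem,
    # so the chain rule gives (y - 1) * 1 at the current inner iterate y.
    return (y - 1.0) + 0.1 * rng.standard_normal()

def alternating_sgd(steps=5000, alpha=0.05, beta=0.05):
    x, y = 0.0, 0.0
    for _ in range(steps):
        y -= beta * inner_grad(x, y)   # one SGD step on the inner variable
        x -= alpha * outer_grad(x, y)  # one SGD step on the outer variable
    return x, y

x, y = alternating_sgd()
print(x, y)  # both iterates drift toward the optimum at 1.0
```

The point of the sketch is that neither variable is solved to optimality before the other is updated; the inner iterate merely tracks its moving target, which is the regime the talk's analysis addresses.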