René Vidal (Johns Hopkins University): Mathematics of Deep Learning

When:
July 6, 2018 @ 9:00 am – 10:00 am
Where:
Hackerman Hall, Room B17
Abstract

The past few years have seen a dramatic increase in the performance of recognition systems thanks to the introduction of deep networks for representation learning. However, the mathematical reasons for this success remain elusive. A key issue is that the neural network training problem is non-convex, so optimization algorithms may not return a global minimum. In addition, the regularization properties of algorithms such as dropout remain poorly understood. Building on ideas from convex relaxations of matrix factorizations, this work proposes a general framework that allows for the analysis of a wide range of non-convex factorization problems, including matrix factorization, tensor factorization, and deep neural network training. The talk will describe sufficient conditions under which a local minimum of the non-convex optimization problem is a global minimum, and will show that if the size of the factorized variables is large enough, then a local descent algorithm can find a global minimizer from any initialization. The talk will also present an analysis of the optimization and regularization properties of dropout in the case of matrix factorization.

Bio

René Vidal is the Herschel L. Seder Professor in the Department of Biomedical Engineering. He joined Johns Hopkins in 2004. He holds joint appointments in the departments of Electrical and Computer Engineering, Computer Science, and Mechanical Engineering. He is the director of the Mathematical Institute for Data Science and the Vision Dynamics and Learning Lab, and is also a professor in the Institute for Computational Medicine, the Center for Imaging Science, and the Laboratory for Computational Sensing and Robotics.

Vidal’s research focuses on the development of theory and algorithms for the analysis of complex high-dimensional datasets such as images, videos, time series, and biomedical data. His lab creates new technologies for a variety of biomedical applications, including detection, classification, and tracking of blood cells in holographic images, classification of embryonic cardiomyocytes in optical images, and assessment of surgical skill in surgical videos.

Johns Hopkins University

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680