Deep Learning of Generative Models – Yoshua Bengio (University of Montreal)

March 11, 2014 all-day
3400 N Charles St
Baltimore, MD 21218

Deep learning has been highly successful in recent years, mostly thanks to progress in algorithms for training deep but supervised feedforward neural networks. These deep neural networks have become the state of the art in speech recognition, object recognition, and object detection. What’s next for deep learning? We argue that progress in unsupervised deep learning algorithms is key to progress on a number of fronts, such as better generalization to new classes from only one or a few labeled examples, domain adaptation, and transfer learning. It would also be key to extending output spaces from simple classification tasks to structured outputs, e.g., for machine translation or speech synthesis. This talk discusses some of the challenges involved in unsupervised learning of models with latent variables for AI tasks, in particular the difficulties due to the partition function, mixing between modes, and the potentially huge number of real or spurious modes. The manifold view of deep learning and experimental results suggest that many of these challenges could be greatly reduced by performing the hard work in the learned higher-level, more abstract spaces discovered by deep learning, rather than in the space of visible variables. Further gains are sought by exploiting the idea behind GSNs (Generative Stochastic Networks) and denoising auto-encoders: learning a Markov chain operator that generates the desired distribution rather than parametrizing that distribution directly. The advantage is that each step of the Markov chain transition involves fewer modes, i.e., a partition function that can be more easily approximated.

Yoshua Bengio (CS PhD, McGill University, 1991) was a postdoc with Michael Jordan at MIT and worked at AT&T Bell Labs before becoming a professor at U. Montreal. He has written two books and around 200 papers, the most cited being in the areas of deep learning, recurrent neural networks, probabilistic learning, NLP, and manifold learning. He is among the most cited Canadian computer scientists and is one of the scientists responsible for reviving neural network research with deep learning in 2006. He has sat on the editorial boards of top ML journals and on the board of the NIPS Foundation, holds a Canada Research Chair and an NSERC chair, is a Fellow of CIFAR, and has been program and general chair for NIPS. He is driven by his quest for AI through machine learning, involving fundamental questions on the learning of deep representations, the geometry of generalization in high dimension, manifold learning, biologically inspired learning, and challenging applications of ML. As of February 2014, Google Scholar finds almost 16,000 citations to his work, yielding an h-index of 55.

Center for Language and Speech Processing