Boqing Gong (Google) “Towards Visual Recognition in the Wild: Long-Tailed Sources and Open Compound Targets”

December 7, 2020 @ 12:00 pm – 1:15 pm
via Zoom


Fueling deep learning models with big, curated datasets can yield unprecedented results for recognizing objects, scenes, human activities, and attributes. However, as we continue to advance the boundary of visual recognition and the number of classes scales up, long tails become the elephant in the room, since object frequency often follows a power law in the real world. That is the challenge at training time. At inference time, model robustness becomes crucial because in-the-wild data often falls outside the training distribution (e.g., adversarial examples, data from new domains, etc.).

In this talk, I will present our recent work on long-tailed visual recognition and compound domain adaptation. We develop novel methods by drawing inspiration from meta-learning, memory networks, adversarial training, and curriculum learning. I will also present empirical studies that verify our approaches' effectiveness and demonstrate their application to query-efficient black-box adversarial attacks.


Boqing Gong is a research scientist at Google, Seattle, and a principal investigator at ICSI, Berkeley. His research in machine learning and computer vision focuses on sample-efficient learning (e.g., domain adaptation, few-shot, reinforcement, webly-supervised, and self-supervised learning) and the visual analytics of objects, scenes, human activities, and their attributes. Before joining Google in 2019, he worked at Tencent and was a tenure-track Assistant Professor at the University of Central Florida (UCF). He received an NSF CRII award in 2016 and an NSF BIGDATA award in 2017, both of which were the first of their kind ever granted to UCF. He has served as a (senior) area chair of NeurIPS, ICML, CVPR, ICCV, ECCV, AAAI, AISTATS, and WACV. He earned a Ph.D. in 2015 from the University of Southern California, where the Viterbi Fellowship partially supported his work.

Center for Language and Speech Processing