Piotr Szymański (Wrocław University of Science and Technology) “Bayesian Approaches to Model Comparison in ML/NLP and Quantifying the ASR-NLP Gap”

When:
October 12, 2020 @ 12:00 pm – 1:15 pm (UTC-04:00)
Where:
via Zoom
Cost:
Free

Abstract

This is a two-part seminar. The first part will be dedicated to discussing how we can apply Bayesian approaches to model comparison in NLP (and in ML in general), as well as the drawbacks and limitations of non-parametric hypothesis testing and the issue of experimental reproducibility. A paper related to this part is my work with Kyle Gorman from this year’s EMNLP: https://arxiv.org/abs/2010.03088

The second, shorter part of the seminar will be dedicated to the work we started with Piotr Żelasko at Avaya on identifying and bridging the performance gap between currently available ASR systems and NLP models for downstream language understanding tasks. This gap limits the ability to deliver high-quality spoken language understanding, among other areas in spontaneous conversations. This is linked to a series of papers we’ve been working on; the first of them just came out and was accepted at EMNLP Findings: https://arxiv.org/abs/2010.03432

I’d like to finish the seminar by starting a discussion about things we could perhaps do together in the future in the area of measuring ASR+NLP performance.

Biography

Piotr Szymański is an Assistant Professor at the Department of Computational Intelligence at the Wrocław University of Science and Technology and a Machine Learning Engineer at Avaya. He is professionally involved in data analysis, statistical reasoning, geospatial data science, natural language processing, machine learning and artificial intelligence techniques. He is an alumnus of the Top 500 Innovators program at Stanford University and has worked at several institutions over the years, including the Hasso Plattner Institute in Potsdam, the Jožef Stefan Institute in Ljubljana, the University of Notre Dame and the University of Technology Sydney. He is the author of scikit-multilearn, a popular Python library for multi-label classification. Apart from multi-label classification, Piotr has published papers concerning urban data, traffic analysis and bridging the gap between ASR and NLP in spoken language understanding systems. In his free time he is an urban activist in Wrocław and a member of a city district council.

Center for Language and Speech Processing