peerRTF: Robust MVDR Beamforming Using Graph Convolutional Network – Sharon Gannot (Bar-Ilan University, Israel)

When:
November 24, 2025 @ 12:00 pm – 1:15 pm
Where:
Hackerman Hall B17
Cost:
Free

Abstract

The Relative Transfer Function (RTF) is defined as the ratio of Acoustic Transfer Functions (ATFs) relating a source to a pair of microphones after propagation in the enclosure. Numerous studies have shown that beamformers using RTFs as steering vectors significantly outperform counterparts that account only for the direct path, which has led to a plethora of methods aimed at improving RTF estimation accuracy. In this talk, we focus on a beamformer that optimizes the Minimum Variance Distortionless Response (MVDR) criterion. Since RTF estimation degrades in noisy, highly reverberant environments, we propose leveraging prior knowledge of the acoustic enclosure to infer a low-dimensional manifold of plausible RTFs. Specifically, we harness a Graph Convolutional Network (GCN) to infer the acoustic manifold, thereby making RTF identification more robust. The model is trained and tested using real acoustic responses from the MIRaGe database recorded at Bar-Ilan University. This database contains multichannel room impulse responses measured from a high-resolution cube-shaped grid to multiple microphone arrays. This high-resolution measurement facilitates inference of the RTF manifold within a defined Region of Interest (ROI). The inferred RTFs are then employed as steering vectors of the MVDR beamformer. Experiments demonstrate improved RTF estimates and, consequently, better beamformer performance, leading to enhanced sound quality and improved speech intelligibility under challenging acoustic conditions. Project page, including an audio demonstration and a link to the code: https://peerrtf.github.io/
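
As a rough illustration of two quantities named in the abstract, the sketch below forms an RTF vector from ATFs at a single frequency bin and uses it as the steering vector of a textbook MVDR beamformer, w = Φ⁻¹h / (hᴴΦ⁻¹h). It does not implement the GCN-based manifold inference presented in the talk. The ATF values, the noise covariance matrix, and all function names are hypothetical and chosen only for the example. It is written in plain Python with no dependencies.

```python
def rtf_from_atf(atf, ref=0):
    """RTF vector: each microphone's ATF divided by the reference microphone's ATF."""
    return [a / atf[ref] for a in atf]


def solve(A, b):
    """Solve A x = b for a small complex system (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights: w = Phi^{-1} h / (h^H Phi^{-1} h)."""
    phi_inv_h = solve(noise_cov, steering)
    denom = sum(h.conjugate() * x for h, x in zip(steering, phi_inv_h))
    return [x / denom for x in phi_inv_h]


def apply_beamformer(weights, mic_signals):
    """Beamformer output y = w^H z for one time-frequency bin."""
    return sum(w.conjugate() * z for w, z in zip(weights, mic_signals))


# Hypothetical values for a three-microphone array at one frequency bin.
atf = [0.8 + 0.2j, 0.5 - 0.4j, -0.3 + 0.6j]
noise_cov = [  # Hermitian, positive definite (assumed known here)
    [1.0 + 0j, 0.2 + 0.1j, 0.0 + 0j],
    [0.2 - 0.1j, 1.5 + 0j, 0.3 - 0.2j],
    [0.0 + 0j, 0.3 + 0.2j, 2.0 + 0j],
]

rtf = rtf_from_atf(atf)            # steering vector, reference microphone 0
w = mvdr_weights(noise_cov, rtf)

# Distortionless constraint: the response toward the steering vector is 1.
response = apply_beamformer(w, rtf)

# A noise-free source s observed through the ATFs is recovered as it
# appears at the reference microphone, i.e. atf[0] * s.
s = 2.0 - 1.0j
y = apply_beamformer(w, [a * s for a in atf])
```

With an RTF steering vector, the distortionless constraint preserves the source as received at the reference microphone, reverberation included, rather than the dry source signal. In practice the RTF must be estimated from noisy recordings, which is the step the talk aims to make more robust.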

Bio

Sharon Gannot is a Full Professor and Vice Dean in the Faculty of Engineering at Bar-Ilan University, where he heads the Data Science Program. He received the B.Sc. (summa cum laude) from the Technion and the M.Sc. (cum laude) and Ph.D. from Tel-Aviv University, followed by a postdoctoral fellowship at KU Leuven. His research focuses on statistical signal processing and machine learning for speech and audio, and he has authored more than 350 peer-reviewed publications on these topics. Among his editorial roles, he is Editor-in-Chief of Speech Communication, serves on the Senior Editorial Board of IEEE Signal Processing Magazine, is an Associate Editor for the IEEE-SPS Education Center, and has served as Senior Area Chair for IEEE/ACM TASLP (2013–2017; 2020–2025). Among his leadership roles, he chaired the IEEE-SPS Audio and Acoustic Signal Processing Technical Committee (2017–2018) and leads the SPS Data Science Initiative (since 2022); he also served as General Co-Chair of IWAENC 2010, WASPAA 2013, and Interspeech 2024. His recognitions include 13 best-paper awards, BIU teaching and research prizes, the 2018 Rector Innovation Award, the 2022 EURASIP Group Technical Achievement Award, and IEEE Fellow.

Center for Language and Speech Processing