Workshop 2006
Workshop 2006 Calendar Monday, November 23, 2009

Events for Thursday, July 27, 2006


Undergraduate Brown Bag Lunch
12:00 pm
Barton 320A

A meeting of all WS06 Undergraduates, WS and CLSP faculty from noon to 1 p.m.


Dealing with Noise: Spectral Estimation, Beamforming, and Video
1:30 pm
Shaffer 100

These lectures will organize classical and new ideas for acoustic noise amelioration. Specifically, the goal is to cover the noise amelioration methods currently being tested for HMM and DBN recognizers operating on the AVICAR corpus. Lectures will be limited to one hour each; slides will be posted on the web afterward, with citations.

Lecture 3: Video

Video of a talker's face is immune to acoustic noise, but suffers from different kinds of variability, including variability in lighting, variation in head pose, variation in talker facial features, and variability in the synchronization between audio and video manifestations of the same phoneme.

In order to perform automatic video speechreading, the system must first detect relevant objects in the image: the face, possibly the lips, and possibly the nose, teeth, tongue, chin, and eyes. Video speechreading features include the relative positions of the detected objects (the "geometry" features) and small DCT-transformed pixel patches in the vicinity of each object (the "pixel-based" features).

Geometry features are immune to lighting variability if the objects are correctly detected, and can be corrected for head-pose variability in obvious ways. Pixel-based features can be easily corrected for lighting variability, and I will argue that patch scaling is a useful way to correct for head-pose variability. Variation in talker facial features can be compensated using methods similar to MLLR, though in a much higher dimension.

Finally, variability in synchronization between the audio and video manifestations of a phone can be compensated using a multi-stream hidden Markov model. Work performed at WS00 established the viseme/phoneme two-stream model as a standard. Work performed this summer at WS06 will seek to establish a new standard in which each articulator is a stream, and in which the audio and video features depend on the positions of the articulators.
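The "pixel-based" features described above can be illustrated with a minimal sketch. This is not the lecture's actual pipeline: the function names (dct_matrix, patch_dct_features), the 8x8 patch size, the choice to keep a 4x4 block of low-frequency coefficients, and the mean-and-contrast normalization used as a lighting correction are all assumptions made for illustration. The object detector that would supply the patch center is not shown.

```python
import numpy as np


def dct_matrix(n):
    """Build the orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)
    basis[1:, :] *= np.sqrt(2.0 / n)
    return basis


def patch_dct_features(image, center, size=8, keep=4):
    """Return low-frequency 2-D DCT coefficients of a square patch.

    `center` is the (row, col) of a detected object, e.g. a lip corner.
    All parameter names and defaults here are hypothetical.
    """
    row, col = center
    top, left = row - size // 2, col - size // 2
    if top < 0 or left < 0:
        raise ValueError("patch extends beyond the image")
    patch = image[top:top + size, left:left + size].astype(float)
    if patch.shape != (size, size):
        raise ValueError("patch extends beyond the image")

    # Assumed lighting correction: remove the mean brightness and
    # normalize the contrast of the patch before transforming it.
    patch = patch - patch.mean()
    std = patch.std()
    if std > 0:
        patch = patch / std

    d = dct_matrix(size)
    coeffs = d @ patch @ d.T  # separable 2-D DCT
    return coeffs[:keep, :keep].ravel()
```

With this normalization, a patch and a brightened, higher-contrast copy of the same patch yield the same feature vector, which is one simple sense in which pixel-based features can be corrected for lighting. Correcting for head pose by patch scaling would require resampling the patch to a fixed size before the transform, which this sketch omits.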


