Dialectal Chinese Speech Recognition


In addition to Mandarin (Northern China), there are eight major dialectal regions in China: Wu (Southern Jiangsu, Zhejiang, and Shanghai), Yue (Guangdong, Hong Kong, and Nanning in Guangxi), Min (Fujian, Shantou in Guangdong, Haikou in Hainan, and Taipei in Taiwan), Hakka (Meixian in Guangdong and Hsin-chu in Taiwan), Xiang (Hunan), Gan (Jiangxi), Hui (Anhui), and Jin (Shanxi). These dialects can be further divided into more than 40 sub-categories.

Although the Chinese dialects share a written language and "standard" Chinese (i.e., Mandarin) is widely spoken in most regions, speech is still strongly influenced by the speaker's native dialect. This linguistic diversity poses severe problems for automatic speech and language technology. Automatic speech recognition, for instance, relies to a great extent on consistent pronunciation and usage of words within a language. In Chinese, however, word usage, pronunciation, and even syntax and grammar vary with the speaker's dialect. As a result, speech recognition systems built to process Mandarin perform poorly for the great majority of the population. The goal of our summer project is to develop a general framework for modeling phonetic, lexical, and pronunciation variability in spoken Chinese. We will investigate several ways in which a modest amount of transcribed speech in a particular dialect can be used to adapt acoustic and lexical models originally estimated on Mandarin speech, so as to improve recognition accuracy for that dialect.


The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
Telephone: (410) 516-4237
Fax: (410) 516-5050
E-mail: clsp@clsp.jhu.edu