In addition to Mandarin (Northern China), there are eight major dialectal
regions in China: Wu (Southern Jiangsu, Zhejiang, and Shanghai), Yue
(Guangdong, Hong Kong, and Nanning in Guangxi), Min (Fujian, Shantou in
Guangdong, Haikou in Hainan, and Taipei in Taiwan), Hakka (Meixian in
Guangdong and Hsin-chu in Taiwan), Xiang (Hunan), Gan (Jiangxi), Hui (Anhui), and Jin
(Shanxi). These dialects can be further divided into more than 40
sub-categories.
Although the Chinese dialects share a written language and "standard"
Chinese (i.e. Mandarin) is widely spoken in most regions, speech is still
strongly influenced by the native dialects. This great linguistic
diversity poses severe problems for automatic speech and language
technology. Automatic speech recognition, for instance, relies to a great
extent on the consistent pronunciation and usage of words within a
language. In Chinese, however, word usage, pronunciation, and even syntax and
grammar vary with the speaker's dialect. As a result, speech
recognition systems constructed to process Mandarin perform poorly for the
great majority of the population. The goal of our summer project is to
develop a general framework to model phonetic, lexical, and pronunciation
variability in spoken Chinese. We will investigate several ways in which
a modest amount of transcribed speech in a particular dialect can be used
to adapt the acoustic and lexical models originally estimated on Mandarin
speech to improve recognition accuracy for that dialect.