Computer-Assisted Language Learning (CALL) Systems
Overview
Computer-assisted language learning (CALL) provides an effective learning environment in which students can practice interactively using multimedia content, either under the supervision of teachers or at their own pace in self-learning. Advances in speech and language technologies have opened new perspectives for CALL systems, such as automatic pronunciation assessment and simulated conversational-style lessons. CALL is also regarded as one of the new and promising applications of speech analysis, recognition and synthesis. CALL covers a variety of aspects, including segmental, prosodic and lexical features. Modeling non-native speech so that utterances are correctly segmented and recognized while the errors they contain are detected poses a number of challenges in speech processing. Assessing the intelligibility of non-native speech or the proficiency of non-native speakers is also an important issue. In this tutorial, we will give an overview of these issues and current solutions. The tutorial is mainly targeted at speech researchers and engineers interested in CALL, but also at those engaged in language teaching or learning technology.
First, we review speech recognition technologies for pronunciation learning, specifically pronunciation evaluation and error detection. Statistical approaches to these problems are formulated, and then acoustic and pronunciation modeling of non-native speech is described. Unlike conventional non-native speech recognition, CALL requires error detection capability, so an effective error prediction scheme is vitally important. Next, we address modeling and evaluation of prosodic features such as duration, stress and tones, and then the use of speech synthesis technologies, including re-synthesis and morphing.
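To make the idea of a statistical formulation of error detection concrete, the following is a minimal sketch, not the presenters' method. It assumes that a recognizer has already produced, for one speech segment, an acoustic log-likelihood for the intended phone and for each competing phone, and that all phones have equal priors. The function names, the threshold and the numbers in the usage note are hypothetical.

```python
import math


def log_posterior_score(log_likelihoods, target, n_frames):
    """Duration-normalized log posterior of the target phone.

    log_likelihoods: dict mapping phone label -> acoustic log-likelihood
                     of the segment under that phone's model (assumed input).
    target:          the phone the learner was supposed to produce.
    n_frames:        number of frames in the segment, used for normalization.
    Equal phone priors are assumed, so the posterior reduces to a
    likelihood ratio against the sum over all candidate phones.
    """
    # log-sum-exp over all candidates, shifted by the max for numerical stability
    m = max(log_likelihoods.values())
    log_total = m + math.log(sum(math.exp(v - m) for v in log_likelihoods.values()))
    return (log_likelihoods[target] - log_total) / n_frames


def detect_error(log_likelihoods, target, n_frames, threshold=-0.5):
    """Flag a pronunciation error when the score falls below a threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    on annotated non-native speech.
    """
    return log_posterior_score(log_likelihoods, target, n_frames) < threshold
```

For example, with hypothetical log-likelihoods `{"r": -120.0, "l": -100.0}` over 20 frames, the segment is flagged as an error when the intended phone is `"r"` and accepted when it is `"l"`, because the competing model explains the audio far better in the first case.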
After reviewing the basic component technologies, we introduce a number of practical CALL systems that have been developed as commercial products or deployed in classrooms, including those at our universities. The majority of them focus on learning English as a second language (ESL), but some deal with other languages such as Japanese and Chinese. We also review databases of non-native speech, which are necessary for developing CALL systems.
Outline
1. Introduction and Overview (Kawahara)
   Review the history and categories of CALL systems.
2. Segmental aspect and speech recognition technology (Kawahara)
   2.1. Speech analysis for CALL
   2.2. Segmentation of non-native speech
   2.3. Error detection of non-native speech
   2.4. Scoring of non-native speech
   2.5. Acoustic model for non-native speech
   2.6. Pronunciation model for non-native speech
   2.7. Discriminative modeling
3. Prosodic aspect (Minematsu)
   3.1. Prosodic deviations found in non-native pronunciation
   3.2. Duration modeling & evaluation
   3.3. Stress and tone modeling & evaluation
   3.4. Intonation modeling & evaluation
4. Speech synthesis technology for CALL (Minematsu)
   4.1. Text-to-speech for CALL
   4.2. Re-synthesis for CALL
   4.3. Morphing for CALL
5. Practical CALL systems (Kawahara)
   Review major CALL systems that have been developed and deployed for learning English and other languages.
6. Database for CALL (Minematsu)
   Review major databases of non-native speech, which are critical resources in developing CALL systems.
Short Biographies
Tatsuya Kawahara is a professor in the Academic Center for Computing and Media Studies and an affiliated professor in the School of Informatics, Kyoto University. He has also been an invited researcher at ATR and NICT. He was a visiting researcher at Bell Laboratories from 1995 to 1996. He has published more than 200 technical papers on speech recognition, spoken language processing, and spoken dialog systems. He has been managing several speech-related projects, including the free speech recognition engine Julius (http://julius.sourceforge.jp/) and the automatic transcription system for the Japanese Parliament (Diet). From 2003 to 2006, he was a member of the IEEE SPS Speech Technical Committee. Since 2011, he has been a secretary of the IEEE SPS Japan Chapter. He was a general chair of the IEEE Automatic Speech Recognition & Understanding workshop (ASRU 2007). He has also served as a tutorial chair of INTERSPEECH 2010 and a local arrangement chair of ICASSP 2012. He is an editorial board member of the Elsevier Journal of Computer Speech and Language, ACM Transactions on Speech and Language Processing, and APSIPA Transactions on Signal and Information Processing. He is a senior member of IEEE.
E-mail: kawahara@i.kyoto-u.ac.jp
Webpage: http://www.ar.media.kyoto-u.ac.jp/members/kawahara/
Nobuaki Minematsu is an associate professor in the Graduate School of Information Science and Technology, the University of Tokyo. He was a visiting researcher at the Royal Institute of Technology, Sweden (KTH) from 2002 to 2003. He has a wide interest in speech communication, ranging from science to engineering. He has published more than 200 scientific and technical papers, including conference papers, on speech analysis, speech perception, speech recognition, speech synthesis, language learning systems, and related topics. He was a member of the organizing committees of Speech Prosody 2004, L2WS 2010, and INTERSPEECH 2010. Since 2006, he has been a member of SLaTE (ISCA SIG on Speech and Language Technology in Education). Since 2011, he has been a treasurer of the IEEE SPS Japan Chapter. He has also been serving as an editorial board member of the Acoustical Society of Japan, the Institute of Electronics, Information and Communication Engineers, and the Information Processing Society of Japan.
E-mail: mine@gavo.t.u-tokyo.ac.jp
Webpage: http://www.gavo.t.u-tokyo.ac.jp/~mine/