Important Dates

  • April 1, 2012
    Full Paper Submission Deadline
  • June 8, 2012
    Notification of Paper Acceptance
  • June 16, 2012
    Grant Application Deadline
  • June 22, 2012
    Camera-ready Paper Due
  • June 30, 2012
    Early Registration Deadline
    Deadline for Presenters to Register
  • August 8, 2012
    Hotel and Standard Registration Deadline

Computer-Assisted Language Learning (CALL) Systems

 

Overview

Computer-assisted language learning (CALL) provides an effective learning environment in which students can practice interactively using multimedia content, either under the supervision of teachers or at their own pace in self-study. Advances in speech and language technologies have opened new perspectives for CALL systems, such as automatic pronunciation assessment and simulated conversational-style lessons. CALL is also regarded as one of the new and promising applications of speech analysis, recognition and synthesis. CALL covers a variety of aspects, including segmental, prosodic and lexical features. Modeling non-native speech so that utterances are correctly segmented and recognized while the errors they contain are detected poses a number of challenges in speech processing. Assessing the intelligibility of non-native speech, or the proficiency of non-native speakers, is also an important issue. In this tutorial, we will give an overview of these issues and current solutions. The tutorial is aimed mainly at speech researchers and engineers interested in CALL, but also at those engaged in language teaching or learning technology.

 

First, we review speech recognition technologies for pronunciation learning, specifically pronunciation evaluation and error detection. Statistical approaches to these problems are formulated, and then acoustic and pronunciation modeling of non-native speech is described. Unlike conventional non-native speech recognition, CALL requires the capability to detect errors, so an effective error prediction scheme is vitally important. Next, we address prosodic modeling and evaluation, covering duration, stress and tones, and then the use of speech synthesis technologies, including re-synthesis and morphing.

 

After reviewing the basic component technologies, we introduce a number of practical CALL systems that have been developed as commercial products or deployed in classrooms, including those at our universities. The majority of them focus on learning English as a second language (ESL), but some deal with other languages such as Japanese and Chinese. We also review databases of non-native speech, which are necessary for developing CALL systems.

 

Outline

1.   Introduction and Overview (Kawahara)

     Review the history and categories of CALL systems.

2.   Segmental aspect and speech recognition technology (Kawahara)

2.1.            Speech analysis for CALL

2.2.            Segmentation of non-native speech

2.3.            Error detection of non-native speech

2.4.            Scoring of non-native speech

2.5.            Acoustic model for non-native speech

2.6.            Pronunciation model for non-native speech

2.7.            Discriminative modeling

3.   Prosodic aspect (Minematsu)

3.1.            Prosodic deviations found in non-native pronunciation

3.2.            Duration modeling & evaluation

3.3.            Stress and tone modeling & evaluation

3.4.            Intonation modeling & evaluation

4.   Speech synthesis technology for CALL (Minematsu)

4.1.            Text-to-speech for CALL

4.2.            Re-synthesis for CALL

4.3.            Morphing for CALL

5.   Practical CALL systems (Kawahara)

     Review major CALL systems that have been developed and deployed for learning English and other languages.

6.   Database for CALL (Minematsu)

     Review major databases of non-native speech, which are critical resources in developing CALL systems.

 

Short Biographies

Tatsuya Kawahara is a professor in the Academic Center for Computing and Media Studies and an affiliated professor in the School of Informatics, Kyoto University.

He has also been an invited researcher at ATR and NICT. He was a visiting researcher at Bell Laboratories from 1995 to 1996. He has published more than 200 technical papers on speech recognition, spoken language processing, and spoken dialog systems. He has managed several speech-related projects, including the free speech recognition engine Julius (http://julius.sourceforge.jp/) and the automatic transcription system for the Japanese Parliament (Diet). From 2003 to 2006, he was a member of the IEEE SPS Speech Technical Committee. Since 2011, he has been a secretary of the IEEE SPS Japan Chapter. He was a general chair of the IEEE Automatic Speech Recognition & Understanding Workshop (ASRU 2007). He has also served as a tutorial chair of INTERSPEECH 2010 and a local arrangement chair of ICASSP 2012. He is an editorial board member of the Elsevier journal Computer Speech and Language, ACM Transactions on Speech and Language Processing, and APSIPA Transactions on Signal and Information Processing. He is a senior member of IEEE.

E-mail: kawahara@i.kyoto-u.ac.jp

Webpage: http://www.ar.media.kyoto-u.ac.jp/members/kawahara/

 

Nobuaki Minematsu is an associate professor in the Graduate School of Information Science and Technology, the University of Tokyo. He was a visiting researcher at the Royal Institute of Technology, Sweden (KTH) from 2002 to 2003. He has wide-ranging interests in speech communication, from science to engineering. He has published more than 200 scientific and technical papers, including conference papers, on speech analysis, speech perception, speech recognition, speech synthesis, language learning systems, and related topics. He was a member of the organizing committees of Speech Prosody 2004, L2WS 2010, and INTERSPEECH 2010. Since 2006, he has been a member of SLaTE (ISCA SIG on Speech and Language Technology in Education). Since 2011, he has been a treasurer of the IEEE SPS Japan Chapter. He has also been serving as an editorial board member for the Acoustical Society of Japan, the Institute of Electronics, Information and Communication Engineers, and the Information Processing Society of Japan.

E-mail: mine@gavo.t.u-tokyo.ac.jp

Webpage: http://www.gavo.t.u-tokyo.ac.jp/~mine/

 
