Uncertainty Handling for Environment-Robust Speech Recognition
Abstract
In today's world, where mobile computing is more prevalent than at any time in history, automatic speech recognition (ASR) in environments with non-stationary noise remains a very challenging problem. The ubiquity of speech applications for hand-held devices, best exemplified by the recent success of the personal assistant Siri on the iPhone 4S, requires ASR systems to deal with a wide variety of acoustic environments. Furthermore, short interaction times leave ASR systems very little information with which to adapt.
Outline
Uncertainty Propagation/Decoding General Overview [R. F. Astudillo, 15min]
Log-Feature and Model Domain Approaches to Uncertainty Handling in ASR [L. Deng, 1h]
Linear-STFT Domain Approaches to Uncertainty Handling in ASR [R. F. Astudillo, 1h]
Learning from Noisy Data [E. Vincent, 30min]
Wrap-up and perspectives [E. Vincent, 15min]
Short Biographies
Ramon Fernandez Astudillo
Spoken Language Laboratory, INESC-ID-Lisboa, Lisboa, Portugal
Website: https://www.l2f.inesc-id.pt/~ramon
Ramon F. Astudillo obtained the industrial engineering degree, with a specialization in electronics and automatic regulation, at the Escuela Politecnica Superior de Ingenieria de Gijón (Spain) in 2005, completing the last two years of this degree with an Erasmus scholarship at the Technische Universität Berlin. In 2006 he worked as an intern at Peiker Acustic, researching model-based speech enhancement. In the same year he was awarded a La Caixa and German Academic Exchange Service (DAAD) scholarship for research towards the Ph.D. degree. He obtained the degree with distinction from the Technische Universität Berlin in 2010 in the fields of speech processing and robust automatic speech recognition. Dr. Astudillo is currently a postdoctoral researcher at INESC-ID in Lisbon, working on both robust speech recognition and robust natural language processing for speech applications in a Bayesian setting. He is also an ISCA member and a reviewer for IEEE-TASLP/SPL as well as CSL.
Emmanuel Vincent is a Research Scientist with the French National Institute for Research in Computer Science and Control (INRIA, Rennes, France). He holds a PhD degree in signal processing from University Pierre et Marie Curie (Paris, France) and worked as a Research Assistant with the Centre for Digital Music at Queen Mary, University of London (London, U.K.) from 2004 to 2006. His research focuses on probabilistic machine learning for speech and audio signal processing, with application to real-world audio source localization and separation, noise-robust speech recognition and music information retrieval. He has authored more than 90 papers in these fields and currently serves as an Associate Editor for IEEE T-ASL and as a Guest Editor for the special issue of CSL on speech separation and recognition in multisource environments. He is also the Founding Chair of the annual Signal Separation Evaluation Campaign (SiSEC) and an organizer of the PASCAL 'CHiME' Speech Separation and Recognition Challenge. His achievements were recently honored with the 2012 SPIE ICA Unsupervised Learning Pioneer Award.

Li Deng joined the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada, in 1989 as an Assistant Professor, where he became a Full Professor in 1996. In 1999, he joined Microsoft Research, Redmond, WA, where he is currently a Principal Researcher. Since 2000, he has also been an Affiliate Full Professor in the Department of Electrical Engineering, University of Washington, Seattle. Prior to Microsoft Research, he also worked or taught at the Massachusetts Institute of Technology (Cambridge, MA), ATR Interpreting Telecommunications Research Laboratories (Kyoto, Japan), Hong Kong University of Science and Technology, and Nortel (Canada). In the general areas of audio/speech/language processing, neural information processing, digital communication, and machine learning, he has published over 300 refereed papers and 3 books. He has been granted over 60 patents. He is a Fellow of the IEEE, the Acoustical Society of America and ISCA, and is an ISCA Distinguished Lecturer. He has received awards and honors bestowed by the IEEE, ISCA, ASA, Microsoft, and other organizations. He served on the Board of Governors of the IEEE Signal Processing Society (2008-2010). He served as Editor-in-Chief of the IEEE Signal Processing Magazine (2009-2011), which, according to the Thomson Reuters Journal Citation Reports released in June 2010 and 2011, ranked first in both years among all IEEE publications and all publications within the Electrical and Electronics Engineering category worldwide in terms of impact factor. He currently serves as Editor-in-Chief of IEEE Trans. Audio, Speech, and Language Processing.