Accepted Papers


 

SESSION Mon.K1

Monday

11:00 - 12:00  Grand Ballroom

 

An Information-Extraction Approach to Speech Analysis and Processing

 

 

An Information-Extraction Approach to Speech Analysis and Processing
Chin-Hui Lee
Georgia Institute of Technology, USA


 

Paper Identifier: Mon.O1a.01

Monday

13:30 - 13:50  Grand Ballroom I

 

ASR: Deep Neural Networks I

 

 

Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks
Dong Yu1,  Li Deng1,  Frank Seide2
1Microsoft Research, 2Microsoft Research Asia


 

Paper Identifier: Mon.O1a.02

Monday

13:50 - 14:10  Grand Ballroom I

 

ASR: Deep Neural Networks I

 

 

Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization
Brian Kingsbury,  Tara N. Sainath,  Hagen Soltau
IBM T. J. Watson Research Center, USA


 

Paper Identifier: Mon.O1a.03

Monday

14:10 - 14:30  Grand Ballroom I

 

ASR: Deep Neural Networks I

 

 

Discriminative feature-space transforms using deep neural networks
George Saon and Brian Kingsbury
IBM T.J. Watson Research Center, USA


 

Paper Identifier: Mon.O1a.04

Monday

14:30 - 14:50  Grand Ballroom I

 

ASR: Deep Neural Networks I

 

 

Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both?
Zoltán Tüske,  Ralf Schlüter,  Hermann Ney,  Martin Sundermeyer
RWTH Aachen University, Germany


 

Paper Identifier: Mon.O1a.05

Monday

14:50 - 15:10  Grand Ballroom I

 

ASR: Deep Neural Networks I

 

 

Recurrent Neural Networks for Noise Reduction in Robust ASR
Andrew L. Maas1,  Quoc V. Le1,  Tyler M. O'Neil1,  Oriol Vinyals2,  Patrick Nguyen3,  Andrew Y. Ng1
1Stanford University, USA, 2UC Berkeley, USA, 3Google Inc., USA


 

Paper Identifier: Mon.O1a.06

Monday

15:10 - 15:30  Grand Ballroom I

 

ASR: Deep Neural Networks I

 

 

Pipelined Back-Propagation for Context-Dependent Deep Neural Networks
Xie Chen1,  Adam Eversole2,  Gang Li3,  Dong Yu2,  Frank Seide3
1Microsoft Research Asia and Tsinghua University, China, 2Microsoft Research, USA, 3Microsoft Research Asia, China


 

Paper Identifier: Mon.O1b.01

Monday

13:30 - 13:50  Grand Ballroom II

 

Language Recognition

 

 

Arabic Dialect Identification – ‘Is the Secret in the Silence?’ and Other Observations
Hynek Boril,  Abhijeet Sangwan,  John H.L. Hansen
University of Texas at Dallas, USA


 

Paper Identifier: Mon.O1b.02

Monday

13:50 - 14:10  Grand Ballroom II

 

Language Recognition

 

 

The 2011 NIST Language Recognition Evaluation
Craig Greenberg,  Alvin Martin,  Mark Przybocki
NIST, USA


 

Paper Identifier: Mon.O1b.03

Monday

14:10 - 14:30  Grand Ballroom II

 

Language Recognition

 

 

The BLZ Submission to the NIST 2011 LRE: Data Collection, System Development and Performance
Luis J. Rodriguez-Fuentes1,  Mikel Penagarikano1,  Amparo Varona1,  Mireia Diez1,  German Bordel1,  Alberto Abad2,  David Martinez3,  Jesus Villalba3,  Alfonso Ortega3,  Eduardo Lleida3
1University of the Basque Country UPV/EHU, Spain, 2L2F, INESC-ID Lisboa, Portugal, 3University of Zaragoza, Spain


 

Paper Identifier: Mon.O1b.04

Monday

14:30 - 14:50  Grand Ballroom II

 

Language Recognition

 

 

Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts
Luis Fernando D'Haro1,  Ondřej Glembek2,  Oldřich Plchot2,  Pavel Matejka2,  Mehdi Soufifar2,  Ricardo Cordoba1,  Jan Černocký2
1Speech Technology Group, Dept. of Electronic Engineering. E.T.S.I. Telecomunicación. Universidad Politécnica de Madrid, Spain, 2Speech@FIT group, Brno University of Technology, Czech Republic


 

Paper Identifier: Mon.O1b.05

Monday

14:50 - 15:10  Grand Ballroom II

 

Language Recognition

 

 

Supervector LDA: A New Approach to Reduced-Complexity I-vector Language Recognition
Alan McCree and Bengt Borgstrom
MIT Lincoln Laboratory, USA


 

Paper Identifier: Mon.O1b.06

Monday

15:10 - 15:30  Grand Ballroom II

 

Language Recognition

 

 

Patrol Team Language Identification System for DARPA RATS P1 Evaluation
Pavel Matejka1,  Oldrich Plchot1,  Mehdi Soufifar1,  Ondrej Glembek1,  Luis Fernando D’Haro1,  Karel Vesely1,  Frantisek Grezl1,  Jeff Ma2,  Spyros Matsoukas2,  Najim Dehak3
1Brno University of Technology, 2Raytheon BBN Technologies, 3MIT Computer Science and Artificial Intelligence Laboratory


 

Paper Identifier: Mon.O1c.01

Monday

13:30 - 13:50  Pavilion East

 

Communication Disorders and Assistive Technologies

 

 

Articulatory Strategies in Obstruent Production in Mandarin Esophageal Speech
Fang Hu1,  Yungang Wu2,  Wen Xu2,  Demin Han2
1Institute of Linguistics, Chinese Academy of Social Sciences, China, 2Beijing Tongren Hospital, China


 

Paper Identifier: Mon.O1c.02

Monday

13:50 - 14:10  Pavilion East

 

Communication Disorders and Assistive Technologies

 

 

Consonantal space area in Children with a Cleft Palate: An acoustic Study
Marion Bechet,  Fabrice Hirsch,  Camille Fauth,  Rudolph Sock
France


 

Paper Identifier: Mon.O1c.03

Monday

14:10 - 14:30  Pavilion East

 

Communication Disorders and Assistive Technologies

 

 

Automated Dysarthria Severity Classification for Improved Objective Intelligibility Assessment of Spastic Dysarthric Speech
Milton Sarria Paja and Tiago H. Falk
Institut National de la Recherche Scientifique (INRS-EMT), Montreal, Canada


 

Paper Identifier: Mon.O1c.04

Monday

14:30 - 14:50  Pavilion East

 

Communication Disorders and Assistive Technologies

 

 

Assessment of Disordered Voices Using Empirical Mode Decomposition in the Log-Spectral Domain
Abdellah Kacha1,  Francis Grenez2,  Jean Schoentgen2
1University of Jijel, Algeria, 2Université Libre de Bruxelles


 

Paper Identifier: Mon.O1c.05

Monday

14:50 - 15:10  Pavilion East

 

Communication Disorders and Assistive Technologies

 

 

Learning an Artificial F0-Contour for ALT Speech
Anna Katharina Fuchs and Martin Hagmüller
Signal Processing and Speech Communication Laboratory


 

Paper Identifier: Mon.O1c.06

Monday

15:10 - 15:30  Pavilion East

 

Communication Disorders and Assistive Technologies

 

 

Ultrax: An Animated Midsagittal Vocal Tract Display for Speech Therapy
Korin Richmond and Steve Renals
CSTR, University of Edinburgh, UK


 

Paper Identifier: Mon.O1d.01

Monday

13:30 - 13:50  Pavilion West

 

Voice Conversion

 

 

A Study of Mutual Information for GMM-Based Spectral Conversion
Hsin-Te Hwang1,  Yu Tsao2,  Hsin-Min Wang3,  Yih-Ru Wang1,  Sin-Horng Chen1
1Dept. of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan, 2Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, 3Institute of Information Science, Academia Sinica, Taipei, Taiwan


 

Paper Identifier: Mon.O1d.02

Monday

13:50 - 14:10  Pavilion West

 

Voice Conversion

 

 

Bayesian Mixture of Probabilistic Linear Regressions for Voice Conversion
Na Li and Yu Qiao
China


 

Paper Identifier: Mon.O1d.03

Monday

14:10 - 14:30  Pavilion West

 

Voice Conversion

 

 

Iterative MMSE Estimation of Vocal Tract Length Normalization Factors for Voice Transformation
Daniel Erro,  Eva Navas,  Inma Hernaez
UPV/EHU, Spain


 

Paper Identifier: Mon.O1d.04

Monday

14:30 - 14:50  Pavilion West

 

Voice Conversion

 

 

A HMM approach to residual estimation for high resolution voice conversion
Winston Percybrooks and Elliot Moore
Georgia Institute of Technology, USA


 

Paper Identifier: Mon.O1d.05

Monday

14:50 - 15:10  Pavilion West

 

Voice Conversion

 

 

Implementation of Computationally Efficient Real-Time Voice Conversion
Tomoki Toda1,  Takashi Muramatsu1,  Hideki Banno2
1Nara Institute of Science and Technology, 2Meijo University


 

Paper Identifier: Mon.O1d.06

Monday

15:10 - 15:30  Pavilion West

 

Voice Conversion

 

 

Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion
Daisuke Saito,  Nobuaki Minematsu,  Keikichi Hirose
The University of Tokyo, Japan


 

Paper Identifier: Mon.SS1.01

Monday

13:30 - 13:50  Galleria

 

Speaker Trait Challenge - Part 1

 

 

The INTERSPEECH 2012 Speaker Trait Challenge
Björn Schuller1,  Stefan Steidl2,  Anton Batliner2,  Elmar Nöth2,  Alessandro Vinciarelli3,  Felix Burkhardt4,  Rob van Son5,  Felix Weninger1,  Florian Eyben1,  Tobias Bocklet2,  Gelareh Mohammadi6,  Benjamin Weiss4
1Technische Universität München, Institute for Human-Machine Communication, Germany, 2FAU Erlangen-Nuremberg, Pattern Recognition Lab, Germany, 3University of Glasgow, School of Computing Science, Scotland, 4Deutsche Telekom AG Laboratories, Berlin, Germany, 5Netherlands Cancer Institute NKI-AVL, Amsterdam, The Netherlands, 6IDIAP Research Institute, Martigny, Switzerland


 

Paper Identifier: Mon.SS1.02

Monday

13:50 - 14:00  Galleria

 

Speaker Trait Challenge - Part 1

 

 

On Speaker-Independent Personality Perception and Prediction from Speech
Tim Polzehl1,  Katrin Schoenenberg1,  Sebastian Möller1,  Florian Metze2,  Gelareh Mohammadi3,  Alessandro Vinciarelli4
1Technische Universität Berlin / Telekom Innovation Laboratories, 2Carnegie Mellon University, 3Idiap Research Institute, 4University of Glasgow


 

Paper Identifier: Mon.SS1.03

Monday

14:00 - 14:10  Galleria

 

Speaker Trait Challenge - Part 1

 

 

Speaker Personality Classification Using Systems Based on Acoustic-Lexical Cues and an Optimal Tree-Structured Bayesian Network
Kartik Audhkhasi,  Angeliki Metallinou,  Ming Li,  Shrikanth Narayanan
University of Southern California, USA


 

Paper Identifier: Mon.SS1.04

Monday

14:10 - 14:20  Galleria

 

Speaker Trait Challenge - Part 1

 

 

Personality traits detection using a parallelized modified SFFS algorithm
Clément Chastagnol and Laurence Devillers
LIMSI-CNRS, France


 

Paper Identifier: Mon.SS1.05

Monday

14:20 - 14:30  Galleria

 

Speaker Trait Challenge - Part 1

 

 

Feature Selection for Speaker Traits
Jouni Pohjalainen1,  Serdar Kadioglu2,  Okko Räsänen1
1Aalto University, Finland, 2Brown University, USA


 

Paper Identifier: Mon.SS1.06

Monday

14:30 - 14:40  Galleria

 

Speaker Trait Challenge - Part 1

 

 

A Frame Pruning Approach for Paralinguistic Recognition Tasks
Johannes Wagner,  Florian Lingenfelser,  Elisabeth André
Augsburg University, Germany


 

Paper Identifier: Mon.SS1.07

Monday

14:40 - 14:50  Galleria

 

Speaker Trait Challenge - Part 1

 

 

Modulation Spectrum Analysis for Speaker Personality Trait Recognition
Alexei Ivanov and Xin Chen
Pearson KT, USA


 

Paper Identifier: Mon.SS1.08

Monday

14:50 - 15:00  Galleria

 

Speaker Trait Challenge - Part 1

 

 

A Comparison of Classification Paradigms for Speaker Likeability Determination
Nicholas Cummins,  Julien Epps,  Jia Min Karen Kua
University of New South Wales, Australia


 

Paper Identifier: Mon.SS1.09

Monday

15:00 - 15:10  Galleria

 

Speaker Trait Challenge - Part 1

 

 

Predicting Likability of Speakers with Gaussian Processes
Dingchao Lu and Fei Sha
USC, United States


 

Paper Identifier: Mon.SS1.10

Monday

15:10 - 15:20  Galleria

 

Speaker Trait Challenge - Part 1

 

 

Likability Classification – A Not so Deep Neural Network Approach
Raymond Brueckner and Björn Schuller
Institute for Human-Machine Communication, Technische Universität München, Germany


 

Paper Identifier: Mon.SS1.11

Monday

15:20 - 15:30  Galleria

 

Speaker Trait Challenge - Part 1

 

 

Genetic Algorithm Based Feature Selection for Speaker Trait Classification
Dongrui Wu
GE Global Research


 

Paper Identifier: Mon.P1a.01

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

Discrimination of Linguistic and Non-Linguistic Vocalizations in Spontaneous Speech: Intra- and Inter-Corpus Perspectives
Felix Weninger and Björn Schuller
Institute for Human-Machine Communication, Technische Universität München, Germany


 

Paper Identifier: Mon.P1a.02

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

Accentual Transfer from Swiss-German to French. A Study of “Français Fédéral”
Mathieu Avanzi1,  Pauline Dubosson1,  Sandra Schwab2,  Nicolas Obin3
1Neuchâtel University, Switzerland, 2University of Geneva, Switzerland, 3IRCAM-CNRS UMR 9912-STMS, Paris, France


 

Paper Identifier: Mon.P1a.03

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

Phonology & the Interpretation of Fine Phonetic Detail in Berlin German
Stefanie Jannedy1 and Melanie Weirich2
1Center for General Linguistics (ZAS), Berlin, Germany, 2Friedrich-Schiller-Universität Jena, Germany


 

Paper Identifier: Mon.P1a.04

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

Evaluation of a formant-based speech-driven lip motion generation
Carlos Ishi,  Chaoran Liu,  Hiroshi Ishiguro,  Norihiro Hagita
ATR, Japan


 

Paper Identifier: Mon.P1a.05

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

Using spectral measures to differentiate Mandarin and Korean sibilant fricatives
Jeffrey Kallay and Jeffrey Holliday
The Ohio State University, USA


 

Paper Identifier: Mon.P1a.06

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

EFL Conversational Triads: Foreigner-directed Speech and Hyperarticulation
Hua-Li Jian1 and Richard Konopka2
1Faculty of Technology, Art and Design, Oslo and Akershus University College of Applied Sciences, Norway, 2Department of Foreign Languages and Literature, National Cheng Kung University, Taiwan


 

Paper Identifier: Mon.P1a.07

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

Syllable perception depends on tone perception
Iris Chuoying Ouyang and Khalil Iskarous
University of Southern California, USA


 

Paper Identifier: Mon.P1a.08: Presentation moved to Wed.O6c.04

 

Paper Identifier: Mon.P1a.09

Monday

13:30 - 15:30  Exhibition Hall

 

Phonetics and Phonology

 

 

How consonants, dialect and speech rate affect vowel devoicing?
Masako Fujimoto1,  Seiya Funatsu2,  Ichiro Fujimoto3
1National Institute for Japanese Language and Linguistics, Japan, 2Prefectural University of Hiroshima, Japan, 3Takushoku Univ., Japan


 

Paper Identifier: Mon.P1b.01

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

Distance-Dependent Noise Reduction for Two-Channel Microphones
Thomas Fehér,  Dietmar Richter,  Oliver Jokisch,  Rüdiger Hoffmann
Dresden University of Technology, Germany


 

Paper Identifier: Mon.P1b.02

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

Direction of Arrival Estimation Based on Subband Weighting for Noisy Conditions
Wei Xue and Wenju Liu
Beijing 100190, China


 

Paper Identifier: Mon.P1b.03

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

Binaural Noise Reduction Using Frequency-Warped FIR Filters
Jorge Marin and David V. Anderson
Georgia Institute of Technology, USA


 

Paper Identifier: Mon.P1b.04

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

Exploring Off Time Nature for Speech Enhancement
Meng Yu and Jack Xin
University of California, Irvine, USA


 

Paper Identifier: Mon.P1b.05

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

Model-based Single-Channel Dereverberation in Noisy Acoustical Environments
Xulei Bao and Jie Zhu
Shanghai Jiao Tong University, China


 

Paper Identifier: Mon.P1b.06

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

An Auditory Inspired Multimodal Framework for Speech Enhancement
Majid Mirbagheri1,  Sahar Akram1,  Shihab Shamma2
1Institute for System Research, University of Maryland College Park, USA, 2Institute for System Research, Department of Electrical and Computer Engineering, University of Maryland College Park, USA


 

Paper Identifier: Mon.P1b.07

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

Binary Mask Estimation for Improved Speech Intelligibility in Reverberant Environments
Oldooz Hazrati,  Jaewook Lee,  Philipos Loizou
The University of Texas at Dallas, USA


 

Paper Identifier: Mon.P1b.08

Monday

13:30 - 15:30  Exhibition Hall

 

Enhancement

 

 

Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech
Petko N. Petkov1,  W. Bastiaan Kleijn2,  Gustav Eje Henter1
1KTH-Royal Institute of Technology, 2Victoria University of Wellington


 

Paper Identifier: Mon.P1c.01

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Morpheme Level Feature-based Language Models for German LVCSR
Amr El-Desoky Mousa,  M. Ali Basha Shaik,  Ralf Schlüter,  Hermann Ney
RWTH Aachen University, Germany


 

Paper Identifier: Mon.P1c.02

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Tied-State Mixture Language Model for WFST-based Speech Recognition
Hitoshi Yamamoto,  Paul R. Dixon,  Shigeki Matsuda,  Chiori Hori,  Hideki Kashioka
NICT, Japan


 

Paper Identifier: Mon.P1c.03

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Maximum Entropy Language Model Adaptation for Mobile Speech Input
Tanel Alumäe1 and Kaarel Kaljurand2
1Tallinn University of Technology, Estonia, 2University of Zurich, Switzerland


 

Paper Identifier: Mon.P1c.04

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Supervised and unsupervised Web-based language model domain adaptation
Gwénolé Lecorvé1,  John Dines2,  Thomas Hain3,  Petr Motlicek1
1Idiap Research Institute, Switzerland, 2Idiap Research Institute, Koemei, Switzerland, 3University of Sheffield, United Kingdom


 

Paper Identifier: Mon.P1c.05

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

A Hierarchical Bayesian Approach for Semi-supervised Discriminative Language Modeling
Yik-Cheung Tam and Paul Vozila
USA


 

Paper Identifier: Mon.P1c.06

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Leveraging Social Annotation for Topic Language Model Adaptation
Youzheng Wu,  Kazuhiko Abe,  Paul Dixon,  Chiori Hori,  Hideki Kashioka
NICT


 

Paper Identifier: Mon.P1c.07

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

LSTM Neural Networks for Language Modeling
Martin Sundermeyer,  Ralf Schlüter,  Hermann Ney
RWTH Aachen University, Germany


 

Paper Identifier: Mon.P1c.08

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Phrasal Cohort Based Unsupervised Discriminative Language Modeling
Puyang Xu1,  Brian Roark2,  Sanjeev Khudanpur3
1Johns Hopkins University, USA, 2Oregon Health and Sciences University, USA, 3Johns Hopkins University


 

Paper Identifier: Mon.P1c.09

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Deriving conversation-based features from unlabeled speech for discriminative language modeling
Damianos Karakos1,  Brian Roark2,  Izhak Shafran2,  Kenji Sagae3,  Maider Lehr2,  Emily Prud'hommeaux2,  Puyang Xu1,  Nathan Glenn4,  Sanjeev Khudanpur1,  Murat Saraclar5,  Dan Bikel6,  Mark Dredze1,  Chris Callison-Burch1,  Yuan Cao1,  Keith Hall6,  Eva Hasler7,  Philip Koehn7,  Adam Lopez1,  Matt Post1,  Darcey Riley8
1JHU, 2OHSU, 3USC, 4BYU, 5Bogazici, 6Google, 7U of Edinburgh, 8U of Rochester


 

Paper Identifier: Mon.P1c.10

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

Performance Comparison of Training Algorithms for Semi-Supervised Discriminative Language Modeling
Erinc Dikici,  Arda Celebi,  Murat Saraclar
Bogazici University, Turkey


 

Paper Identifier: Mon.P1c.11

Monday

13:30 - 15:30  Exhibition Hall

 

Language Modeling

 

 

On-the-fly Topic Adaptation for YouTube Video Transcription
Kapil Thadani1,  Fadi Biadsy2,  Dan Bikel2
1Columbia University, USA, 2Google, USA


 

Paper Identifier: Mon.P1d.01

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

Portability of Semantic Annotations for Fast Development of Dialogue Corpora
Bassam Jabaian1,  Fabrice Lefèvre2,  Laurent Besacier3
1LIA, University of Avignon, Avignon, France, 2LIA, University of Avignon, Avignon, France, 3LIG, University of Joseph Fourier, Grenoble, France


 

Paper Identifier: Mon.P1d.02

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

Optimization of Dialog Strategies using Automatic Dialog Simulation and Statistical Dialog Management Techniques
Zoraida Callejas and Ramon Lopez-Cozar
University of Granada, Spain


 

Paper Identifier: Mon.P1d.03

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

Preference-learning based Inverse Reinforcement Learning for Dialog Control
Hiroaki Sugiyama,  Toyomi Meguro,  Yasuhiro Minami
NTT Communication Science Laboratories


 

Paper Identifier: Mon.P1d.04

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

A Data-driven Approach to Understanding Spoken Route Directions in Human-Robot Dialogue
Raveesh Meena,  Gabriel Skantze,  Joakim Gustafson
KTH Speech, Music and Hearing, Sweden


 

Paper Identifier: Mon.P1d.05

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

Detecting System-directed Utterances using Dialogue-level Features
Kazunori Komatani1,  Akira Hirano1,  Mikio Nakano2
1Nagoya University, 2Honda Research Institute Japan, Co., Ltd.


 

Paper Identifier: Mon.P1d.06

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

An Online Generated Transducer to Increase Dialog Manager Coverage
Joaquin Planells,  Lluís-F. Hurtado,  Emilio Sanchis,  Encarna Segarra
Universitat Politècnica de València, Spain


 

Paper Identifier: Mon.P1d.07

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

A Sequential Bayesian Dialog Agent for Computational Ethnography
Abe Kazemzadeh,  James Gibson,  Juanchen Li,  Sungbok Lee,  Panayiotis Georgiou,  Shrikanth Narayanan
University of Southern California, USA


 

Paper Identifier: Mon.P1d.08

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

ClippyScript: A Programming Language for Multi-Domain Dialogue Systems
Frank Seide and Sean McDirmid
Microsoft Research Asia, China


 

Paper Identifier: Mon.P1d.09

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

Correlation Between Model-based Approximations of Grounding-related Cognition and User Judgments
Klaus-Peter Engelbrecht and Sebastian Möller
QU-Lab, Telekom Innovation Laboratories, TU Berlin, Germany


 

Paper Identifier: Mon.P1d.10 (Originally Thu.P9c.01)

Monday

13:30 - 15:30  Exhibition Hall

 

Spoken Language Understanding and Dialog

 

 

Spelling as a Complementary Strategy for Speech Recognition
Keith Vertanen1 and Per Ola Kristensson2
1Montana Tech of the University of Montana, USA, 2St Andrews, UK


 

Paper Identifier: Mon.O2a.01

Monday

16:00 - 16:20  Grand Ballroom I

 

ASR: Noise Robustness

 

 

Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition
Kenichi Kumatani1,  Bhiksha Raj2,  Rita Singh2,  John McDonough2
1Disney Research, Pittsburgh, 2Carnegie Mellon University


 

Paper Identifier: Mon.O2a.02

Monday

16:20 - 16:40  Grand Ballroom I

 

ASR: Noise Robustness

 

 

Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise
Felix Weninger,  Martin Wöllmer,  Björn Schuller
Institute for Human-Machine Communication, Technische Universität München, Germany


 

Paper Identifier: Mon.O2a.03

Monday

16:40 - 17:00  Grand Ballroom I

 

ASR: Noise Robustness

 

 

Noise Compensation for Subspace Gaussian Mixture Models
Liang Lu1,  KK Chin2,  Arnab Ghoshal1,  Steve Renals1
1University of Edinburgh, 2Toshiba Research Europe Ltd


 

Paper Identifier: Mon.O2a.04

Monday

17:00 - 17:20  Grand Ballroom I

 

ASR: Noise Robustness

 

 

Combination of Sparse Classification and Multilayer Perceptron for Noise-robust ASR
Yang Sun1,  Mathew M. Doss2,  Jort F. Gemmeke3,  Bert Cranen1,  Louis ten Bosch1,  Lou Boves1
1Centre for Language and Speech Technology, 2Idiap Research Institute, 3Department ESAT, Katholieke Universiteit


 

Paper Identifier: Mon.O2a.05

Monday

17:20 - 17:40  Grand Ballroom I

 

ASR: Noise Robustness

 

 

Sub-band based Log-energy and Its Dynamic Range Stretching for Robust In-car Speech Recognition
Weifeng Li1 and Herve Bourlard2
1Tsinghua University, China, 2Idiap Research Institute, Martigny, Switzerland


 

Paper Identifier: Mon.O2a.06

Monday

17:40 - 18:00  Grand Ballroom I

 

ASR: Noise Robustness

 

 

Subspace Gaussian Mixture Models Based on Noise Compensation for Speech Recognition
Driss Matrouf1,  Georges Linares1,  Mickael Rouvier2,  Mohamed Bouallegue1
1University of Avignon, LIA, 2University of Le Mans, LIUM, France


 

Paper Identifier: Mon.O2b.01

Monday

16:00 - 16:20  Grand Ballroom II

 

Spoken Language Understanding and Dialog II

 

 

"Help Me, I Need More User Tests!" User Simulations as Supportive Tool in the Development Process of Spoken Dialogue Systems
Florian Kretzschmar and Sebastian Möller
Quality and Usability Lab, Telekom Innovation Laboratories, TU Berlin, Germany


 

Paper Identifier: Mon.O2b.02

Monday

16:20 - 16:40  Grand Ballroom II

 

Spoken Language Understanding and Dialog II

 

 

Caller Response Timing Patterns in Spoken Dialog Systems
Silke Witt
Fluential, USA


 

Paper Identifier: Mon.O2b.03

Monday

16:40 - 17:00  Grand Ballroom II

 

Spoken Language Understanding and Dialog II

 

 

A Discriminative Classification-Based Approach to Information State Updates for a Multi-Domain Dialog System
Dilek Hakkani-Tur,  Gokhan Tur,  Larry Heck,  Ashley Fidler,  Asli Celikilmaz
Microsoft, USA


 

Paper Identifier: Mon.O2b.04

Monday

17:00 - 17:20  Grand Ballroom II

 

Spoken Language Understanding and Dialog II

 

 

Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog
Elizabeth Shriberg1,  Andreas Stolcke1,  Dilek Hakkani-Tür1,  Larry Heck2
1Microsoft, U.S.A., 2Microsoft, U.S.A.


 

Paper Identifier: Mon.O2b.05

Monday

17:20 - 17:40  Grand Ballroom II

 

Spoken Language Understanding and Dialog II

 

 

Exploiting the Semantic Web for Unsupervised Natural Language Semantic Parsing
Gokhan Tur,  Minwoo Jeong,  Ye-Yi Wang,  Dilek Hakkani-Tür,  Larry Heck
Microsoft, USA


 

Paper Identifier: Mon.O2b.06

Monday

17:40 - 18:00  Grand Ballroom II

 

Spoken Language Understanding and Dialog II

 

 

Prosodic Entrainment in an Information-Driven Dialog System
Andrew Fandrianto and Maxine Eskenazi
Carnegie Mellon University, USA


 

Paper Identifier: Mon.O2c.01

Monday

16:00 - 16:20  Pavilion East

 

Paralinguistics I

 

 

Novel Metrics of Speech Rhythm for the Assessment of Emotion
Fabien Ringeval1,  Mohamed Chetouani2,  Björn Schuller3
1University of Fribourg, Switzerland, 2Université Pierre et Marie Curie - Paris 6, France, 3Technische Universität München, Germany


 

Paper Identifier: Mon.O2c.02

Monday

16:20 - 16:40  Pavilion East

 

Paralinguistics I

 

 

Temporal and Situational Context Modeling for Improved Dominance Recognition in Meetings
Martin Woellmer,  Florian Eyben,  Bjoern Schuller,  Gerhard Rigoll
Technische Universitaet Muenchen, Germany


 

Paper Identifier: Mon.O2c.03

Monday

16:40 - 17:00  Pavilion East

 

Paralinguistics I

 

 

Audiovisual correlates of basic emotions in blind and sighted people
Marc Swerts,  Kitty Leuverink,  Madelène Munnik,  Vera Nijveld
Tilburg University, The Netherlands


 

Paper Identifier: Mon.O2c.04

Monday

17:00 - 17:20  Pavilion East

 

Paralinguistics I

 

 

Combining Ranking and Classification to Improve Emotion Recognition in Spontaneous Speech
Houwei Cao,  Ragini Verma,  Ani Nenkova
University of Pennsylvania, United States


 

Paper Identifier: Mon.O2c.05

Monday

17:20 - 17:40  Pavilion East

 

Paralinguistics I

 

 

Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition
Zixing Zhang and Björn Schuller
Institute for Human-Machine Communication, Technische Universität München, Germany


 

Paper Identifier: Mon.O2c.06

Monday

17:40 - 18:00  Pavilion East

 

Paralinguistics I

 

 

Emotion Recognition using Acoustic and Lexical Features
Viktor Rozgic1,  Sankaranarayanan Ananthakrishnan1,  Shirin Saleem1,  Rohit Kumar1,  Aravind Vembu2,  Rohit Prasad1
1Raytheon BBN Technologies, USA, 2University of Southern California, USA


 

Paper Identifier: Mon.O2d.01

Monday

16:00 - 16:20  Pavilion West

 

Pitch and Harmonic Analysis

 

 

Synthetic Speech Discrimination using Pitch Pattern Statistics Derived from Image Analysis
Phillip De Leon1,  Bryan Stewart1,  Junichi Yamagishi2
1New Mexico State University, USA, 2University of Edinburgh, UK


 

Paper Identifier: Mon.O2d.02

Monday

16:20 - 16:40  Pavilion West

 

Pitch and Harmonic Analysis

 

 

Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis
Wen Zhengqi1,  Hideki Kawahara2,  Tao Jianhua1
1Institute of Automation, Chinese Academy of Sciences, Beijing, China, 2Faculty of Systems Engineering, Wakayama University, Wakayama, Japan


 

Paper Identifier: Mon.O2d.03

Monday

16:40 - 17:00  Pavilion West

 

Pitch and Harmonic Analysis

 

 

Robust Pitch Estimation Using l1-regularized Maximum Likelihood Estimation
Feng Huang and Tan Lee
The Chinese University of Hong Kong


 

Paper Identifier: Mon.O2d.04

Monday

17:00 - 17:20  Pavilion West

 

Pitch and Harmonic Analysis

 

 

A full-band adaptive harmonic representation of speech
Gilles Degottex and Yannis Stylianou
UOC-CSD/FORTH-ICS


 

Paper Identifier: Mon.O2d.05

Monday

17:20 - 17:40  Pavilion West

 

Pitch and Harmonic Analysis

 

 

Deviation measure of waveform symmetry and its application to high-speed and temporally-fine F0 extraction for vocal sound texture manipulation
Hideki Kawahara1,  Masanori Morise2,  Ryuichi Nisimura1,  Toshio Irino1
1Wakayama University, Japan, 2Ritsumeikan University, Japan


 

Paper Identifier: Mon.O2d.06

Monday

17:40 - 18:00  Pavilion West

 

Pitch and Harmonic Analysis

 

 

Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech
Kota Yoshizato1,  Hirokazu Kameoka2,  Daisuke Saito1,  Shigeki Sagayama1
1Graduate School of Information Science and Technology, The University of Tokyo, Japan, 2Graduate School of Information Science and Technology, The University of Tokyo / NTT Communication Science Laboratories, NTT Corporation, Japan


 

Paper Identifier: Mon.SS2.01

Monday

16:00 - 16:10  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Is ‘not bad’ good enough? Aspects of unknown voices’ likability
Benjamin Weiss1 and Felix Burkhardt2
1Technische Universität Berlin, Germany, 2Telekom Innovation Laboratories, Germany


 

Paper Identifier: Mon.SS2.02

Monday

16:10 - 16:20  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Multi-System Fusion of Extended Context Prosodic and Cepstral Features for Paralinguistic Speaker Trait Classification
Michelle Hewlett Sanchez,  Aaron Lawson,  Dimitra Vergyri,  Harry Bratt
SRI International, United States


 

Paper Identifier: Mon.SS2.03

Monday

16:20 - 16:30  Galleria

 

Speaker Trait Challenge - Part 2

 

 

The log-Gabor method: speech classification using spectrogram image analysis
Harm Buisman and Eric Postma
Tilburg University, The Netherlands


 

Paper Identifier: Mon.SS2.04

Monday

16:30 - 16:40  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Anchor Models and WCCN Normalization For Speaker Trait Classification
Yazid Attabi and Pierre Dumouchel
École de technologie supérieure, Montréal, Canada


 

Paper Identifier: Mon.SS2.05

Monday

16:40 - 16:50  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Pitch and Intonation Contribution to Speakers’ Traits Classification
Claude Montacié1 and Marie-José Caraty2
1Paris Sorbonne university - STIH, France, 2Paris Descartes University - LIPADE, France


 

Paper Identifier: Mon.SS2.06

Monday

16:50 - 17:00  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Text-dependent pathological voice detection
Gopala Krishna Anumanchipalli1,  Hugo Meinedo2,  Miguel Bugalho2,  Isabel Trancoso3,  Luis Oliveira3,  Alan Black1
1LTI/CMU, USA, 2INESC-ID, Portugal, 3INESC-ID/IST, Portugal


 

Paper Identifier: Mon.SS2.07

Monday

17:00 - 17:10  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Intelligibility classification of pathological speech using fusion of multiple high level descriptors
Jangwon Kim,  Naveen Kumar,  Andreas Tsiartas,  Ming Li,  Shrikanth Narayanan
University of Southern California, U.S.A.


 

Paper Identifier: Mon.SS2.08

Monday

17:10 - 17:20  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Interspeech Pathology Challenge: Investigations into Speaker and Sentence Specific Effects
Anthony Stark,  Alireza Bayestehtashk,  Meysam Asgari,  Izhak Shafran
USA


 

Paper Identifier: Mon.SS2.09

Monday

17:20 - 17:30  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations
Xinhui Zhou1,  Daniel Garcia-Romero1,  Nima Mesgarani1,  Maureen Stone2,  Carol Espy-Wilson1,  Shihab Shamma1
1University of Maryland, College Park, USA, 2University of Maryland Dental School, USA


 

Paper Identifier: Mon.SS2.10

Monday

17:30 - 17:40  Galleria

 

Speaker Trait Challenge - Part 2

 

 

Detecting Intelligibility by Linear Dimensionality Reduction and Normalized Voice Quality Hierarchical Features
Dong-Yan Huang,  Yongwei Zhu,  Dajun Wu,  Rongshan Yu
Institute for Infocomm Research


 

Paper Identifier: Mon.P2a.01

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Extrinsic normalization for vocal tracts depends on the signal, not on attention
Matthias Sjerps1,  James McQueen2,  Holger Mitterer1
1Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands, 2Behavioural Science Institute and Donders Institute for Brain, Cognition and Behaviour, Centre for Cognition, Radboud University Nijmegen, Nijmegen, The Netherlands


 

Paper Identifier: Mon.P2a.03

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Correlation between vocal tract length, body height, formant frequencies, and pitch frequency for the five Japanese vowels uttered by fifteen male speakers
Hiroaki Hatano1,  Tatsuya Kitamura1,  Hironori Takemoto2,  Parham Mokhtari2,  Kiyoshi Honda3,  Shinobu Masaki4
1Faculty of Intelligence and Informatics, Konan University, Japan, 2Universal Communication Research Institute, NICT, Japan, 3University of Paris III, France, 4ATR/ATR-Promotions, Japan


 

Paper Identifier: Mon.P2a.06

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Contribution of Spectral Shapes to Tone Perception
Natthawut Kertkeidkachorn,  Surapol Vorapatratorn,  Sirinart Tangruamsub,  Proadpran Punyabukkana,  Atiwong Suchato
Thailand


 

Paper Identifier: Mon.P2a.08

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Pitch and phonological perception of tone in the Suruí language of Rondônia (Brazil): identification task of LHL and LHH tonal patterns
Julien Meyer
1) Área de Linguística, Museu Goeldi, Ministério de Ciência, Tecnologia e Inovação, Brasil; 2) Sound Communication and environmental auditory Perception Research Group, France


 

Paper Identifier: Mon.P2a.09

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

The Role of Creaky Voice in Mandarin Tone 2 and Tone 3 Perception
Rui Cao,  Ratree Wayland,  Edith Kaan
Department of Linguistics, University of Florida, U.S.A.


 

Paper Identifier: Mon.P2a.04

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Detection of Transition Segments in VCV Utterances for Estimation of the Place of Closure of Oral Stops for Speech Training
K. S. Nataraj and P. C. Pandey
Indian Institute of Technology Bombay, India


 

Paper Identifier: Mon.P2a.02

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Perceptual Learning of /f/-/s/ by Older Listeners
Odette Scharenborg1,  Esther Janse2,  Andrea Weber1
1Max Planck Institute for Psycholinguistics, the Netherlands, 2Centre for Language Studies, Radboud University Nijmegen, The Netherlands


 

Paper Identifier: Mon.P2a.05

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Audiovisual discrimination of CV syllables: a simultaneous fMRI-EEG study
Cyril Dubois1 and Rudolph Sock2
1University of Zürich, Switzerland, 2University of Strasbourg, France


 

Paper Identifier: Mon.P2a.07

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Methodological Issues in Assessing Perceptual Representation of Consonant Sounds in Thai
Charturong Tantibundhit1,  Chutamanee Onsuwan1,  P. Phienphanich1,  Chai Wutiwiwatchai2
1Thammasat University, Thailand, 2NECTEC


 

Paper Identifier: Mon.P2a.10

Monday

16:00 - 18:00  Exhibition Hall

 

Perceptual Learning and Perceptual Cues to Segments and Tones

 

 

Can litheners retune native categories acroth a thoneme boundary?
Michael Tyler and Mona Faris
University of Western Sydney, Australia


Paper Identifier: Mon.P2b.01

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Synthetic F0 Can Effectively Convey Speaker ID in Delexicalized Speech
Eric Morley,  Esther Klabbers,  Jan van Santen,  Alexander Kain,  Seyed Hamidreza Mohammadi
CSLU/Oregon Health & Science University, USA


 

Paper Identifier: Mon.P2b.02

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Evaluating Prosodic Processing for Incremental Speech Synthesis
Timo Baumann1 and David Schlangen2
1University of Hamburg, Germany, 2University of Bielefeld, Germany


 

Paper Identifier: Mon.P2b.03

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Expressing Speaker's Intentions through Sentence-Final Intonations for Japanese Conversational Speech Synthesis
Kazuhiko Iwata and Tetsunori Kobayashi
Waseda University, Japan


 

Paper Identifier: Mon.P2b.04

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Modeling Pause-Duration for Style-Specific Speech Synthesis
Alok Parlikar and Alan W Black
Carnegie Mellon University, USA


 

Paper Identifier: Mon.P2b.05

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Enumerating Differences Between Various Communicative Functions for Purposes of Czech Expressive Speech Synthesis in Limited Domain
Martin Gruber
University of West Bohemia


 

Paper Identifier: Mon.P2b.06

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Quality Analysis of Macroprosodic F0 Dynamics in Text-to-Speech Signals
Christoph Norrenbrock1,  Florian Hinterleitner2,  Ulrich Heute1,  Sebastian Möller2
1DSS, University of Kiel, Germany, 2QU Lab, TU Berlin, Germany


 

Paper Identifier: Mon.P2b.07

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Improved Automatic Extraction of Generation Process Model Commands and Its use for Generating Fundamental Frequency Contours for Training HMM-based Speech Synthesis
Hiroya Hashimoto,  Keikichi Hirose,  Nobuaki Minematsu
The University of Tokyo


 

Paper Identifier: Mon.P2b.08

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Discontinuous Observation HMM for Prosodic-Event-Based F0 Generation
Tomoki Koriyama,  Takashi Nose,  Takao Kobayashi
Tokyo Institute of Technology, Japan


 

Paper Identifier: Mon.P2b.09

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Hierarchical English Emphatic Speech Synthesis Based on HMM with Limited Training Data
Fanbo Meng1,  Zhiyong Wu2,  Helen Meng3,  Jia Jia1,  Lianhong Cai1
1Department of Computer Science and Technology, Tsinghua University, China, 2Graduate School at Shenzhen, Tsinghua University, China, 3Department of Systems Engineering and Engineering Management The Chinese University of Hong Kong, China


 

Paper Identifier: Mon.P2b.10

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

Employing Sentence Structure: Syntax Trees as Prosody Generators
Sarah Hoffmann and Beat Pfister
ETH Zurich, Switzerland


 

Paper Identifier: Mon.P2b.11

Monday

16:00 - 18:00  Exhibition Hall

 

Speech Synthesis: Prosody

 

 

A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components
Yasunori Ohishi1,  Hirokazu Kameoka1,  Daichi Mochihashi2,  Kunio Kashino1
1NTT Communication Science Laboratories, NTT Corporation, 2The Institute of Statistical Mathematics


 

Paper Identifier: Mon.P2c.01

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

Study on Integration of Speaker Diarization with Speaker Adaptive Speech Recognition for Broadcast Transcription
Jan Silovsky,  Petr Cerva,  Jindrich Zdansky,  Jan Nouza
Technical University of Liberec


 

Paper Identifier: Mon.P2c.02

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

On the Use of Spectral and Iterative Methods for Speaker Diarization
Stephen Shum,  Najim Dehak,  Jim Glass
MIT Computer Science and Artificial Intelligence Laboratory, USA


 

Paper Identifier: Mon.P2c.03

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

Where did I go wrong?: Identifying troublesome segments for speaker diarization systems
Mary Tai Knox,  Nikki Mirghafori,  Gerald Friedland
International Computer Science Institute, USA


 

Paper Identifier: Mon.P2c.04

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

Speaker diarization of overlapping speech based on silence distribution in meeting recordings
Sree Harsha Yella and Fabio Valente
Idiap Research Institute, Switzerland


 

Paper Identifier: Mon.P2c.05

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

Phone Adaptive Training for Speaker Diarization
Simon Bozonnet,  Ravichander Vipperla,  Nicholas Evans
EURECOM, France


 

Paper Identifier: Mon.P2c.06

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

Compensating for Ageing and Quality variation in Speaker Verification
Finnian Kelly1,  Andrzej Drygajlo2,  Naomi Harte1
1Trinity College Dublin, Ireland, 2Swiss Federal Institute of Technology Lausanne (EPFL), Lausanne, Switzerland


 

Paper Identifier: Mon.P2c.07

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

Calibration of probabilistic age recognition
David van Leeuwen1 and Mohamad Hasan Bahari2
1Radboud University Nijmegen, The Netherlands, 2Katholieke Universiteit Leuven, Belgium


 

Paper Identifier: Mon.P2c.08

Monday

16:00 - 18:00  Exhibition Hall

 

Speaker Diarization and Age Recognition

 

 

Age Estimation from Telephone Speech using i-vectors
Mohamad Hasan Bahari1,  Mitchell McLaren2,  Hugo Van hamme1,  David Van Leeuwen2
1KU Leuven, 2Radboud University Nijmegen


 

Paper Identifier: Tue.O3a.01

Tuesday

10:00 - 10:20  Grand Ballroom I

 

ASR: Discriminative Training

 

 

A factorized representation of FMLLR transform based on QR-decomposition
Shakti P. Rath,  Martin Karafiat,  Ondrej Glembek,  Jan “Honza” Cernocky
Brno University of Technology


 

Paper Identifier: Tue.O3a.02

Tuesday

10:20 - 10:40  Grand Ballroom I

 

ASR: Discriminative Training

 

 

A Correlational Discriminant Approach to Feature Extraction for Robust Speech Recognition
Vikrant Singh Tomar and Richard C. Rose
McGill University, Canada


 

Paper Identifier: Tue.O3a.03

Tuesday

10:40 - 11:00  Grand Ballroom I

 

ASR: Discriminative Training

 

 

Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech
Chao Weng1,  Biing-Hwang (Fred) Juang1,  Daniel Povey2
1Center for Signal and Image Processing, Georgia Institute of Technology, Atlanta, GA, USA, 2Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA


 

Paper Identifier: Tue.O3a.04

Tuesday

11:00 - 11:20  Grand Ballroom I

 

ASR: Discriminative Training

 

 

Discriminative Reranking for LVCSR Leveraging Invariant Structure
Masayuki Suzuki1,  Gakuto Kurata2,  Masafumi Nishimura2,  Nobuaki Minematsu1
1The University of Tokyo, Japan, 2IBM Research - Tokyo, Japan


 

Paper Identifier: Tue.O3a.05

Tuesday

11:20 - 11:40  Grand Ballroom I

 

ASR: Discriminative Training

 

 

Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation
Ting-yao Hu1,  Yu Tsao2,  Lin-shan Lee1
1National Taiwan University, Taiwan, 2Academia Sinica, Taiwan


 

Paper Identifier: Tue.O3a.06

Tuesday

11:40 - 12:00  Grand Ballroom I

 

ASR: Discriminative Training

 

 

Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition
Muhammad Ali Tahir,  Markus Nussbaum-Thom,  Ralf Schlueter,  Hermann Ney
RWTH Aachen, Germany


 

Paper Identifier: Tue.O3b.01

Tuesday

10:00 - 10:20  Grand Ballroom II

 

Single Channel Speech Enhancement

 

 

Low-SNR, Speaker-Dependent Speech Enhancement using GMMs and MFCCs
Laura Boucheron and Phillip De Leon
New Mexico State University, USA


 

Paper Identifier: Tue.O3b.02

Tuesday

10:20 - 10:40  Grand Ballroom II

 

Single Channel Speech Enhancement

 

 

Can modified casual speech reach the intelligibility of clear speech?
Maria Koutsogiannaki1,  Michele Pettinato2,  Cassie Mayo3,  Varvara Kandia1,  Yannis Stylianou1
1Institute of Computer Science, FORTH, and Multimedia Informatics Lab, CSD, UoC, Greece, 2Speech, Hearing and Phonetic Sciences, University College London, UK, 3Centre for Speech Technology Research, the University of Edinburgh, UK


 

Paper Identifier: Tue.O3b.03

Tuesday

10:40 - 11:00  Grand Ballroom II

 

Single Channel Speech Enhancement

 

 

Speech Enhancement Using Sparse Convolutive Non-negative Matrix Factorization with Basis Adaptation
Michael Carlin1,  Nicolas Malyska2,  Thomas Quatieri2
1Johns Hopkins University, USA, 2MIT Lincoln Laboratory, USA


 

Paper Identifier: Tue.O3b.04

Tuesday

11:00 - 11:20  Grand Ballroom II

 

Single Channel Speech Enhancement

 

 

Inventory-Based Audio-Visual Speech Enhancement
Dorothea Kolossa1,  Robert Nickel2,  Steffen Zeiler1,  Rainer Martin1
1Ruhr-Universität Bochum, Germany, 2Bucknell University, USA


 

Paper Identifier: Tue.O3b.05

Tuesday

11:20 - 11:40  Grand Ballroom II

 

Single Channel Speech Enhancement

 

 

Utilization of the Lombard effect in post-filtering for intelligibility enhancement of telephone speech
Emma Jokinen1,  Paavo Alku2,  Martti Vainio3
1Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland, 2Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland, 3Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland


 

Paper Identifier: Tue.O3b.06

Tuesday

11:40 - 12:00  Grand Ballroom II

 

Single Channel Speech Enhancement

 

 

Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments
Zhiyao Duan1,  Gautham J. Mysore2,  Paris Smaragdis3
1Northwestern University, USA, 2Adobe Systems Inc., 3University of Illinois at Urbana-Champaign


 

Paper Identifier: Tue.O3c.01

Tuesday

10:00 - 10:20  Pavilion East

 

Conversation and Interaction I

 

 

Phoneme resistance during speech-in-speech comprehension
Léo Varnet1,  Julien Meyer2,  Michel Hoen1,  Fanny Meunier1
1CNRS, France, 2Museu Goeldi, Brasil


 

Paper Identifier: Tue.O3c.02

Tuesday

10:20 - 10:40  Pavilion East

 

Conversation and Interaction I

 

 

Smile with a smile
Hugo Quené and Will Schuerman
Utrecht University, The Netherlands


 

Paper Identifier: Tue.O3c.03

Tuesday

10:40 - 11:00  Pavilion East

 

Conversation and Interaction I

 

 

Interactions Between Turn-taking Gaps, Disfluencies and Social Obligation
Rebecca Lunsford,  Peter A. Heeman,  Jan P. H. van Santen
OHSU, USA


 

Paper Identifier: Tue.O3c.04

Tuesday

11:00 - 11:20  Pavilion East

 

Conversation and Interaction I

 

 

Effect of being seen on the production of visible speech cues. A pilot study on Lombard speech
Maëva Garnier1,  Lucie Menard2,  Gabrielle Richard2
1Speech and Cognition Department, GIPSA-Lab, UMR CNRS 5216 & Grenoble Universités, France, 2Laboratoire de phonétique, Université du Québec à Montréal, Canada


 

Paper Identifier: Tue.O3c.05

Tuesday

11:20 - 11:40  Pavilion East

 

Conversation and Interaction I

 

 

Temporal entrainment in overlapped speech: Cross-linguistic study
Marcin Wlodarczak,  Juraj Simko,  Petra Wagner
Bielefeld University, Germany


 

Paper Identifier: Tue.O3c.06

Tuesday

11:40 - 12:00  Pavilion East

 

Conversation and Interaction I

 

 

Based on Isolated Saliency or Causal Integration? Toward a Better Understanding of Human Annotation Process using Multiple Instance Learning and Sequential Probability Ratio Test
Chi-Chun Lee,  Athanasios Katsamanis,  Panayiotis Georgiou,  Shrikanth Narayanan
Signal Analysis and Interpretation Laboratory, University of Southern California, USA


 

Paper Identifier: Tue.O3d.01

Tuesday

10:00 - 10:20  Pavilion West

 

Speech Synthesis: Intelligibility

 

 

Text-To-Speech Intelligibility Across Speech Rates
Ann Syrdal1,  H. Timothy Bunnell2,  Susan Hertz3,  Taniya Mishra1,  Murray Spiegel4,  Corine Bickley5,  Deborah Rekart6,  Matthew Makashay7
1AT&T Labs - Research, USA, 2Nemours Biomedical Research, 3NovaSpeech LLC and Cornell University, 4Applied Communication Sciences, 5Gallaudet University, 6AT&T Services, 7Walter Reed National Military Medical Center


 

Paper Identifier: Tue.O3d.02

Tuesday

10:20 - 10:40  Pavilion West

 

Speech Synthesis: Intelligibility

 

 

Objective Intelligibility Assessment of Text-to-Speech System using Template Constrained Generalized Posterior Probability
Linfang Wang1,  Lijuan Wang2,  Yan Teng3,  Zhe Geng1,  Frank Soong2
1Microsoft Bing, Beijing, China, 2Microsoft Research Asia, Beijing, China, 3Microsoft Bing, Redmond, WA, U.S.


 

Paper Identifier: Tue.O3d.03

Tuesday

10:40 - 11:00  Pavilion West

 

Speech Synthesis: Intelligibility

 

 

Mel cepstral coefficient modification based on the Glimpse Proportion measure for improving the intelligibility of HMM-generated synthetic speech in noise
Cassia Valentini-Botinhao,  Junichi Yamagishi,  Simon King
University of Edinburgh, United Kingdom


 

Paper Identifier: Tue.O3d.04

Tuesday

11:00 - 11:20  Pavilion West

 

Speech Synthesis: Intelligibility

 

 

Speech-in-noise intelligibility improvement based on spectral shaping and dynamic range compression
Tudor-Catalin Zorila1,  Varvara Kandia2,  Yannis Stylianou2
1UPB, Romania, 2ICS-FORTH, Greece


 

Paper Identifier: Tue.O3d.05

Tuesday

11:20 - 11:40  Pavilion West

 

Speech Synthesis: Intelligibility

 

 

Implementation of Simple Spectral Techniques to Enhance the Intelligibility of Speech using a Harmonic Model
Daniel Erro1,  Yannis Stylianou2,  Eva Navas1,  Inma Hernaez1
1UPV/EHU, Spain, 2UoC & FORTH, Greece


 

Paper Identifier: Tue.O3d.06

Tuesday

11:40 - 12:00  Pavilion West

 

Speech Synthesis: Intelligibility

 

 

Making Conversational Vowels More Clear
Seyed Hamidreza Mohammadi,  Alexander Kain,  Jan van Santen
OHSU, USA


 

Paper Identifier: Tue.SS3.01

Tuesday

10:00 - 10:20  Galleria

 

Speech and Language Technologies for STEM

 

 

Prosodic Cues to Disengagement and Uncertainty in Physics Tutorial Dialogues
Diane Litman,  Heather Friedberg,  Kate Forbes-Riley
University of Pittsburgh, US


 

Paper Identifier: Tue.SS3.02

Tuesday

10:20 - 10:40  Galleria

 

Speech and Language Technologies for STEM

 

 

Spoken Dialogs With a Virtual Science Tutor
Wayne Ward,  Daniel Bolanos,  Ronald Cole
Boulder Language Technologies, Boulder, CO, USA


 

Paper Identifier: Tue.SS3.03

Tuesday

10:40 - 12:00  Galleria

 

Speech and Language Technologies for STEM

 

 

Real-Time Lecture Transcription using ASR for Czech Hearing Impaired or Deaf Students
Petr Cerva,  Jan Silovsky,  Jindrich Zdansky,  Jan Nouza,  Jiri Malek
Technical University of Liberec, Czech Republic


 

Paper Identifier: Tue.SS3.04

Tuesday

10:40 - 12:00  Galleria

 

Speech and Language Technologies for STEM

 

 

Application of Structural Events Detected on ASR Outputs for Automated Speaking Assessment
Lei Chen and Su-Youn Yoon
ETS


 

Paper Identifier: Tue.SS3.05

Tuesday

10:40 - 12:00  Galleria

 

Speech and Language Technologies for STEM

 

 

Addressing Confusions in Spoken Language in ESL Pronunciation Tutors
Oscar Saz and Maxine Eskenazi
Carnegie Mellon University, USA


 

Paper Identifier: Tue.SS3.06

Tuesday

10:40 - 12:00  Galleria

 

Speech and Language Technologies for STEM

 

 

The Use of DBN-HMMs for Mispronunciation Detection and Diagnosis in L2 English to Support Computer-Aided Pronunciation Training
Xiaojun Qian1,  Helen Meng1,  Frank Soong2
1Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong SAR, China, 2Speech Group, Microsoft Research Asia, Beijing, China


 

Paper Identifier: Tue.SS3.07

Tuesday

10:40 - 12:00  Galleria

 

Speech and Language Technologies for STEM

 

 

Practice and feedback in L2 speaking: an evaluation of the DISCO CALL system
Catia Cucchiarini,  Joost van Doremalen,  Helmer Strik
Centre for Language and Speech Technology, Radboud University, Nijmegen, The Netherlands


 

Paper Identifier: Tue.SS3.08

Tuesday

10:40 - 12:00  Galleria

 

Speech and Language Technologies for STEM

 

 

Cross-speaker Acoustic-to-Articulatory Inversion using Phone-based Trajectory HMM for Pronunciation Training
Thomas Hueber1,  Atef Ben-Youssef2,  Gérard Bailly2,  Pierre Badin2,  Frédéric Eliséi2
1GIPSA-lab/CNRS, France, 2GIPSA-lab, France


 

Paper Identifier: Tue.P3a.01

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Naturalness Judgement of Prosodic Variation of Japanese Utterances with Prosody Modified Stimuli
Chiharu Tsurutani1 and Shunichi Ishihara2
1Griffith University, Australia, 2Australian National University, Australia


 

Paper Identifier: Tue.P3a.02

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Effects of Dialectal Origin on Articulation Rate in French
Mathieu Avanzi1,  Pauline Dubosson1,  Sandra Schwab2
1University of Neuchâtel, Neuchâtel, Switzerland, 2University of Geneva, Switzerland


 

Paper Identifier: Tue.P3a.03

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

A New Approach of Speaking Rate Modeling for Mandarin Speech Prosody
Chiao-Hua Hsieh1,  Chen-Yu Chiang1,  Yih-Ru Wang2,  Hsiu-Min Yu3,  Sin-Horng Chen1
1Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan, 2National Chiao Tung University, Taiwan, 3Language Center, Chung Hua University, Taiwan


 

Paper Identifier: Tue.P3a.04

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Modelling pause duration as a function of contextual length
David Doukhan,  Albert Rilliard,  Sophie Rosset,  Christophe D'Alessandro
LIMSI-CNRS, UPR 3251, France


 

Paper Identifier: Tue.P3a.05

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Production and Perception of Focus in PFC and non-PFC Languages: Comparing Beijing Mandarin and Hainan Tsat
Bei Wang1,  Chenxia Li1,  Qian Wu1,  Xiaxia Zhang1,  Baofeng Wang1,  Yi Xu2
1Minzu University of China, 2University College London


 

Paper Identifier: Tue.P3a.06

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Prosodic Realization of Focus in Statement and Question in Tibetan (Lhasa Dialect)
Xiaxia Zhang1,  Bei Wang1,  Qian Wu1,  Yi Xu2
1Minzu University of China, China, 2University College London, UK


 

Paper Identifier: Tue.P3a.07

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Effect of noise type and level on focus related fundamental frequency changes
Martti Vainio1,  Daniel Aalto1,  Antti Suni1,  Anja Arnhold2,  Tuomo Raitio3,  Henri Seijo3,  Juhani Järvikivi4,  Paavo Alku3
1University of Helsinki, Finland, 2Goethe-University Frankfurt am Main, Germany, 3Aalto University, Finland, 4NTNU, Norway


 

Paper Identifier: Tue.P3a.08

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Role of Prosody in Automatic Modality Recognition of Bangla Speech
Anal Warsi,  Tulika Basu,  Debasis Mazumdar
Centre for Development of Advanced Computing (C-DAC), Kolkata, India


 

Paper Identifier: Tue.P3a.09

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

Where to associate stressed additive particles? Evidence from speech prosody
Bettina Braun
University of Konstanz


 

Paper Identifier: Tue.P3a.10

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

From PVI to Perception: A Return to the Roots of Rhythm in Broadcast News
Matthew Benton
The University of Texas at Arlington, USA


 

Paper Identifier: Tue.P3a.11

Tuesday

10:00 - 12:00  Exhibition Hall

 

Prosody I

 

 

A methodology for the study of rhythm in drummed forms of languages: application to Bora Manguaré of Amazon
Julien Meyer1,  Laure Dentel2,  Frank Seifart3
1Área de Linguística, Museu Goeldi, Ministério de Ciência, Tecnologia e Inovação, Brazil; Sound Communication and environmental auditory Perception Research Group, France, 2Área de Linguística, Museu Goeldi, Ministério de Ciência, Tecnologia e Inovação, Brazil; Sound Communication and environmental auditory Perception Research Group, France; Engenharia Elétrica, Universidade Federal do Pará (UFPA), Brazil, 3Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany


 

Paper Identifier: Tue.P3b.01

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Automatic Detection of High Vocal Effort in Telephone Speech
Jouni Pohjalainen,  Tuomo Raitio,  Hannu Pulakka,  Paavo Alku
Aalto University, Finland


 

Paper Identifier: Tue.P3b.02

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Analysis of Mimicry Speech
Gomathi D,  Sathya Adithya Thati,  Karthik Venkat Sridaran,  Yegnanarayana B
International Institute of Information Technology, Hyderabad, India


 

Paper Identifier: Tue.P3b.03

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Estimation of the vocal tract shape of nasals using a Bayesian scheme
Christian H. Kasess,  Wolfgang Kreuzer,  Ewald Enzinger,  Nadja Kerschhofer-Puhalo
Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria


 

Paper Identifier: Tue.P3b.04

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Advances in combined electro-optical palatography
Peter Birkholz,  Philippe Daechert,  Christiane Neuschaefer-Rube
Clinic of Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital Aachen, Aachen, Germany


 

Paper Identifier: Tue.P3b.05

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Noise Robust Pitch Tracking by Subband Autocorrelation Classification
Byung Suk Lee and Daniel P. W. Ellis
Columbia University, USA


 

Paper Identifier: Tue.P3b.06

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Inference of Critical Articulator Position for Fricative Consonants
Alexander Sepulveda1,  Rodrigo Capobianco-Guido2,  German Castellanos-Dominguez1
1Universidad Nacional de Colombia, Manizales, Colombia, 2University of São Paulo (USP), São Carlos, Brazil


 

Paper Identifier: Tue.P3b.07

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Vocal Tremor Measurement Based on Autocorrelation of Contours
Markus Brückl
Technische Universität Berlin, Germany


 

Paper Identifier: Tue.P3b.08

Tuesday

10:00 - 12:00  Exhibition Hall

 

Speech Analysis

 

 

Model-based Duration-difference Approach on Accent Evaluation of L2 Learner
Chatchawarn Hansakunbuntheung,  Ananlada Chotimongkol,  Sumonmas Thatphithakkul,  Patcharika Chootrakool
National Electronics and Computer Technology Center, Thailand


 

Paper Identifier: Tue.P3c.01

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

Continuous Articulatory-to-Acoustic Mapping using Phone-based Trajectory HMM for a Silent Speech Interface
Thomas Hueber1,  Gérard Bailly2,  Bruce Denby3
1GIPSA-lab/CNRS, France, 2GIPSA-lab, France, 3UPMC/ESPCI-ParisTech, France


 

Paper Identifier: Tue.P3c.02

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations
Tatsuya Kawahara,  Takuma Iwatate,  Katsuya Takanashi
Kyoto University, Japan


 

Paper Identifier: Tue.P3c.03

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

Using Quality Ratings to Predict Modality Choice in Multimodal Systems
Ina Wechsung,  Klaus-Peter Engelbrecht,  Sebastian Möller
QU Lab, Telekom Innovation Laboratories, TU Berlin, Germany


 

Paper Identifier: Tue.P3c.04

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

HMM Based Continuous EOG Recognition for Eye-input Speech Interface
Fuming Fang1,  Takahiro Shinozaki1,  Yasuo Horiuchi1,  Shingo Kuroiwa1,  Sadaoki Furui2,  Toshimitsu Musha3
1Chiba University, Japan, 2Tokyo Institute of Technology, Japan, 3Brain Functions Laboratory Inc., Japan


 

Paper Identifier: Tue.P3c.05

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

A Random, Semantically Appropriate Sentence Generator for Speaker Verification
Jason Lilley1,  Amanda Stent2,  Ilija Zeljkovic2
1University of Delaware, USA, 2AT&T Labs - Research, USA


 

Paper Identifier: Tue.P3c.06

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

Coherent Topic Transition in a Conversational Agent
Daniel Macias-Galindo,  Wilson Wong,  Lawrence Cavedon,  John Thangarajah
RMIT University, Melbourne, Australia


 

Paper Identifier: Tue.P3c.07

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence
Peter Heeman,  Jordan Fryer,  Rebecca Lunsford,  Andrew Rueckert,  Ethan Selfridge
OHSU, USA


 

Paper Identifier: Tue.P3c.08

Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

Enhancing Speech Understanding in Spoken Dialogue Systems by Means of a New Frame-Correction Technique
Ramon Lopez-Cozar1,  Zoraida Callejas1,  David Griol2
1University of Granada, Spain, 2Carlos III University of Madrid, Spain


 

Paper Identifier: Tue.P3c.09 (Originally Mon.P1d.10)

 
Tuesday

10:00 - 12:00  Exhibition Hall

 

Dialog Systems

 

 

Assessment of user simulators for spoken dialogue systems by means of subspace multidimensional clustering
Zoraida Callejas1,  David Griol2,  Klaus-Peter Engelbrecht3
1University of Granada, Spain, 2University Carlos III of Madrid, Spain, 3Deutsche Telekom Laboratories, TU Berlin, Germany

 

 


 

 

Paper Identifier: Tue.O4a.01

Tuesday

13:30 - 13:50  Grand Ballroom I

 

ASR: Bayesian Modeling

 

 

MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors
Keith Kintzley,  Aren Jansen,  Hynek Hermansky
Johns Hopkins University


 

Paper Identifier: Tue.O4a.02

Tuesday

13:50 - 14:10  Grand Ballroom I

 

ASR: Bayesian Modeling

 

 

Data-driven Posterior Features for Low Resource Speech Recognition Applications
Samuel Thomas,  Sriram Ganapathy,  Aren Jansen,  Hynek Hermansky
JHU, USA


 

Paper Identifier: Tue.O4a.03

Tuesday

14:10 - 14:30  Grand Ballroom I

 

ASR: Bayesian Modeling

 

 

Sparse Bayesian Factor Analysis for Stereo-based Stochastic Mapping
Xiaodong Cui1,  Mohamed Afify2,  George Saon1,  Vaibhava Goel1
1IBM T. J. Watson Research Center, USA, 2Orange Lab, Egypt


 

Paper Identifier: Tue.O4a.04

Tuesday

14:30 - 14:50  Grand Ballroom I

 

ASR: Bayesian Modeling

 

 

Word Discovery with Beta Process Factor Analysis
Niklas Vanhainen and Giampiero Salvi
KTH, Sweden


 

Paper Identifier: Tue.O4a.05

Tuesday

14:50 - 15:10  Grand Ballroom I

 

ASR: Bayesian Modeling

 

 

Speaker Adaptation Using Variational Bayesian Linear Regression in Normalized Feature Space
Seong-Jun Hahm,  Atsunori Ogawa,  Masakiyo Fujimoto,  Takaaki Hori,  Atsushi Nakamura
NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan


 

Paper Identifier: Tue.O4a.06

Tuesday

15:10 - 15:30  Grand Ballroom I

 

ASR: Bayesian Modeling

 

 

Bayesian Feature Enhancement for ASR of Noisy Reverberant Real-World Data
Alexander Krueger1,  Oliver Walter2,  Volker Leutnant2,  Reinhold Haeb-Umbach2
1Research & Innovation, Technicolor, 30625 Hannover, Germany, 2Department of Communications Engineering, University of Paderborn, Germany


 

Paper Identifier: Tue.O4b.01

Tuesday

13:30 - 13:50  Grand Ballroom II

 

Computer Assisted Language Learning I

 

 

Robust Tracking for Automatic Reading Tutors
Emre Yilmaz,  Dirk Van Compernolle,  Hugo Van hamme
KU Leuven, Belgium


 

Paper Identifier: Tue.O4b.02

Tuesday

13:50 - 14:10  Grand Ballroom II

 

Computer Assisted Language Learning I

 

 

Maximum F1-Score Discriminative Training for Automatic Mispronunciation Detection in Computer-Assisted Language Learning
Hao Huang,  Jianming Wang,  Halidan Abudureyimu
Xinjiang University, China


 

Paper Identifier: Tue.O4b.03

Tuesday

14:10 - 14:30  Grand Ballroom II

 

Computer Assisted Language Learning I

 

 

Error Pattern Detection Integrating Generative and Discriminative Learning for Computer-Aided Pronunciation Training
Yow-Bang Wang and Lin-Shan Lee
National Taiwan University, Taiwan


 

Paper Identifier: Tue.O4b.04

Tuesday

14:30 - 14:50  Grand Ballroom II

 

Computer Assisted Language Learning I

 

 

The Automatic Assessment of Non-native Prosody: Combining Classical Prosodic Analysis with Acoustic Modelling
Florian Hönig,  Tobias Bocklet,  Korbinian Riedhammer,  Anton Batliner,  Elmar Nöth
Universitaet Erlangen-Nuernberg, Germany


 

Paper Identifier: Tue.O4b.05

Tuesday

14:50 - 15:10  Grand Ballroom II

 

Computer Assisted Language Learning I

 

 

Improving L1-Specific Phonological Error Diagnosis in Computer Assisted Pronunciation Training
Theban Stanley and Kadri Hacioglu
Rosetta Stone, USA


 

Paper Identifier: Tue.O4b.06

Tuesday

15:10 - 15:30  Grand Ballroom II

 

Computer Assisted Language Learning I

 

 

A Self-Learning Assistive Vocal Interface Based on Vocabulary Learning and Grammar Induction
Jort F. Gemmeke1,  Janneke van de Loo2,  Guy De Pauw2,  Joris Driesen1,  Hugo Van hamme1,  Walter Daelemans2
1KU Leuven, Belgium, 2University of Antwerp, Belgium


 

Paper Identifier: Tue.O4c.01

Tuesday

13:30 - 13:50  Pavilion East

 

Conversation and Interaction II

 

 

Contrasting Cues to Verbal and Non-Verbal Backchannels in Multi-lingual Dyadic Rapport
Gina-Anne Levow1 and Susan Duncan2
1University of Washington, 2University of Chicago


 

Paper Identifier: Tue.O4c.02

Tuesday

13:50 - 14:10  Pavilion East

 

Conversation and Interaction II

 

 

Prosodic measurements and question types in the Spontal corpus of Swedish dialogues
Sofia Strömbergsson,  Jens Edlund,  David House
KTH, Sweden


 

Paper Identifier: Tue.O4c.03

Tuesday

14:10 - 14:30  Pavilion East

 

Conversation and Interaction II

 

 

Measuring prosodic alignment in cooperative task-based conversations
Khiet Truong and Dirk Heylen
University of Twente, The Netherlands


 

Paper Identifier: Tue.O4c.04

Tuesday

14:30 - 14:50  Pavilion East

 

Conversation and Interaction II

 

 

On the Dynamics of Overlap in Multi-Party Conversation
Kornel Laskowski1,  Mattias Heldner2,  Jens Edlund3
1Carnegie Mellon University, United States, 2Stockholm University, Sweden, 3KTH, Sweden


 

Paper Identifier: Tue.O4c.05

Tuesday

14:50 - 15:10  Pavilion East

 

Conversation and Interaction II

 

 

On the acoustics of overlapping laughter in conversational speech
Khiet Truong1 and Jürgen Trouvain2
1University of Twente, The Netherlands, 2Saarland University, Germany


 

Paper Identifier: Tue.O4c.06

Tuesday

15:10 - 15:30  Pavilion East

 

Conversation and Interaction II

 

 

A Corpus-Based Study of Interruptions in Spoken Dialogue
Agustin Gravano1 and Julia Hirschberg2
1Universidad de Buenos Aires, Argentina, 2Columbia University, USA


 

Paper Identifier: Tue.O4d.01

Tuesday

13:30 - 13:50  Pavilion West

 

Speech Analysis and Modeling

 

 

On the Modeling of Voiceless Stop Sounds of Speech using Adaptive Quasi-Harmonic Models
George Kafentzis1,  Olivier Rosec2,  Yannis Stylianou1
1Greece, 2France


 

Paper Identifier: Tue.O4d.02

Tuesday

13:50 - 14:10  Pavilion West

 

Speech Analysis and Modeling

 

 

An alignment matching method to explore pseudosyllable properties across different corpora
Raymond W. M. Ng1,  Thomas Hain1,  Keikichi Hirose2
1The University of Sheffield, United Kingdom, 2The University of Tokyo, Japan


 

Paper Identifier: Tue.O4d.03

Tuesday

14:10 - 14:30  Pavilion West

 

Speech Analysis and Modeling

 

 

Deep Architectures for Articulatory Inversion
Benigno Uria,  Iain Murray,  Steve Renals,  Korin Richmond
University of Edinburgh, U.K.


 

Paper Identifier: Tue.O4d.04

Tuesday

14:30 - 14:50  Pavilion West

 

Speech Analysis and Modeling

 

 

Automatic Measurement of Positive and Negative Voice Onset Time
Katharine Henry1,  Morgan Sonderegger1,  Joseph Keshet2
1University of Chicago, 2TTI-Chicago


 

Paper Identifier: Tue.O4d.05

Tuesday

14:50 - 15:10  Pavilion West

 

Speech Analysis and Modeling

 

 

Efficient multipulse approximation of speech excitation using the most singular manifold
Vahid Khanagha and Khalid Daoudi
INRIA Bordeaux Sud-Ouest, France


 

Paper Identifier: Tue.O4d.06

Tuesday

15:10 - 15:30  Pavilion West

 

Speech Analysis and Modeling

 

 

Intrinsic Spectral Analysis for Zero and High Resource Speech Recognition
Aren Jansen,  Samuel Thomas,  Hynek Hermansky
Johns Hopkins University


 

Paper Identifier: Tue.SS4.01

Tuesday

13:30 - 13:50  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 1

 

 

Fully Automated Neuropsychological Assessment for Detecting Mild Cognitive Impairment
Maider Lehr,  Emily Prud’hommeaux,  Izhak Shafran,  Brian Roark
USA


 

Paper Identifier: Tue.SS4.02

Tuesday

13:50 - 14:10  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 1

 

 

Spontaneous-Speech Acoustic-Prosodic Features of Children with Autism and the Interacting Psychologist
Daniel Bone1,  Matthew P. Black1,  Chi-Chun Lee1,  Marian E. Williams2,  Pat Levitt3,  Sungbok Lee1,  Shrikanth Narayanan1
1Signal Analysis and Interpretation Laboratory (SAIL), USC, Los Angeles, CA, USA, 2University Center for Excellence in Developmental Disabilities, Keck School of Medicine of USC, 3Zilkha Neurogenetic Institute, Keck School of Medicine of USC


 

Paper Identifier: Tue.SS4.03

Tuesday

14:10 - 14:30  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 1

 

 

Contrastive intonation in autism: The effect of speaker- and listener-perspective
Constantijn Kaland,  Emiel Krahmer,  Marc Swerts
Tilburg centre for Communication and Cognition (TiCC), The Netherlands


 

Paper Identifier: Tue.SS4.04

Tuesday

14:30 - 14:50  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 1

 

 

Characterizing Covert Articulation in Apraxic Speech Using real-time MRI
Christina Hagedorn1,  Michael Proctor1,  Louis Goldstein1,  Maria Luisa Gorno-Tempini2,  Shrikanth Narayanan1
1University of Southern California, USA, 2University of California-San Francisco, USA


 

Paper Identifier: Tue.SS4.05

Tuesday

14:50 - 15:10  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 1

 

 

Automatic word naming recognition for treatment and assessment of aphasia
Alberto Abad1,  Anna Pompili1,  Angela Costa1,  Isabel Trancoso2
1INESC-ID Lisboa, Portugal, 2INESC-ID Lisboa/IST, Portugal


 

Paper Identifier: Tue.SS4.06

Tuesday

15:10 - 15:30  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 1

 

 

Vocal-Source Biomarkers for Depression: A Link to Psychomotor Activity
Thomas Quatieri and Nicolas Malyska
MIT Lincoln Laboratory, US


 

Paper Identifier: Tue.P4a.01

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Computational Modelling of the Recognition of Foreign-Accented Speech
Odette Scharenborg,  Marijt Witteman,  Andrea Weber
Max Planck Institute for Psycholinguistics, the Netherlands


 

Paper Identifier: Tue.P4a.02

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

The production and perception of Estonian quantity degrees by native and non-native speakers
Lya Meister and Einar Meister
Tallinn University of Technology, Estonia


 

Paper Identifier: Tue.P4a.03

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Perception of the moraic obstruent /Q/: a cross-linguistic study
Makiko Sadakata1,  Mizuki Shingai2,  Alex Brandmeyer1,  Kaoru Sekiyama2
1Donders Institute for Brain, Cognition and Behaviour, Centre for Cognition, The Netherlands, 2Division of Cognitive Psychology, Kumamoto University, Japan


 

Paper Identifier: Tue.P4a.04

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Comparative Analysis of Intensity between Native Speakers and Japanese Speakers of English
Tomoko Nariai1,  Kazuyo Tanaka2,  Tatsuya Kawahara3
1Ibaraki Women’s Junior College, 2University of Tsukuba, 3Kyoto University


 

Paper Identifier: Tue.P4a.05

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Auditory and Dynamic Modeling Paradigms to Detect L2 Mispronunciations
Christos Koniaris,  Olov Engwall,  Giampiero Salvi
KTH - Royal Institute of Technology, Sweden


 

Paper Identifier: Tue.P4a.06

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Cross Linguistic Comparison of Mandarin and English EMA Articulatory Data
Sheng Li and Lan Wang
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China


 

Paper Identifier: Tue.P4a.07

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Physiological and acoustic study of word initial post-lexical gemination in Moroccan Arabic
Chakir Zeroual1,  Diamantis Gafos2,  Phil Hoole3,  John Esling4
1Faculté Polydisciplinaire de Taza, BP. 1223, Taza, Maroc. & Laboratoire de Phonétique et Phonologie, CNRS-UMR7018, Sorbonne Nouvelle, Paris., 2Haskins Laboratories, New Haven, USA., 3Institut fuer Phonetik und Sprachverarbeitung, University of Munchen, Germany., 4Department of Linguistics, University of Victoria, Canada.


 

Paper Identifier: Tue.P4a.08

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Perceptual Assimilation of Arabic Voiceless Fricatives by English Monolinguals
Michael Tyler and Sarah Fenwick
University of Western Sydney, Australia


 

Paper Identifier: Tue.P4a.09

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Non-auditory cognitive capabilities in computational modeling of early language acquisition
Okko Räsänen
Department of Signal Processing and Acoustics, Aalto University, Finland


 

Paper Identifier: Tue.P4a.10

Tuesday

13:30 - 15:30  Exhibition Hall

 

Language Learning and Cross-Language Production and Perception

 

 

Modeling spoken language acquisition with a generic cognitive architecture for associative learning
Okko Räsänen,  Heikki Rasilo,  Unto K. Laine
Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland


 

Paper Identifier: Tue.P4b.01

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Pitch Estimation Based on Long Frame Harmonic Model and Short Frame Average Correlation Coefficient
Dongmei Wang and Philipos C. Loizou
University of Texas at Dallas, USA


 

Paper Identifier: Tue.P4b.02

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Diagnostic Prediction of Transmitted Speech Quality: A New Framework for Signal-based and Parametric Models
Sebastian Möller1,  Marcel Wältermann1,  Nicolas Côté2
1Quality and Usability Lab, Telekom Innovation Laboratories, TU Berlin, Germany, 2Institute of Electronics, Microelectronics and Nanotechnology, UMR CNRS 8520, ISEN, Lille, France


 

Paper Identifier: Tue.P4b.03

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Enumerative Algebraic Coding for ACELP
Tom Bäckström
Fraunhofer IIS, Germany


 

Paper Identifier: Tue.P4b.04

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Speech Enhancement With Bivariate Gamma Model
Atanu Saha and Tetsuya Shimamura
Saitama University, Japan


 

Paper Identifier: Tue.P4b.05

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Improvements of the Beta-Order Minimum Mean-Square Error (MMSE) Spectral Amplitude Estimator using Chi Priors
Marek Trawicki and Michael Johnson
Marquette University, USA


 

Paper Identifier: Tue.P4b.06

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Enhancing Speech by Reconstruction from Robust Acoustic Features
Philip Harding and Ben Milner
University of East Anglia, UK


 

Paper Identifier: Tue.P4b.07

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Joint Pitch-Analysis Formant-Synthesis framework for CS recovery of speech
Srikanth Raj Chetupally and Sreenivas V. Thippur
Indian Institute of Science, Bangalore, India


 

Paper Identifier: Tue.P4b.08

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

A new noise-tracking algorithm for generalizing binary time-frequency (T-F) masking to ratio masking
Shan Liang,  Wei Jiang,  Wenju Liu
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing


 

Paper Identifier: Tue.P4b.09

Tuesday

13:30 - 15:30  Exhibition Hall

 

Enhancement and Coding

 

 

Optimised spectral weightings for noise-dependent speech intelligibility enhancement
Yan Tang1 and Martin Cooke2
1Language and Speech Laboratory, Universidad del País Vasco, 2Ikerbasque (Basque Science Foundation)


 

Paper Identifier: Tue.P4c.01

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive Training
Langzhou Chen1,  Mark Gales1,  Vincent Wan1,  Javier Latorre1,  Masami Akamine2
1Toshiba Research Europe Ltd., UK, 2Toshiba Corporate Research & Development Center, Japan


 

Paper Identifier: Tue.P4c.02

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS
Ji He1,  Yao Qian1,  Frank K. Soong1,  Sheng Zhao2
1Microsoft Research Asia, China, 2Microsoft Search Technology Center Asia, China


 

Paper Identifier: Tue.P4c.03

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Using HMM-based Speech Synthesis to Reconstruct the Voice of Individuals with Degenerative Speech Disorders
Christophe Veaux,  Junichi Yamagishi,  Simon King
CSTR, UK


 

Paper Identifier: Tue.P4c.04

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Speech factorization for HMM-TTS based on cluster adaptive training
Javier Latorre,  Vincent Wan,  Mark J.F. Gales,  Langzhou Chen,  K.K. Chin,  Kate Knill,  Masami Akamine
Toshiba Research Europe, UK


 

Paper Identifier: Tue.P4c.05

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Factored MLLR Adaptation Algorithm for HMM-based Expressive TTS
June Sig Sung,  Doo Hwa Hong,  Hyun Woo Koo,  Nam Soo Kim
Seoul National University, South Korea


 

Paper Identifier: Tue.P4c.06

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Speaker-adaptive visual speech synthesis in the HMM-framework
Dietmar Schabus,  Michael Pucher,  Gregor Hofer
FTW Telecommunications Research Center Vienna, Austria


 

Paper Identifier: Tue.P4c.07

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation
Viviane de Franca Oliveira,  Sayaka Shiota,  Yoshihiko Nankaku,  Keiichi Tokuda
Department of Computer Science and Engineering, Nagoya Institute of Technology, Nagoya, Japan


 

Paper Identifier: Tue.P4c.08

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

C2H: A Computational Model of H&H-based Phonetic Contrast in Synthetic Speech
Mauro Nicolao1,  Javier Latorre2,  Roger K. Moore1
1Speech and Hearing Group, Dept. Computer Science, University of Sheffield, UK, 2Toshiba Research Europe Ltd., Cambridge Research Laboratory, UK


 

Paper Identifier: Tue.P4c.09

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Vowel Creation by Articulatory Control in HMM-based Parametric Speech Synthesis
Zhen-Hua Ling1,  Korin Richmond2,  Junichi Yamagishi2
1University of Science and Technology of China, China, 2CSTR, University of Edinburgh, United Kingdom


 

Paper Identifier: Tue.P4c.10

Tuesday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Adaptation

 

 

Analysis of speaker clustering strategies for HMM-based speech synthesis
Rasmus Dall,  Christophe Veaux,  Junichi Yamagishi,  Simon King
University of Edinburgh, UK


 

Paper Identifier: Tue.P4d.01

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Word Relevance Modeling for Speech Recognition
Kuan-Yu Chen1,  Hao-Chin Chang2,  Berlin Chen2,  Hsin-Min Wang1
1Institute of Information Science, Academia Sinica, Taipei, Taiwan, 2National Taiwan Normal University, Taipei, Taiwan


 

Paper Identifier: Tue.P4d.02

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Using context-free grammars for embedded speech recognition with Weighted Finite-State Transducers
Frank Duckhorn and Rüdiger Hoffmann
Technische Universität Dresden, Germany


 

Paper Identifier: Tue.P4d.03

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Automatic transcription error recovery for Person Name Recognition
Richard Dufour1,  Géraldine Damnati1,  Delphine Charlet1,  Frédéric Béchet2
1Orange Labs, France, 2Aix Marseille Université, France


 

Paper Identifier: Tue.P4d.04

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Efficient Beam Width Control to Suppress Excessive Speech Recognition Computation Time Based on Prior Score Range Normalization
Satoshi KOBASHIKAWA1,  Takaaki HORI2,  Yoshikazu YAMAGUCHI1,  Taichi ASAMI1,  Hirokazu MASATAKI1,  Satoshi TAKAHASHI1
1NTT Cyber Space Laboratories, Japan, 2NTT Communication Science Laboratories, Japan


 

Paper Identifier: Tue.P4d.05

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Search Space Pruning Based on Anticipated Path Recombination in LVCSR
David Nolden,  Ralf Schlüter,  Hermann Ney
RWTH Aachen


 

Paper Identifier: Tue.P4d.06

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Estimating Word-Stability During Incremental Speech Recognition
Ian McGraw1 and Alex Gruenstein2
1Massachusetts Institute of Technology, USA, 2Google, USA


 

Paper Identifier: Tue.P4d.07

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Using broad phonetic classes to guide search in automatic speech recognition
Stefan Ziegler,  Bogdan Ludusan,  Guillaume Gravier
CNRS-IRISA, France


 

Paper Identifier: Tue.P4d.08

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Parallel combination of multilingual speech streams for improved ASR
João Miranda1,  João Neto2,  Alan Black3
1INESC-ID/IST, Portugal, SCS/CMU, USA, 2INESC-ID/IST, Portugal, 3SCS/CMU, USA


 

Paper Identifier: Tue.P4d.09

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Low latency combination of parallelized single-pass LVCSR systems
Fethi Bougares1,  Mickael Rouvier1,  Yannick Estève1,  Georges Linarès2
1LIUM, France, 2LIA, France


 

Paper Identifier: Tue.P4d.10

Tuesday

13:30 - 15:30  Exhibition Hall

 

Search and Decoding

 

 

Efficient On-The-Fly Hypothesis Rescoring in a Hybrid GPU/CPU-based Large Vocabulary Continuous Speech Recognition Engine
Jungsuk Kim,  Jike Chong,  Ian Lane
Carnegie Mellon University, United States


 

Paper Identifier: Tue.O5a.01

Tuesday

16:00 - 16:20  Grand Ballroom I

 

Dynamic Decoding

 

 

Discriminatively learning factorized finite state pronunciation models from dynamic Bayesian networks
Preethi Jyothi1,  Eric Fosler-Lussier1,  Karen Livescu2
1The Ohio State University, USA, 2Toyota Technological Institute at Chicago, USA


 

Paper Identifier: Tue.O5a.02

Tuesday

16:20 - 16:40  Grand Ballroom I

 

Dynamic Decoding

 

 

Joint Decoding for Speech Recognition and Semantic Tagging
Anoop Deoras,  Ruhi Sarikaya,  Gokhan Tur,  Dilek Hakkani-Tur
Speech Labs, Microsoft Corporation, USA


 

Paper Identifier: Tue.O5a.03

Tuesday

16:40 - 17:00  Grand Ballroom I

 

Dynamic Decoding

 

 

Investigation of Maximum Entropy Hybrid Language Models for Open Vocabulary German and Polish LVCSR
M. Ali Basha Shaik,  Amr El-Desoky Mousa,  Ralf Schlüter,  Hermann Ney
RWTH Aachen University, Germany


 

Paper Identifier: Tue.O5a.04

Tuesday

17:00 - 17:20  Grand Ballroom I

 

Dynamic Decoding

 

 

A Specialized WFST Approach for Class Models and Dynamic Vocabulary
Paul R. Dixon,  Chiori Hori,  Hideki Kashioka
NICT, Japan


 

Paper Identifier: Tue.O5a.05

Tuesday

17:20 - 17:40  Grand Ballroom I

 

Dynamic Decoding

 

 

Dynamic Grammars with Lookahead Composition for WFST-based Speech Recognition
Josef Novak,  Nobuaki Minematsu,  Keikichi Hirose
The University of Tokyo, Japan


 

Paper Identifier: Tue.O5a.06

Tuesday

17:40 - 18:00  Grand Ballroom I

 

Dynamic Decoding

 

 

Knowledge-Based Word Lattice Rescoring in a Dynamic Context
Todd Shore1,  Friedrich Faubel1,  Hartmut Helmke2,  Dietrich Klakow1
1Saarland University, Germany, 2German Aerospace Center (DLR), Germany


 

Paper Identifier: Tue.O5b.01

Tuesday

16:00 - 16:20  Grand Ballroom II

 

Speaker Recognition I

 

 

Mixture Component Clustering for Efficient Speaker Verification
Richard McClanahan1 and Phillip De Leon2
1Sandia National Laboratories, USA, 2New Mexico State University, USA


 

Paper Identifier: Tue.O5b.02

Tuesday

16:20 - 16:40  Grand Ballroom II

 

Speaker Recognition I

 

 

Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker Recognition
Taufiq Hasan and John Hansen
The University of Texas at Dallas


 

Paper Identifier: Tue.O5b.03

Tuesday

16:40 - 17:00  Grand Ballroom II

 

Speaker Recognition I

 

 

Query-by-Example using Speaker Content Graphs
William Campbell and Elliot Singer
MIT Lincoln Laboratory, USA


 

Paper Identifier: Tue.O5b.04

Tuesday

17:00 - 17:20  Grand Ballroom II

 

Speaker Recognition I

 

 

Unsupervised NAP Training Data Design for Speaker Recognition
Hanwu Sun and Bin Ma
Institute for Infocomm Research, Singapore


 

Paper Identifier: Tue.O5b.05

Tuesday

17:20 - 17:40  Grand Ballroom II

 

Speaker Recognition I

 

 

The Role of Score Calibration in Speaker Recognition
George Doddington
US


 

Paper Identifier: Tue.O5b.06

Tuesday

17:40 - 18:00  Grand Ballroom II

 

Speaker Recognition I

 

 

A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures
Takafumi HATTORI,  Kei HASHIMOTO,  Yoshihiko NANKAKU,  Keiichi TOKUDA
Nagoya Institute of Technology, JAPAN


 

Paper Identifier: Tue.O5c.01

Tuesday

16:00 - 16:20  Pavilion East

 

Development of Speech Production and Perception

 

 

Similarities in fundamental frequency in infant speech segmentation models
Ellen Marklund,  Francisco Lacerda,  Iris-Corinna Schwarz,  Ulla Sundberg
Department of Linguistics, Stockholm University, Sweden


 

Paper Identifier: Tue.O5c.02

Tuesday

16:20 - 16:40  Pavilion East

 

Development of Speech Production and Perception

 

 

Phonological complexity and vocabulary size in 30-month-old Swedish children
Ulrika Marklund,  Ulla Sundberg,  Iris-Corinna Schwarz,  Francisco Lacerda
Department of Linguistics, Stockholm University, Stockholm, Sweden


 

Paper Identifier: Tue.O5c.03

Tuesday

16:40 - 17:00  Pavilion East

 

Development of Speech Production and Perception

 

 

Auditory-visual speech to infants and adults: signals and correlations
Jeesun Kim,  Chris Davis,  Christine Kitamura
University of Western Sydney, Australia


 

Paper Identifier: Tue.O5c.04

Tuesday

17:00 - 17:20  Pavilion East

 

Development of Speech Production and Perception

 

 

Objective Child Vocal Development Measurement with Naturalistic Daylong Audio Recording
Dongxin Xu,  Jill Gilkerson,  Jeffery Richards
LENA Research Foundation, USA


 

Paper Identifier: Tue.O5c.05

Tuesday

17:20 - 17:40  Pavilion East

 

Development of Speech Production and Perception

 

 

Speech Production-Perception Relationships in Children with Speech Delay
Kyoko Nagao,  Mark Paullin,  Vilena Livinsky,  James B. Polikoff,  Linda D. Vallino,  Thierry G. Morlet,  N. Carolyn Schanen,  H. Timothy Bunnell
Nemours Biomedical Research, USA


 

Paper Identifier: Tue.O5c.06

Tuesday

17:40 - 18:00  Pavilion East

 

Development of Speech Production and Perception

 

 

Synthetic correction of deviant speech – children’s perception of phonologically modified recordings of their own speech
Sofia Strömbergsson
Dept of Speech, Music and Hearing, School of Computer Science and Communication, KTH (Royal Institute of Technology), Sweden


 

Paper Identifier: Tue.O5d.01

Tuesday

16:00 - 16:20  Pavilion West

 

HMM Synthesis I

 

 

Combining multiple high quality corpora for improving HMM-TTS
Vincent Wan1,  Javier Latorre1,  K.K. Chin2,  Langzhou Chen1,  Mark J. F. Gales1,  Heiga Zen3,  Kate Knill1,  Masami Akamine1
1Toshiba Research Europe, UK, 2Google, USA, 3Google, UK


 

Paper Identifier: Tue.O5d.02

Tuesday

16:20 - 16:40  Pavilion West

 

HMM Synthesis I

 

 

An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis
Shinnosuke Takamichi,  Tomoki Toda,  Yoshinori Shiga,  Hisashi Kawai,  Sakriani Sakti,  Satoshi Nakamura
Japan


 

Paper Identifier: Tue.O5d.03

Tuesday

16:40 - 17:00  Pavilion West

 

HMM Synthesis I

 

 

Using Bayesian Networks to find relevant context features for HMM-based speech synthesis
Heng Lu and Simon King
University of Edinburgh, UK


 

Paper Identifier: Tue.O5d.04

Tuesday

17:00 - 17:20  Pavilion West

 

HMM Synthesis I

 

 

Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis
Xiang Yin,  Zhen-Hua Ling,  Ming Lei,  Li-Rong Dai
University of Science and Technology of China, P.R.China


 

Paper Identifier: Tue.O5d.05

Tuesday

17:20 - 17:40  Pavilion West

 

HMM Synthesis I

 

 

A speech parameter generation algorithm using local variance for HMM-based speech synthesis
Vataya Chunwijitra,  Takashi Nose,  Takao Kobayashi
Tokyo Institute of Technology, Japan


 

Paper Identifier: Tue.O5d.06

Tuesday

17:40 - 18:00  Pavilion West

 

HMM Synthesis I

 

 

Histogram-based spectral equalization for HMM-based speech synthesis using mel-LSP
Yamato Ohtani,  Masatsune Tamura,  Masahiro Morita,  Takehiko Kagoshima,  Masami Akamine
Toshiba Corporation, Japan


 

Paper Identifier: Tue.SS5.01

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Audio and Contact Microphones for Cough Detection
Thomas Drugman1,  Jerome Urbain1,  Nathalie Bauwens2,  Ricardo Chessini1,  Anne-Sophie Aubriot2,  Patrick Lebecque2,  Thierry Dutoit1
1University of Mons, Belgium, 2University of Louvain, Belgium


 

Paper Identifier: Tue.SS5.02

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Analyzing and Interpreting Automatically Learned Rules Across Dialects
Nancy Chen1,  Wade Shen2,  Joseph Campbell2
1Institute for Infocomm Research, Singapore; MIT, USA, 2MIT/Lincoln Laboratory, USA


 

Paper Identifier: Tue.SS5.03

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

The Effect of Use of Drugs on Speaker’s Fundamental Frequency and Formants
Andrey Raev1,  Yuri Matveev1,  Tatiana Goloshchapova2
1Speech Technology Center, Russia, 2Federal Drug Control Service of the Russian Federation, Russia


 

Paper Identifier: Tue.SS5.04

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

On the assessment of audiovisual cues to speaker confidence by preteens with typical development (TD) and a-typical development (AD)
Marc Swerts and Cees de Bie
Tilburg University, The Netherlands


 

Paper Identifier: Tue.SS5.05

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Interplay between verbal response latency and physiology of children with autism during ECA interactions
Theodora Chaspari,  Chi-Chun Lee,  Shrikanth Narayanan
Signal Analysis and Interpretation Laboratory, University of Southern California, USA


 

Paper Identifier: Tue.SS5.06

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Combination of Multiple Speech Dimensions for Automatic Assessment of Dysarthric Speech Intelligibility
Myung Jong Kim and Hoirin Kim
Korea Advanced Institute of Science and Technology (KAIST), Republic of Korea


 

Paper Identifier: Tue.SS5.07

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Whole-Word Recognition from Articulatory Movements for Silent Speech Interfaces
Jun Wang1,  Ashok Samal1,  Jordan R. Green1,  Frank Rudzicz2
1University of Nebraska-Lincoln, United States, 2University of Toronto, Canada


 

Paper Identifier: Tue.SS5.08

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Verifying Session Level Pronunciation Accuracy in a Speech Therapy Application
Shou-Chun Yin1,  Richard Rose1,  Yun Tang2
1McGill University, Montreal, Canada, 2Nuance Communications Inc., Montreal, Canada


 

Paper Identifier: Tue.SS5.09

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Duration of ambulatory monitoring needed to accurately estimate voice use
Daryush Mehta1,  Rebecca Woodbury Listfield1,  Harold Cheyne II2,  James Heaton1,  Shengran Feng1,  Matías Zañartu3,  Robert Hillman1
1Ctr. for Laryngeal Surgery & Voice Rehab., Mass. General Hospital, Boston, Massachusetts, USA, 2Lab of Ornithology, Cornell University, Ithaca, New York, USA, 3Dept. of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile


 

Paper Identifier: Tue.SS5.10

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Evaluating NLP Features for Automatic Prediction of Language Impairment Using Child Speech Transcripts
Khairun-nisa Hassanali1,  Yang Liu1,  Thamar Solorio2
1The University of Texas at Dallas, USA, 2University of Alabama at Birmingham, USA


 

Paper Identifier: Tue.SS5.11

Tuesday

16:00 - 18:00  Galleria

 

Analysis of Spoken Disorders in Health Applications - Part 2

 

 

Quantitative Analysis of Pitch in Speech of Children with Neurodevelopmental Disorders
Geza Kiss,  Jan P.H. van Santen,  Emily Tucker Prud'hommeaux,  Lois M. Black
CSLU, OHSU, Oregon, USA


 

Paper Identifier: Tue.P5a.01

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

Improving Recognition of Speaker States and Traits by Cumulative Evidence: Intoxication, Sleepiness, Age and Gender
Felix Weninger,  Erik Marchi,  Björn Schuller
Institute for Human-Machine Communication, Technische Universität München, Germany


 

Paper Identifier: Tue.P5a.02

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

Speaker Clustering in Emotion Recognition
Ni Ding1 and Julien Epps2
1The School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney NSW 2052, Australia, 2ATP Laboratory, National ICT Australia, Sydney NSW 2015, Australia


 

Paper Identifier: Tue.P5a.03

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

Automatic detection of conflict escalation in spoken conversations
Samuel Kim,  Sree Yella,  Fabio Valente
IDIAP Research Institute, Switzerland


 

Paper Identifier: Tue.P5a.04

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

The entropy of intoxicated speech - lexical creativity and heavy tongues
Uwe Reichel
University of Munich, Germany


 

Paper Identifier: Tue.P5a.05

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

A Robust Unsupervised Arousal Rating Framework using Prosody with Cross-Corpora Evaluation
Daniel Bone,  Chi-Chun Lee,  Shrikanth S. Narayanan
SAIL, University of Southern California, United States


 

Paper Identifier: Tue.P5a.06

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

Unveiling the Acoustic Properties that Describe the Valence Dimension
Carlos Busso and Tauhidur Rahman
The University of Texas at Dallas, USA


 

Paper Identifier: Tue.P5a.07

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

Annotation and Recognition of Personality Traits in Spoken Conversations from the AMI Meetings Corpus
Fabio Valente,  Samuel Kim,  Petr Motlicek
Idiap Research Institute, Switzerland


 

Paper Identifier: Tue.P5a.08

Tuesday

16:00 - 18:00  Exhibition Hall

 

Paralinguistics II

 

 

The Effects of Lexical Tones and Nasal Coda /-n/ to Sadness in Taiwan Hakka
Shao-ren Lyu
National Chiao Tung University, Taiwan


 

Paper Identifier: Tue.P5b.01

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Comparing different acoustic modeling techniques for multilingual boosting
David Imseng,  John Dines,  Petr Motlicek,  Philip N. Garner,  Hervé Bourlard
Idiap Research Institute, Switzerland


 

Paper Identifier: Tue.P5b.02

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Model-based approaches to adaptive training in reverberant environments
Yongqiang Wang and Mark Gales
Department of Engineering, Cambridge University


 

Paper Identifier: Tue.P5b.03

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Model-Based Approaches for Degraded Channel Modelling in Robust ASR
Mark Gales and Federico Flego
Cambridge University, UK


 

Paper Identifier: Tue.P5b.04

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Improved Model Selection for the ASR-Driven Binary Mask
William Hartmann and Eric Fosler-Lussier
The Ohio State University, USA


 

Paper Identifier: Tue.P5b.05

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Accelerated Batch Learning of Convex Log-linear Models for LVCSR
Simon Wiesler,  Ralf Schlüter,  Hermann Ney
RWTH Aachen


 

Paper Identifier: Tue.P5b.06

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Improving Discriminative Training for Robust Acoustic Models in Large Vocabulary Continuous Speech Recognition
Janne Pylkkönen and Mikko Kurimo
Aalto University, Finland


 

Paper Identifier: Tue.P5b.07

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Semi-Supervised Methods for Improving Keyword Search of Unseen Terms
Scott Novotney1,  Ivan Bulyko2,  Rich Schwartz2,  Sanjeev Khudanpur1,  Owen Kimball2
1JHU, USA, 2BBN, USA


 

Paper Identifier: Tue.P5b.08

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition
Xiangang Li,  Dan Su,  Zaihu Pang,  Xihong Wu
Peking University, China


 

Paper Identifier: Tue.P5b.09

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

Classification of Stressed Speech Using Physical Parameters Derived from Two-Mass Model
Yao Xiao1,  Jitsuhiro Takatoshi2,  Miyajima Chiyomi1,  Kitaoka Norihide1,  Takeda Kazuya1
1Nagoya University, 2Nagoya University/Aichi University of Technology


 

Paper Identifier: Tue.P5b.10

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Modeling

 

 

IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition
Jun Du and Qiang Huo
Microsoft Research Asia


 

Paper Identifier: Tue.P5c.01

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth?
Niko Moritz,  Jörn Anemüller,  Birger Kollmeier
Germany


 

Paper Identifier: Tue.P5c.02

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Effect of speech priors in single-channel speech-music separation for ASR
Cemil Demir1,  Ali Taylan Cemgil2,  Murat Saraçlar3
1Tübitak-Bilgem, 2Computer Engineering, Boğaziçi University, 3Electrical Engineering, Boğaziçi University


 

Paper Identifier: Tue.P5c.03

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

On the Role of Binary Mask Pattern in Automatic Speech Recognition
Arun Narayanan and DeLiang Wang
The Ohio State University, United States


 

Paper Identifier: Tue.P5c.04

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Dereverberation based on Wavelet Packet Filtering for Robust Automatic Speech Recognition
Tatsuya Kawahara and Randy Gomez
Kyoto University, Japan


 

Paper Identifier: Tue.P5c.05

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Spectral Intersections for Non-Stationary Signal Separation
Trausti Kristjansson1 and Thad Hughes2
1University of Reykjavik, Iceland, 2Google Inc.


 

Paper Identifier: Tue.P5c.06

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant Environment
Kyohei Odani,  Longbiao Wang,  Atsuhiko Kai
Shizuoka University, Japan


 

Paper Identifier: Tue.P5c.07

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Q-Gaussian based spectral subtraction for robust speech recognition
Hilman Ferdinandus Pardede1,  Koichi Shinoda1,  Koji Iwano2
1Tokyo Institute of Technology, Japan, 2Tokyo City University, Japan


 

Paper Identifier: Tue.P5c.08

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition
Bernd T. Meyer1,  Constantin Spille1,  Birger Kollmeier1,  Nelson Morgan2
1Medical Physics, University Oldenburg, 2International Computer Science Institute, Berkeley, CA, USA


 

Paper Identifier: Tue.P5c.09

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

FEATURE EXTRACTION BASED ON HEARING SYSTEM SIGNAL PROCESSING FOR ROBUST LARGE VOCABULARY SPEECH RECOGNITION
Peter Li and Xie Sun
Li Creative Technologies, Inc., USA


 

Paper Identifier: Tue.P5c.10

Tuesday

16:00 - 18:00  Exhibition Hall

 

ASR: Robust Features I

 

 

Automatic estimation of the first two subglottal resonances in children's speech with application to speaker normalization in limited-data conditions
Harish Arsikere1,  Gary Leung1,  Steven Lulich2,  Abeer Alwan1
1UCLA, USA, 2Washington University, Saint Louis, USA


 

Paper Identifier: Tue.P5d.01

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Real-time Visualization of English Pronunciation on an IPA Chart Based on Articulatory Feature Extraction
Yurie Iribe1,  Takurou Mori1,  Kouichi Katsurada1,  Goh Kawai2,  Tsuneo Nitta1
1Toyohashi University of Technology, Japan, 2Hokkaido University, Japan


 

Paper Identifier: Tue.P5d.02

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Acoustic Feature-based Non-scorable Response Detection for an Automated Speaking Proficiency Assessment
Je Hun Jeon1 and Su-Youn Yoon2
1University of Texas at Dallas, 2Educational Testing Service


 

Paper Identifier: Tue.P5d.03

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Pronunciation quality evaluation of sentences by combining word based scores
Jorge Wuth,  Néstor Becerra Yoma,  Leopoldo Benavides,  Hiram Vivanco
Universidad de Chile, Chile


 

Paper Identifier: Tue.P5d.04

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Designing a spoken language interface for a tutorial dialogue system
Peter Bell,  Myroslava Dzikovska,  Amy Isard
University of Edinburgh, UK


 

Paper Identifier: Tue.P5d.05

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Automatic Pronunciation Error Detection Based on Extended Pronunciation Space Using the Unsupervised Clustering of Pronunciation Errors
Long Zhang and Haifeng Li
School of Computer Science and Technology, Harbin Institute of Technology, China


 

Paper Identifier: Tue.P5d.06

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Less errors with TTS? A dictation experiment with foreign language learners
Thomas Pellegrini1,  Ângela Costa2,  Isabel Trancoso3
1INESC-ID, Portugal, 2INESC-ID, UNL, Portugal, 3INESC-ID, IST, Portugal


 

Paper Identifier: Tue.P5d.07

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Improvement in Automatic Pronunciation Scoring using Additional Basic Scores and Learning to Rank
Liang-Yu Chen1 and Jyh-Shing Roger Jang2
1Institute of Information Systems and Applications, National Tsing Hua University, Taiwan, 2Department of Computer Science, National Tsing Hua University, Taiwan


 

Paper Identifier: Tue.P5d.08

Tuesday

16:00 - 18:00  Exhibition Hall

 

Computer Assisted Language Learning II

 

 

Automatic Tone Assessment of Non-Native Mandarin Speakers
Jian Cheng
Pearson, USA


 

Paper Identifier: Wed.O6a.01

Wednesday

10:00 - 10:20  Grand Ballroom I

 

ASR: Robust Features II

 

 

Robust phoneme recognition based on biomimetic speech contours
Michael Carlin,  Kailash Patil,  Sridhar Krishna Nemala,  Mounya Elhilali
Johns Hopkins University, USA


 

Paper Identifier: Wed.O6a.02

Wednesday

10:20 - 10:40  Grand Ballroom I

 

ASR: Robust Features II

 

 

A Feature Space Transformation Method for Personalization using Generalized I-Vector Clustering
Kaisheng Yao,  Yifan Gong,  Chaojun Liu
Microsoft, USA


 

Paper Identifier: Wed.O6a.03

Wednesday

10:40 - 11:00  Grand Ballroom I

 

ASR: Robust Features II

 

 

Longer Features: They do a speech detector good
TJ Tsai and Nelson Morgan
Berkeley/ICSI, USA


 

Paper Identifier: Wed.O6a.04

Wednesday

11:00 - 11:20  Grand Ballroom I

 

ASR: Robust Features II

 

 

Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum
Md Jahangir Alam1,  Patrick Kenny2,  Douglas O'Shaughnessy1
1INRS-EMT, Canada, 2CRIM, Canada


 

Paper Identifier: Wed.O6a.05

Wednesday

11:20 - 11:40  Grand Ballroom I

 

ASR: Robust Features II

 

 

Enhancing Vocal Tract Length Normalization with Elastic Registration for Automatic Speech Recognition
Florian Müller and Alfred Mertins
Institute for Signal Processing, University of Lübeck, Germany


 

Paper Identifier: Wed.O6a.06

Wednesday

11:40 - 12:00  Grand Ballroom I

 

ASR: Robust Features II

 

 

Beamforming using uniform circular arrays for distant speech recognition in reverberant environments and double talk scenarios
Hannes Pessentheiner1,  Stefan Petrik2,  Harald Romsdorfer2
1TU Graz, Austria, 2SYNVO, Austria


 

Paper Identifier: Wed.O6b.01

Wednesday

10:00 - 10:20  Grand Ballroom II

 

ASR: Rich Transcription

 

 

Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs
Ales Prazak,  Zdenek Loose,  Jan Trmal,  Josef V. Psutka,  Josef Psutka
University of West Bohemia, Plzen, Czech Republic


 

Paper Identifier: Wed.O6b.02

Wednesday

10:20 - 10:40  Grand Ballroom II

 

ASR: Rich Transcription

 

 

Development and Evaluation of Automatic Punctuation for French and English Speech-to-Text
Jachym Kolar and Lori Lamel
LIMSI-CNRS, France


 

Paper Identifier: Wed.O6b.03

Wednesday

10:40 - 11:00  Grand Ballroom II

 

ASR: Rich Transcription

 

 

Spoken Document Clustering Using Word Confusion Networks
Shajith Ikbal,  Sachindra Joshi,  Ashish Verma,  Om D Deshmukh
IBM Research, India


 

Paper Identifier: Wed.O6b.04

Wednesday

11:00 - 11:20  Grand Ballroom II

 

ASR: Rich Transcription

 

 

Dynamic Conditional Random Fields for Joint Sentence Boundary and Punctuation Prediction
Xuancong Wang1,  Hwee Tou Ng1,  Khe Chai Sim2
1National University of Singapore, Graduate School for Integrative Science and Engineering, Department of Computer Science, Singapore, 2National University of Singapore, Department of Computer Science, Singapore


 

Paper Identifier: Wed.O6b.05

Wednesday

11:20 - 11:40  Grand Ballroom II

 

ASR: Rich Transcription

 

 

Analysis of the Characteristics of Talk-show TV Programs
Fabio Brugnara,  Daniele Falavigna,  Diego Giuliani,  Roberto Gretter
Fondazione Bruno Kessler


 

Paper Identifier: Wed.O6b.06

Wednesday

11:40 - 12:00  Grand Ballroom II

 

ASR: Rich Transcription

 

 

Rethinking The Corpus: Moving towards Dynamic Linguistic Resources
Andrew Rosenberg
Queens College / CUNY, USA


 

Paper Identifier: Wed.O6c.01

Wednesday

10:00 - 10:20  Pavilion East

 

Phonetics and Phonology

 

 

Effects of stress and speech rate on vowel quality in Catalan and Spanish
Marianna Nadeu
University of Illinois at Urbana-Champaign, USA


 

Paper Identifier: Wed.O6c.02

Wednesday

10:20 - 10:40  Pavilion East

 

Phonetics and Phonology

 

 

Predictability affects vowel dispersion and dynamics in the Buckeye Corpus
Michael McAuliffe and Molly Babel
University of British Columbia, Canada


 

Paper Identifier: Wed.O6c.03

Wednesday

10:40 - 11:00  Pavilion East

 

Phonetics and Phonology

 

 

Dialectal and generational variations in vowels in spontaneous speech
Robert Allen Fox and Ewa Jacewicz
Department of Speech and Hearing Science, The Ohio State University, Columbus, OH, USA


 

Paper Identifier: Mon.P1a.08 (Replaced withdrawn paper Wed.O6c.04)

Wednesday

11:00 - 11:20  Pavilion East

 

Phonetics and Phonology

 

 

Assessing agreement level between forced alignment models with data from endangered language documentation corpora
Christian DiCanio1,  Hosung Nam1,  Douglas H. Whalen1,  H. Timothy Bunnell2,  Jonathan D. Amith3,  Rey Castillo García4
1Haskins Laboratories, USA, 2University of Delaware, USA, 3Gettysburg College, USA, 4CIESAS, Mexico


 

Paper Identifier: Wed.O6c.05

Wednesday

11:20 - 11:40  Pavilion East

 

Phonetics and Phonology

 

 

Acoustic Cues of Vowel Quality to Coda Nasal Perception in Southern Min
Ying Chen,  Vsevolod Kapatsinski,  Susan Guion-Anderson
Department of Linguistics, University of Oregon, USA


 

Paper Identifier: Wed.O6c.06

Wednesday

11:40 - 12:00  Pavilion East

 

Phonetics and Phonology

 

 

Lenition of /d/ in spontaneous Spanish and Catalan
Miguel Simonet1,  José Ignacio Hualde2,  Marianna Nadeu3
1University of Arizona, 2University of Illinois, 3University of Illinois


 

Paper Identifier: Wed.O6d.01

Wednesday

10:00 - 10:20  Pavilion West

 

HMM Synthesis II

 

 

Wideband Parametric Speech Synthesis Using Warped Linear Prediction
Tuomo Raitio1,  Antti Suni2,  Martti Vainio2,  Paavo Alku1
1Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland, 2Department of Behavioural Sciences, University of Helsinki, Helsinki, Finland


 

Paper Identifier: Wed.O6d.02

Wednesday

10:20 - 10:40  Pavilion West

 

HMM Synthesis II

 

 

Modeling the Creaky Excitation for Parametric Speech Synthesis
Thomas Drugman1,  John Kane2,  Christer Gobl2
1University of Mons, Belgium, 2Trinity College Dublin, Ireland


 

Paper Identifier: Wed.O6d.03

Wednesday

10:40 - 11:00  Pavilion West

 

HMM Synthesis II

 

 

Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis
Wen Zhengqi and Tao Jianhua
Institute of Automation, Chinese Academy of Sciences, Beijing, China


 

Paper Identifier: Wed.O6d.04

Wednesday

11:00 - 11:20  Pavilion West

 

HMM Synthesis II

 

 

Speech synthesis using a non-maximally decimated filter bank for embedded systems
Nobuyuki Nishizawa and Tsuneo Kato
KDDI R&D Laboratories, Inc., Japan


 

Paper Identifier: Wed.O6d.05

Wednesday

11:20 - 11:40  Pavilion West

 

HMM Synthesis II

 

 

Ways to Implement Global Variance in Statistical Speech Synthesis
Hanna Silen,  Elina Helander,  Jani Nurminen,  Moncef Gabbouj
Tampere University of Technology, Finland


 

Paper Identifier: Wed.O6d.06

Wednesday

11:40 - 12:00  Pavilion West

 

HMM Synthesis II

 

 

HMM-based speech synthesis using sub-band basis spectrum model
Yamato Ohtani,  Masatsune Tamura,  Masahiro Morita,  Takehiko Kagoshima,  Masami Akamine
Toshiba Corporation, Japan


 

Paper Identifier: Wed.SS6.01

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Resonator-based creaky voice detection
Thomas Drugman1,  John Kane2,  Christer Gobl2
1University of Mons, Belgium, 2Trinity College Dublin, Ireland


 

Paper Identifier: Wed.SS6.02

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Effect of Tongue Tip Trilling on the Glottal Excitation Source
V. K. Mittal,  N. Dhananjaya,  B. Yegnanarayana
International Institute of Information Technology, Hyderabad, India


 

Paper Identifier: Wed.SS6.03

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Estimating the voice source in noise
Gang Chen1,  Yen-Liang Shue2,  Jody Kreiman1,  Abeer Alwan1
1University of California, Los Angeles, USA, 2Dolby Australia, Australia


 

Paper Identifier: Wed.SS6.04

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Voice source analysis using biomechanical modeling and glottal inverse filtering
Alan Pinheiro1,  Tuomo Raitio2,  Danyane Gomes3,  Paavo Alku2
1Federal University of São João del Rei, Brazil, 2Aalto University, Finland, 3University Center of Patos de Minas, Brazil


 

Paper Identifier: Wed.SS6.05

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Speech modeling and processing by low-dimensional dynamic glottal models
Carlo Drioli1 and Andrea Calanca2
1University of Udine, Italy, 2University of Verona, Italy


 

Paper Identifier: Wed.SS6.06

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear prediction
Paavo Alku1,  Jouni Pohjalainen1,  Martti Vainio2,  Anne-Maria Laukkanen3,  Brad Story4
1Aalto University, Finland, 2University of Helsinki, Finland, 3University of Tampere, Finland, 4University of Arizona, USA


 

Paper Identifier: Wed.SS6.07

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Automatic Topology Generation of Glottal Source HMM
Akira Sasou
National Institute of Advanced Industrial Science and Technology (AIST)


 

Paper Identifier: Wed.SS6.08

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Towards Glottal Source Controllability in Expressive Speech Synthesis
Jaime Lorenzo-Trueba1,  Roberto Barra-Chicote1,  Tuomo Raitio2,  Nicolas Obin3,  Paavo Alku2,  Junichi Yamagishi4,  Juan M Montero5
1Speech Technology Group, ETSI Telecomunicacion, Universidad Politecnica de Madrid, 2Department of Signal Processing and Acoustics, Aalto University, Finland, 3Sound Analysis and Synthesis, IRCAM, Paris, France, 4CSTR, University of Edinburgh, United Kingdom, 5Speech Technology Group, ETSI Telecomunicacion, Universidad Politecnica de Madrid, Spain


 

Paper Identifier: Wed.SS6.09

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Combining temporal and cepstral features for the automatic perceptual categorization of disordered connected speech
Ali Alpan1,  Jean Schoentgen2,  Francis Grenez1
1Université Libre de Bruxelles, Belgium, 2National Fund for Scientific Research, Belgium


 

Paper Identifier: Wed.SS6.10

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

A Preliminary Study on Cross-Databases Emotion Recognition using the Glottal Features in Speech
Rui Sun and Elliot Moore II
Georgia Institute of Technology, US


 

Paper Identifier: Wed.SS6.11

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Analysis on the Importance of Short-Term Speech Parameterizations for Emotional Statistical Parametric Speech Synthesis
Ranniery Maia
Toshiba Research Europe Limited


 

Paper Identifier: Wed.SS6.12

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Analysis of vocal tremor and jitter by empirical mode decomposition of glottal cycle length time series
Christophe Mertens1,  Francis Grenez1,  Jean Schoentgen2
1Université Libre de Bruxelles, Belgium, 2National Fund of Scientific Research, Belgium


 

Paper Identifier: Wed.SS6.13

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Utilizing Markov Chain Monte Carlo (MCMC) Method for Improved Glottal Inverse Filtering
Harri Auvinen1,  Tuomo Raitio2,  Samuli Siltanen1,  Paavo Alku2
1Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland, 2Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland


 

Paper Identifier: Wed.SS6.14

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Glottal source shape parameter estimation using phase minimization variants
Stefan Huber1,  Axel Roebel2,  Gilles Degottex3
1Germany, 2France, 3Greece


 

Paper Identifier: Wed.SS6.15

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Glottal Waveform Analysis of Physical Task Stress Speech
Keith W. Godin,  Taufiq Hasan,  John H. L. Hansen
Univ. of Texas at Dallas, USA


 

Paper Identifier: Wed.SS6.16

Wednesday

10:00 - 12:00  Galleria

 

Glottal Source Processing: from Analysis to Applications

 

 

Speaker Discrimination Ability of Glottal Waveform Features
Juan Torres1 and Elliot Moore2
1Polytechnic University, USA, 2Georgia Institute of Technology, USA


 

Paper Identifier: Wed.P6a.01

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Average Spectrotemporal Structure of Continuous Speech Matches with the Frequency Resolution of Human Hearing
Okko Räsänen
Department of Signal Processing and Acoustics, Aalto University, Finland


 

Paper Identifier: Wed.P6a.02

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Perceptual Importance of the Phase Related Information in Speech
Ibon Saratxaga1,  Inma Hernaez1,  Michael Pucher2,  Eva Navas1,  Iñaki Sainz1
1University of the Basque Country, Spain, 2Telecommunication Research Center Vienna, Austria


 

Paper Identifier: Wed.P6a.03

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Improving the Entropy Estimate of Neuronal Firings of Modeled Cochlear Nucleus Neurons
Andrea Grigorescu,  Marek Rudnicki,  Michael Isik,  Werner Hemmert,  Stefano Rini
TUM


 

Paper Identifier: Wed.P6a.04

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Perception of Synthetic Speech in Adult Users of Cochlear Implants
Kyoko Nagao1,  Mark Paullin1,  James B. Polikoff1,  Jason Lilley2,  H. Timothy Bunnell1
1Nemours Biomedical Research, USA, 2University of Delaware, USA


 

Paper Identifier: Wed.P6a.05

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Hearing Loss and the Use of Acoustic Cues in Phonetic Categorisation of Fricatives
Odette Scharenborg1 and Esther Janse2
1Max Planck Institute for Psycholinguistics, the Netherlands, 2Centre for Language Studies, Radboud University Nijmegen, the Netherlands


 

Paper Identifier: Wed.P6a.06

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Intelligibility of speech spoken in noise/reverberation for older adults in reverberant environments
Nao Hodoshima1,  Takayuki Arai2,  Kiyohiro Kurisu3
1Department of Information Media Technology, Tokai University, Tokyo, Japan, 2Department of Information and Communication Sciences, Sophia University, Tokyo, Japan, 3TOA Corporation, Hyogo, Japan


 

Paper Identifier: Wed.P6a.07

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Improved Speech Intelligibility with a Chimaera Hearing Aid Algorithm
Andrew Hines and Naomi Harte
Trinity College Dublin


 

Paper Identifier: Wed.P6a.08

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Unsupervised Acoustic Analyses of Normal and Lombard Speech, with Spectral Envelope Transformation to Improve Intelligibility
Elizabeth Godoy and Yannis Stylianou
FORTH-ICS, Greece


 

Paper Identifier: Wed.P6a.09

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

The effect of dichotic processing on the perception of binaural cues
Akiko Amano-Kusumoto,  Justin M. Aronoff,  Motokuni Itoh,  Sigfrid D. Soli
House Research Institute


 

Paper Identifier: Wed.P6a.10

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

Speech and speaker separation in human auditory cortex
Nima Mesgarani and Edward Chang
UCSF


 

Paper Identifier: Wed.P6a.11

Wednesday

10:00 - 12:00  Exhibition Hall

 

Hearing

 

 

On the effect of the acoustic environment on the accuracy of perception of speaker orientation from auditory cues alone
Jens Edlund1,  Mattias Heldner2,  Joakim Gustafson1
1KTH Speech, Music and Hearing, Sweden, 2Linguistics, Stockholm University, Sweden


 

Paper Identifier: Wed.P6b.01

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

SIBILANT SPEECH DETECTION IN NOISE
Sira Gonzalez and Mike Brookes
Imperial College London, UK


 

Paper Identifier: Wed.P6b.02

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

Voice Activity Detection Using Speech Recognizer Feedback
Kit Thambiratnam,  Weiwu Zhu,  Frank Seide
Beijing, China


 

Paper Identifier: Wed.P6b.03

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

Descriptive Vocabulary Development for Degraded Speech
Dushyant Sharma,  Gaston Hilkhuysen,  Patrick A. Naylor,  Nikolay D. Gaubitch,  Mark Huckvale,  Mike Brookes
CLEAR


 

Paper Identifier: Wed.P6b.04

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

Overlapped Speech Detection in Meeting Using Cross-Channel Spectral Subtraction and Spectrum Similarity
Ryo Yokoyama1,  Yu Nasu1,  Koichi Shinoda1,  Koji Iwano2
1Tokyo Institute of Technology, Japan, 2Tokyo City University, Japan


 

Paper Identifier: Wed.P6b.05

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

Speech restoration based on deep learning autoencoder with layer-wised pretraining
Xugang Lu,  Shigeki Matsuda,  Chiori Hori,  Hideki Kashioka
NICT, Japan


 

Paper Identifier: Wed.P6b.06

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

DETECTION AND POSITIONING OF OVERLAPPED SOUNDS IN A ROOM ENVIRONMENT
Rupayan Chakraborty,  Climent Nadeu,  Taras Butko
UPC, Barcelona, Spain


 

Paper Identifier: Wed.P6b.07

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

Foreground Speech Segmentation using Zero Frequency Filtered Signal
Deepak K. T.,  Biswajit Dev Sarma,  S. R. Mahadeva Prasanna
Indian Institute of Technology Guwahati, India


 

Paper Identifier: Wed.P6b.08

Wednesday

10:00 - 12:00  Exhibition Hall

 

Degraded Speech and Enhancement

 

 

The Effect of Spectral Estimator on Common Spectral Measures for Sibilant Fricatives
Patrick Reidy and Mary Beckman
Ohio State University, USA


 

Paper Identifier: Wed.P6c.01

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Gaussian Mixture Gain Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation
Emad M. Grais and Hakan Erdogan
Sabanci University, Turkey


 

Paper Identifier: Wed.P6c.02

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Speaker Independent Single Channel Source Separation using Sinusoidal Features
Shivesh Ranjan1,  Karen Payton1,  Pejman Mowlaee2
1Electrical & Computer Engineering Dept., University of Massachusetts Dartmouth, North Dartmouth, MA 02747, USA, 2Institute of Communication Acoustics, Ruhr-Universität Bochum, Bochum, Germany


 

Paper Identifier: Wed.P6c.03

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Boosting Classification Based Speech Separation Using Temporal Dynamics
Yuxuan Wang and DeLiang Wang
The Ohio State University, Dept. of CSE


 

Paper Identifier: Wed.P6c.04

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Acoustic Features for Classification Based Speech Separation
Yuxuan Wang,  Kun Han,  DeLiang Wang
The Ohio State University, Dept. of CSE


 

Paper Identifier: Wed.P6c.05

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Hidden Markov Models as Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation
Emad M. Grais and Hakan Erdogan
Sabanci University, Turkey


 

Paper Identifier: Wed.P6c.06

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Unconstrained Speech Separation by Composition of Longest Segments
Ming Ji,  Ramji Srinivasan,  Danny Crookes
Queen's University Belfast, UK


 

Paper Identifier: Wed.P6c.07

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Modulation domain blind source separation for noisy speech mixture
Yi Zhang and Yunxin Zhao
Univ. of Missouri, US


 

Paper Identifier: Wed.P6c.08

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Phase estimation for signal reconstruction in single-channel source separation
Pejman Mowlaee1,  Rahim Saeidi2,  Rainer Martin1
1Ruhr-Universität Bochum, Germany, 2Centre for Language and Speech Technology, Radboud University Nijmegen


 

Paper Identifier: Wed.P6c.09

Wednesday

10:00 - 12:00  Exhibition Hall

 

Source Separation and Computational Auditory Scene Analysis

 

 

Bayesian Group Sparse Learning for Nonnegative Matrix Factorization
Jen-Tzung Chien and Hsin-Lung Hsieh
National Cheng Kung University, Taiwan


 

Paper Identifier: Wed.P6d.01

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Residual Phase Cepstrum Coefficients with Application to Cross-lingual Speaker Verification
Michael Johnson and Jianglin Wang
Marquette University, United States


 

Paper Identifier: Wed.P6d.02

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Speaker Verification Using Neighborhood Preserving Embedding
Chunyan Liang,  Jinchao Yang,  Lin Yang,  Yonghong Yan
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, China


 

Paper Identifier: Wed.P6d.03

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Discriminative Decision Function Based Scoring Method in Joint Factor Analysis for Speaker Verification
Chunyan Liang,  Xiang Zhang,  Lin Yang,  Yonghong Yan
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences


 

Paper Identifier: Wed.P6d.04

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor Analysis
Taufiq Hasan and John Hansen
The University of Texas at Dallas


 

Paper Identifier: Wed.P6d.05

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Factor Analysis and Nuisance Attribute Projection Revisited
Lukas Machlica and Zbynek Zajic
Faculty of Applied Sciences, University of West Bohemia, Czech Republic


 

Paper Identifier: Wed.P6d.06

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Compensation of Intrinsic Variability with Factor Analysis Modeling for Robust Speaker Verification
Sheng Chen and Mingxing Xu
Tsinghua University, Beijing, China


 

Paper Identifier: Wed.P6d.07

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases
Anthony Larcher,  Kong Aik Lee,  Bin Ma,  Haizhou Li
Institute for Infocomm Research, A*STAR, Singapore


 

Paper Identifier: Wed.P6d.08

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Speaker idiosyncratic rhythmic features in the speech signal
Volker Dellwo,  Adrian Leemann,  Marie-José Kolly
University of Zurich, Switzerland


 

Paper Identifier: Wed.P6d.09

Wednesday

10:00 - 12:00  Exhibition Hall

 

Speaker Recognition II

 

 

Bilinear Factor Analysis for iVector Based Speaker Verification
Yun Lei,  Lukas Burget,  Nicolas Scheffer
SRI International


 

Paper Identifier: Wed.O7a.01

Wednesday

13:30 - 13:50  Grand Ballroom I

 

Language Modeling: New Models and Features

 

 

Paraphrastic Language Models
Xunying Liu,  Mark Gales,  Phil Woodland
Cambridge University


 

Paper Identifier: Wed.O7a.02

Wednesday

13:50 - 14:10  Grand Ballroom I

 

Language Modeling: New Models and Features

 

 

Efficient Structured Language Modeling for Speech Recognition
Ariya Rastrow,  Mark Dredze,  Sanjeev Khudanpur
Human Language Technology Center of Excellence, and Center for Language and Speech Processing, Johns Hopkins University


 

Paper Identifier: Wed.O7a.03

Wednesday

14:10 - 14:30  Grand Ballroom I

 

Language Modeling: New Models and Features

 

 

Towards Recurrent Neural Networks Language Models with Linguistic and Contextual Features
Yangyang Shi,  Pascal Wiggers,  Catholijn M Jonker
Delft University of Technology, Intelligent Systems, Interactive Intelligence, The Netherlands


 

Paper Identifier: Wed.O7a.04

Wednesday

14:30 - 14:50  Grand Ballroom I

 

Language Modeling: New Models and Features

 

 

Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition
Gwénolé Lecorvé and Petr Motlicek
Idiap Research Institute, Switzerland


 

Paper Identifier: Wed.O7a.05

Wednesday

14:50 - 15:10  Grand Ballroom I

 

Language Modeling: New Models and Features

 

 

Large Scale Hierarchical Neural Network Language Models
Hong-Kwang Kuo1,  Ebru Arisoy1,  Ahmad Emami1,  Paul Vozila2
1IBM T.J. Watson Research Center, U.S.A., 2Nuance Communications, U.S.A.


 

Paper Identifier: Wed.O7a.06

Wednesday

15:10 - 15:30  Grand Ballroom I

 

Language Modeling: New Models and Features

 

 

A Sparse Plus Low Rank Maximum Entropy Language Model
Brian Hutchinson,  Mari Ostendorf,  Maryam Fazel
University of Washington, USA


 

Paper Identifier: Wed.O7b.01

Wednesday

13:30 - 13:50  Grand Ballroom II

 

Speaker Verification

 

 

PLDA Modeling in I-Vector and Supervector Space for Speaker Verification
Ye Jiang1,  Kong Aik Lee2,  Zhenmin Tang1,  Bin Ma2,  Anthony Larcher2,  Haizhou Li2
1School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing, China, 2Human Language Technology Department, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore


 

Paper Identifier: Wed.O7b.02

Wednesday

13:50 - 14:10  Grand Ballroom II

 

Speaker Verification

 

 

Supervized Mixture of PLDA Models for Cross-Channel Speaker Verification
Konstantin Simonchik,  Timur Pekhovsky,  Andrey Shulipa,  Anton Afanasyev
Speech Technology Center Ltd., St. Petersburg, Russia


 

Paper Identifier: Wed.O7b.03

Wednesday

14:10 - 14:30  Grand Ballroom II

 

Speaker Verification

 

 

Spoofing countermeasures for the protection of automatic speaker recognition systems against attacks with artificial signals
Federico Alegre,  Ravichander Vipperla,  Nicholas Evans
EURECOM, France


 

Paper Identifier: Wed.O7b.04

Wednesday

14:30 - 14:50  Grand Ballroom II

 

Speaker Verification

 

 

PLDA using Gaussian Restricted Boltzmann Machines with application to Speaker Verification
Themos Stafylakis1,  Patrick Kenny2,  Mohammed Senoussaoui1,  Pierre Dumouchel1
1CRIM and ETS, Canada, 2CRIM, Canada


 

Paper Identifier: Wed.O7b.05

Wednesday

14:50 - 15:10  Grand Ballroom II

 

Speaker Verification

 

 

Mean Hilbert Envelope Coefficients (MHEC) for Robust Speaker Recognition
Seyed Omid Sadjadi,  Taufiq Hasan,  John H.L. Hansen
The University of Texas at Dallas, USA


 

Paper Identifier: Wed.O7b.06

Wednesday

15:10 - 15:30  Grand Ballroom II

 

Speaker Verification

 

 

Detecting Converted Speech and Natural Speech for anti-Spoofing Attack in Speaker Recognition
Zhizheng Wu1,  Eng Siong Chng1,  Haizhou Li2
1School of Computer Engineering, Nanyang Technological University, Singapore, 2Human Language Technology Department, Institute for Infocomm Research, Singapore


 

Paper Identifier: Wed.O7c.01

Wednesday

13:30 - 13:50  Pavilion East

 

Speech Intelligibility in Quiet and in Noise

 

 

Maximising objective speech intelligibility by local f0 modulation
Julian Villegas1 and Martin Cooke2
1University of the Basque Country, Spain, 2Ikerbasque (Basque Foundation for Science), Spain


 

Paper Identifier: Wed.O7c.02

Wednesday

13:50 - 14:10  Pavilion East

 

Speech Intelligibility in Quiet and in Noise

 

 

Effect of prosodic changes on speech intelligibility
Catherine Mayo1,  Vincent Aubanel2,  Martin Cooke3
1CSTR, University of Edinburgh, Edinburgh, UK, 2Language and Speech Laboratory, University of the Basque Country, Vitoria, Spain, 3Ikerbasque (Basque Science Foundation), Bilbao, Spain


 

Paper Identifier: Wed.O7c.03

Wednesday

14:10 - 14:30  Pavilion East

 

Speech Intelligibility in Quiet and in Noise

 

 

Effects of visual speech information on native listener judgments of L2 consonant intelligibility
Saya Kawase and Yue Wang
Simon Fraser University, Canada


 

Paper Identifier: Wed.O7c.04

Wednesday

14:30 - 14:50  Pavilion East

 

Speech Intelligibility in Quiet and in Noise

 

 

Perceptual compensation for the effects of reverberation on consonant identification: A comparison of human and machine performance
Guy Brown1,  Amy Beeston1,  Kalle Palomaki2
1University of Sheffield, UK, 2Aalto University, Finland


 

Paper Identifier: Wed.O7c.05

Wednesday

14:50 - 15:10  Pavilion East

 

Speech Intelligibility in Quiet and in Noise

 

 

The Intelligibility of Lombard Speech: Communicative setting matters
Michael Fitzpatrick,  Jeesun Kim,  Chris Davis
MARCS Institute


 

Paper Identifier: Wed.O7c.06

Wednesday

15:10 - 15:30  Pavilion East

 

Speech Intelligibility in Quiet and in Noise

 

 

Performance Comparison of Intrusive Objective Speech Intelligibility and Quality Metrics for Cochlear Implant Users
João Felipe Santos1,  Stefano Cosentino2,  Oldooz Hazrati3,  Philipos C. Loizou3,  Tiago H. Falk1
1Institut National de la Recherche Scientifique, INRS-EMT, Canada, 2Ear Institute, University College London (UCL), UK, 3Department of Electrical Engineering, The University of Texas at Dallas, USA


 

Paper Identifier: Wed.SS7.01

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

The Speech Recognition Virtual Kitchen: An Initial Prototype
Florian Metze1 and Eric Fosler-Lussier2
1Carnegie Mellon University, USA, 2The Ohio State University, USA


 

Paper Identifier: Wed.SS7.02

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

PermA and Balloon: Tools for string alignment and text processing
Uwe Reichel
University of Munich, Germany


 

Paper Identifier: Wed.SS7.03

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

VisArtico: a visualization tool for articulatory data
Slim Ouni1,  Loic Mangeonjean2,  Ingmar Steiner3
1LORIA, France, 2University of Lorraine, France, 3INRIA, France


 

Paper Identifier: Wed.SS7.04

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

Towards Automated Annotation of Audio and Video Recordings by Application of Advanced Web-services
Przemyslaw Lenkiewicz,  Dieter van Uytvanck,  Peter Wittenburg,  Sebastian Drude
Max Planck Institute for Psycholinguistics, Netherlands


 

Paper Identifier: Wed.SS7.05

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

A Rule Based Pronunciation Generator and Regional Accent Databank for Portuguese
Simone Ashby1,  Sílvia Barbosa1,  Silvia Brandão2,  José Pedro Ferreira1,  Maarten Janssen3,  Catarina Silva1,  Mário Eduardo Viaro4
1Instituto de Linguística Teórica e Computacional (ILTEC), Portugal, 2Universidade Federal do Rio de Janeiro (UFRJ), Brazil, 3IULA, Universitat Pompeu Fabra, Spain, 4Universidade de São Paulo, Brazil


 

Paper Identifier: Wed.SS7.06

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

Speech Enhancement for Android (SEA): A Speech Processing Demonstration Tool for Android Based Smart Phones and Tablets
Roger Chappel and Kuldip Paliwal
Signal Processing Laboratory, Griffith University, Australia


 

Paper Identifier: Wed.SS7.07

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

ProTK: An Improved Prosody Toolkit
Jacob Okamoto1,  Serguei Pakhomov1,  Elizabeth Shriberg2,  Andreas Stolcke2
1University of Minnesota, USA, 2Microsoft, USA


 

Paper Identifier: Wed.SS7.08

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

SpeechMark: Landmark Detection Tool for Speech Analysis
Suzanne Boyce1,  Harriet Fell2,  Joel MacAuslan3
1University of Cincinnati, USA, 2Northeastern University, USA, 3Speech Technology and Applied Research, USA


 

Paper Identifier: Wed.SS8.07

Wednesday

13:30 - 15:30  Galleria

 

Speech Tools Demo

 

 

An On-Line, Cloud-Based Spanish-Spanish Sign Language Translation System
Javier Tejedor,  Fernando Lopez-Colino,  Jordi Porta,  Jose Colas
Human Computer Technology Laboratory UAM


 

Paper Identifier: Wed.P7a.01

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

Exploiting Temporal Sequence Structure for Semantic Analysis of Multimedia
Sourish Chaudhuri,  Rita Singh,  Bhiksha Raj
LTI, CMU, USA


 

Paper Identifier: Wed.P7a.02

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

Time Delay Estimation for Speech Signal Based on FOC-Spectrum
Hong Liu1 and Xiaofei Li2
1Key Laboratory of Machine Perception and Intelligence, Peking University, Shenzhen, CHINA, 2Key Laboratory of Integrated Micro-system, Peking University, Shenzhen, CHINA


 

Paper Identifier: Wed.P7a.03

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

Low-rank Audio Signal Classification Under Soft Margin and Trace Norm Constraints
Ziqiang Shi,  Tieran Zheng,  Jiqing Han,  Shiwen Deng
Harbin Institute of Technology, China


 

Paper Identifier: Wed.P7a.04

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

GCC-PHAT based Head Orientation Estimation
Carlos Segura1 and Javier Hernando2
1Herta Security, S.L., 2Universitat Politècnica de Catalunya


 

Paper Identifier: Wed.P7a.05

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

Plagiarism Detection in Polyphonic Music using Monaural Signal Separation
Soham De1,  Indradyumna Roy1,  Tarunima Prabhakar2,  Kriti Suneja3,  Sourish Chaudhuri4,  Rita Singh4,  Bhiksha Raj4
1Computer Science & Engineering Department, Jadavpur University, India, 2Dhirubhai Ambani Institute of Information and Communication Technology, India, 3The LNM Institute of Information Technology, India, 4Language Technologies Institute, Carnegie Mellon University, USA


 

Paper Identifier: Wed.P7a.06

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

TDOA ESTIMATION FOR MULTIPLE SPEAKERS IN UNDERDETERMINED CASE
Mariem BOUAFIF1 and Zied LACHIRI2
1LSTS-SIFI Laboratory, National Engineering School of Tunis, Tunisia, 2Dept. of Physics and Instrumentation, National Institute of Applied Sciences and Technology, Tunis, Tunisia


 

Paper Identifier: Wed.P7a.07

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification
Toru Nakashika1,  Christophe Garcia2,  Tetsuya Takiguchi1
1Department of System Informatics, Kobe University, Japan, 2LIRIS, CNRS, Insa de Lyon, France


 

Paper Identifier: Wed.P7a.08

Wednesday

13:30 - 15:30  Exhibition Hall

 

Audio Analysis, Estimation and Classification

 

 

Training Deep Nets with Imbalanced and Unlabeled Data
Jeff Berry1,  Ian Fasel2,  Luciano Fadiga3,  Diana Archangeli2
1University of Arizona, USA and Istituto Italiano di Tecnologia, Italy, 2University of Arizona, USA, 3Istituto Italiano di Tecnologia and University of Ferrara, Italy


 

Paper Identifier: Wed.P7b.01

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

Speech Data Clustering Based on Phoneme Error Trend for Unsupervised Acoustic Model Adaptation
Taichi Asami,  Satoshi Kobashikawa,  Hirokazu Masataki,  Osamu Yoshioka,  Satoshi Takahashi
NTT Cyber Space Laboratories, NTT Corporation, Japan


 

Paper Identifier: Wed.P7b.02

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

Gaussian Map based Acoustic Model Adaptation Using Untranscribed Data for Speech Recognition in Severely Adverse Environments
Wooil Kim and John H. L. Hansen
University of Texas at Dallas, USA


 

Paper Identifier: Wed.P7b.03

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

INVESTIGATING PERFORMANCE OF THE DISCRIMINATIVE METHODS FOR LONG-TERM SPEAKER ADAPTATION
Danning Jiang1,  Dimitri Kanevsky2,  Vaibhava Goel2,  Yong Qin3
1IBM China Research Lab, China, 2IBM Watson Research Center, the United States, 3IBM China Research Lab, China


 

Paper Identifier: Wed.P7b.04

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition
Bo LI and Khe Chai SIM
National University of Singapore, Singapore


 

Paper Identifier: Wed.P7b.05

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

A comparative study of adaptive, automatic recognition of disordered speech
Heidi Christensen,  Stuart Cunningham,  Charles Fox,  Phil Green,  Thomas Hain
University of Sheffield, United Kingdom


 

Paper Identifier: Wed.P7b.06

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

PHONEME CLASS BASED ADAPTATION FOR MISMATCH ACOUSTIC MODELING OF DISTANT NOISY SPEECH
Seckin Uluskan and John Hansen
Center for Robust Speech Systems, University of Texas at Dallas, USA


 

Paper Identifier: Wed.P7b.07

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

Rapid Nonlinear Speaker Adaptation for Large-Vocabulary Continuous Speech Recognition
Zoi Roupakia,  Anton Ragni,  Mark Gales
Engineering Department, University of Cambridge, U.K.


 

Paper Identifier: Wed.P7b.08

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

A Study on Using Word-Level HMMs to Improve ASR Performance over State-of-the-Art Phone-Level Acoustic Modeling for LVCSR
I-Fan Chen and Chin-Hui Lee
School of Electrical and Computer Engineering, Georgia Institute of Technology, USA


 

Paper Identifier: Wed.P7b.09

Wednesday

13:30 - 15:30  Exhibition Hall

 

Adaptation for ASR

 

 

Factored adaptation using a combination of feature-space and model-space transforms
Michael Seltzer and Alex Acero
Microsoft Research, USA


 

Paper Identifier: Wed.P7c.01

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Exploring Discriminative Speech Trajectory Structures
Heyun Huang,  Louis ten Bosch,  Bert Cranen,  Lou Boves
Radboud University Nijmegen


 

Paper Identifier: Wed.P7c.02

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Estimating Classifier Performance in Unknown Noise
Ehsan Variani and Hynek Hermansky
Johns Hopkins University, USA


 

Paper Identifier: Wed.P7c.03

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Continuous Digit Recognition in Noise: Reservoirs can do an excellent job!
Azarakhsh Jalalvand,  Fabian Triefenbach,  Jean-Pierre Martens
Ghent University, Belgium


 

Paper Identifier: Wed.P7c.04

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Optimization-Based Control for the Extended Baum-Welch Algorithm
Janne Pylkkönen and Mikko Kurimo
Aalto University, Finland


 

Paper Identifier: Wed.P7c.05

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Normalization of spectro-temporal Gabor filter bank features for improved robust automatic speech recognition systems
Marc René Schädler and Birger Kollmeier
Medical Physics / Carl-von-Ossietzky University Oldenburg, Germany


 

Paper Identifier: Wed.P7c.06

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Phone recognition in critical bands using sub-band temporal modulations
Feipeng Li,  Sri Harish Mallidi,  Hynek Hermansky
Johns Hopkins University, U.S.


 

Paper Identifier: Wed.P7c.07

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Combining Acoustic Data Driven G2P and Letter-to-Sound Rules for Under Resource Lexicon Generation
Ramya Rasipuram and Mathew Magimai Doss
Idiap Research Institute, Martigny, Switzerland


 

Paper Identifier: Wed.P7c.08

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

CRF-based Diacritisation of Colloquial Arabic for Automatic Speech Recognition
Sarah Al-Shareef and Thomas Hain
University of Sheffield, UK


 

Paper Identifier: Wed.P7c.09

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

Analysis of Temporal Resolution in Frequency Domain Linear Prediction
Sriram Ganapathy and Hynek Hermansky
Dept. of ECE, Johns Hopkins University, USA


 

Paper Identifier: Wed.P7c.10

Wednesday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition I

 

 

White Listing and Score Normalization for Keyword Spotting of Noisy Speech
Bing Zhang,  Richard Schwartz,  Stavros Tsakalidis,  Long Nguyen,  Spyros Matsoukas
Raytheon BBN Technologies, USA


 

Paper Identifier: Wed.P7d.01

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

Speaker Recognition for Children’s Speech
SAEID SAFAVI,  MARYAM NAJAFIAN,  ABUALSOUD HANANI,  MARTIN RUSSELL,  PETER JANCOVIC,  MICHAEL CAREY
UNIVERSITY OF BIRMINGHAM, UK


 

Paper Identifier: Wed.P7d.02

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

A simple and efficient method to align very long speech signals to acoustically imperfect transcriptions
Germán Bordel,  Mikel Penagarikano,  Luis Javier Rodríguez-Fuentes,  Amparo Varona
Department of Electricity and Electronics, Faculty of Science and Technology, University of the Basque Country, Spain


 

Paper Identifier: Wed.P7d.03

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

Estimation of Talker’s Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients
Ryoichi Takashima,  Tetsuya Takiguchi,  Yasuo Ariki
Kobe University, Japan


 

Paper Identifier: Wed.P7d.04

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

Sentence Detection Using Multiple Annotations
Ann Lee and James Glass
MIT Computer Science and Artificial Intelligence Laboratory, USA


 

Paper Identifier: Wed.P7d.05

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

A speaker-role based approach for detecting politicians in TV broadcast news
Delphine Charlet and Geraldine Damnati
France


 

Paper Identifier: Wed.P7d.06

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

Relative Importance of Temporal Envelope and Fine Structure Cues in Low- and High-Order Harmonic Regions for Mandarin Lexical-tone Recognition
Guangting Mai
Language Engineering Laboratory, Department of Electronic Engineering, The Chinese University of Hong Kong


 

Paper Identifier: Wed.P7d.07

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

Real-time Implementation of Multi-band Frequency Compression for Listeners with Moderate Sensorineural Impairment
Nitya Tiwari1,  Prem C. Pandey1,  Pandurangarao N. Kulkarni2
1Indian Institute of Technology Bombay, India, 2Basaveshwar Engineering College Bagalkot, India


 

Paper Identifier: Wed.P7d.08

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

Word Prominence Detection using Robust yet Simple Prosodic Features
Taniya Mishra,  Vivek Rangarajan Sridhar,  Alistair Conkie
AT&T Labs-Research, USA


 

Paper Identifier: Wed.P7d.09

Wednesday

13:30 - 15:30  Exhibition Hall

 

Rich Transcription II

 

 

Online Story Segmentation of Multilingual Streaming Broadcast News
Amit Srivastava,  Saurabh Khanwalkar,  Gretchen Markiewicz,  Guruprasad Saikumar
Raytheon BBN Technologies, USA


 

Paper Identifier: Wed.O8a.01

Wednesday

16:00 - 16:20  Grand Ballroom I

 

Adaptation & Robust Modeling

 

 

Efficient Segmental Conditional Random Fields for One-Pass Phone Recognition
Yanzhang He and Eric Fosler-Lussier
The Ohio State University, USA


 

Paper Identifier: Wed.O8a.02

Wednesday

16:20 - 16:40  Grand Ballroom I

 

Adaptation & Robust Modeling

 

 

Enhanced Polyphone Decision Tree Adaptation for Accented Speech Recognition
Udhyakumar Nallasamy1,  Florian Metze1,  Tanja Schultz2
1CMU, USA, 2KIT, Germany


 

Paper Identifier: Wed.O8a.03

Wednesday

16:40 - 17:00  Grand Ballroom I

 

Adaptation & Robust Modeling

 

 

Efficient VTS Adaptation Using Jacobian Approximation
Jinyu Li,  Michael L. Seltzer,  Yifan Gong
Microsoft, U.S.


 

Paper Identifier: Wed.O8a.04

Wednesday

17:00 - 17:20  Grand Ballroom I

 

Adaptation & Robust Modeling

 

 

Robust triphone mapping for acoustic modeling
Milos Cernak,  David Imseng,  Hervé Bourlard
Idiap, Switzerland


 

Paper Identifier: Wed.O8a.05

Wednesday

17:20 - 17:40  Grand Ballroom I

 

Adaptation & Robust Modeling

 

 

Sparse banded precision matrices for low resource speech recognition
Weibin Zhang1 and Pascale Fung2
1HKUST, Hong Kong, 2HKUST, Hong Kong


 

Paper Identifier: Wed.O8a.06

Wednesday

17:40 - 18:00  Grand Ballroom I

 

Adaptation & Robust Modeling

 

 

Semi-Blind Model Adaptation using Piece-wise Energy Decay Curve for Large Reverberant Environments
Abdul Waheed Mohammed1,  Marco Matassoni2,  HariKrishna Maganti2,  Maurizio Omologo2
1Universita degli studi di Trento, Italy, 2Fondazione Bruno Kessler Irst


 

Paper Identifier: Wed.O8b.01: WITHDRAWN

 

Paper Identifier: Wed.O8b.02

Wednesday

16:20 - 16:40  Grand Ballroom II

 

Multi-Channel Speech Enhancement

 

 

Example-based speech enhancement with joint utilization of spatial, spectral & temporal cues of speech and noise
Keisuke Kinoshita,  Marc Delcroix,  Mehrez Souden,  Tomohiro Nakatani
NTT Communication Science Labs., Japan


 

Paper Identifier: Wed.O8b.03

Wednesday

16:40 - 17:00  Grand Ballroom II

 

Multi-Channel Speech Enhancement

 

 

A Fast-Converging Adaptive Frequency-Domain MVDR Beamformer for Speech Enhancement
Shengkui Zhao1 and Douglas Jones2
1Advanced Digital Sciences Center, Illinois at Singapore, Singapore, 2Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Illinois, USA


 

Paper Identifier: Wed.O8b.04

Wednesday

17:00 - 17:20  Grand Ballroom II

 

Multi-Channel Speech Enhancement

 

 

A signal-separation-based array postfilter for distant speech recognition
Rita Singh1,  Kenichi Kumatani2,  John McDonough1,  Chen Liu3
1Carnegie Mellon University, USA, 2Disney Research, USA, 3Spansion Inc., USA


 

Paper Identifier: Wed.O8b.05

Wednesday

17:20 - 17:40  Grand Ballroom II

 

Multi-Channel Speech Enhancement

 

 

Constrained Multichannel Speech Dereverberation
MENG YU1 and FRANK SOONG2
1University of California, Irvine, USA, 2Microsoft Research Asia


 

Paper Identifier: Wed.O8b.06

Wednesday

17:40 - 18:00  Grand Ballroom II

 

Multi-Channel Speech Enhancement

 

 

A Triple-Microphone Real-Time Speech Enhancement Algorithm Based on Approximate Array Analytical Solutions
MENG YU,  RYAN RITCH,  JACK XIN
University of California, Irvine, USA


 

Paper Identifier: Wed.O8c.01

Wednesday

16:00 - 16:20  Pavilion East

 

Prosody II

 

 

Perception of Pitch Contours among Native Tone Listeners
Ratree Wayland1,  Donruethai Laphasradakul2,  Edith Kaan1,  Cao Rui1
1University of Florida, USA, 2University of the Thai Chamber of Commerce, Thailand


 

Paper Identifier: Wed.O8c.02

Wednesday

16:20 - 16:40  Pavilion East

 

Prosody II

 

 

Pitch range control of Japanese boundary pitch movements
Yosuke Igarashi1 and Hanae Koiso2
1Hiroshima University, Japan, 2National Institute for Japanese Language and Linguistics, Japan


 

Paper Identifier: Wed.O8c.03

Wednesday

16:40 - 17:00  Pavilion East

 

Prosody II

 

 

Perceived prosodic boundaries in Taiwanese and their acoustic correlates
Grace Kuo
UCLA Linguistics, USA


 

Paper Identifier: Wed.O8c.04

Wednesday

17:00 - 17:20  Pavilion East

 

Prosody II

 

 

Phonetic Foreignization of Mandarin for Dubbing in Imported Western Movies
Luying Hou1,  Yuan Jia2,  Aijun Li2
1Shanghai International Studies University, China, 2Chinese Academy of Social Sciences, China


 

Paper Identifier: Wed.O8c.05

Wednesday

17:20 - 17:40  Pavilion East

 

Prosody II

 

 

Prosodic context-based analysis of disfluencies
Helena Moniz1,  Fernando Batista2,  Isabel Trancoso3,  Ana Isabel Mata4
1INESC-ID/FLUL - Portugal, 2INESC-ID/ISCTE - Portugal, 3INESC-ID/IST - Portugal, 4FLUL - Portugal


 

Paper Identifier: Wed.O8c.06

Wednesday

17:40 - 18:00  Pavilion East

 

Prosody II

 

 

Describing the development of intonational categories using a target-oriented parametric approach
Britta Lintfert and Bernd Möbius
Saarland University, Germany


 

Paper Identifier: Wed.O8d.01

Wednesday

16:00 - 16:20  Pavilion West

 

Voice Activity Detection

 

 

Developing a Speech Activity Detection System for the DARPA RATS Program
Tim Ng1,  Bing Zhang1,  Long Nguyen1,  Spyros Matsoukas1,  Xinhui Zhou2,  Nima Mesgarani2,  Karel Vesely3,  Pavel Matejka3
1Raytheon BBN Technologies, USA, 2University of Maryland, USA, 3Brno University of Technology, Czech Republic


 

Paper Identifier: Wed.O8d.02

Wednesday

16:20 - 16:40  Pavilion West

 

Voice Activity Detection

 

 

Speech Activity Detection for Noisy Data Using Adaptation Techniques
Mohamed Omar
IBM T. J. Watson Research Center


 

Paper Identifier: Wed.O8d.03

Wednesday

16:40 - 17:00  Pavilion West

 

Voice Activity Detection

 

 

Speech/Nonspeech Segmentation in Web Videos
Ananya Misra
Google, USA


 

Paper Identifier: Wed.O8d.04

Wednesday

17:00 - 17:20  Pavilion West

 

Voice Activity Detection

 

 

On the use of Machine Learning Methods for Speech and Voicing Classification
Philip Harding and Ben Milner
University of East Anglia


 

Paper Identifier: Wed.O8d.05

Wednesday

17:20 - 17:40  Pavilion West

 

Voice Activity Detection

 

 

Acoustic and Data-driven Features for Robust Speech Activity Detection
Samuel Thomas1,  Sri Harish Mallidi1,  Thomas Janu1,  Hynek Hermansky1,  Nima Mesgarani2,  Xinhui Zhou2,  Shihab Shamma2,  Tim Ng3,  Bing Zhang3,  Long Nguyen3,  Spyros Matsoukas3
1JHU, USA, 2UMD, USA, 3BBN, USA


 

Paper Identifier: Wed.O8d.06

Wednesday

17:40 - 18:00  Pavilion West

 

Voice Activity Detection

 

 

A Two-step NMF Based Algorithm for Single Channel Speech Separation
Shuo Wang and Wenjun Wu
State Key Lab of Software Development Environment, Department of Computer Science and Engineering, Beihang University, Beijing, China


 

Paper Identifier: Wed.SS8.01

Wednesday

16:00 - 18:00  Galleria

 

Systems Demo

 

 

A tutorial dialogue system with unrestricted spoken input
Peter Bell,  Myroslava Dzikovska,  Amy Isard
University of Edinburgh


 

Paper Identifier: Wed.SS8.02

Wednesday

16:00 - 18:00  Galleria

 

Systems Demo

 

 

Integrating Adaptive Beam-forming and Auditory Features for Robust Large Vocabulary Speech Recognition
Xie Sun,  Peter Li,  Manli Zhu,  Qiru Zhou
Li Creative Technologies, Inc., USA


 

Paper Identifier: Wed.SS8.03

Wednesday

16:00 - 18:00  Galleria

 

Systems Demo

 

 

A Natural In-Car Speech Interface to Internet Services Using Hybrid ASR
Hansjörg Hofmann,  Ute Ehrlich,  Klaus Bader,  Ilona Nothelfer,  André Berton
Daimler AG, Germany


 

Paper Identifier: Wed.SS8.04

Wednesday

16:00 - 18:00  Galleria

 

Systems Demo

 

 

How Marni Helps English Language Learners Acquire Oral Reading Fluency
Ron Cole,  Daniel Bolanos,  Wayne Ward,  JT Carmer,  Eric Borts,  Edward Svirsky
Boulder Language Technologies, USA


 

Paper Identifier: Wed.SS8.05

Wednesday

16:00 - 18:00  Galleria

 

Systems Demo

 

 

Demonstration of Advanced Multi-Modal, Network-Centric Communication Management Suite
Victor Finomore
Air Force Research Laboratory, USA


 

Paper Identifier: Wed.SS8.06

Wednesday

16:00 - 18:00  Galleria

 

Systems Demo

 

 

Dutch Automatic Speech Recognition on the Web: Towards a General Purpose System
Joris Pelemans,  Kris Demuynck,  Patrick Wambacq
KU Leuven, Belgium


 

Paper Identifier: Wed.SS8.07:  Moved to SS7.

 

Paper Identifier: Wed.P8a.01

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Meaning inhibition and sentence processing in Chinese: Evidence from negative priming
Michael C. W. Yip
The Hong Kong Institute of Education, Hong Kong


 

Paper Identifier: Wed.P8a.02

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Similar Speaker Selection Technique Based on Distance Metric Learning with Perceptual Voice Quality Similarity
Yusuke Ijima,  Mitsuaki Isogai,  Hideyuki Mizuno
NTT Corporation, Japan


 

Paper Identifier: Wed.P8a.03

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Gendered sound symbolism and masking effects in speech processing
Molly Babel1 and Grant McGuire2
1University of British Columbia, Canada, 2UC Santa Cruz, USA


 

Paper Identifier: Wed.P8a.04

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Modeling Cue Trading in Human Word Recognition
Louis ten Bosch1 and Odette Scharenborg2
1Radboud University Nijmegen, NL, 2MPI for Psycholinguistics, NL


 

Paper Identifier: Wed.P8a.05

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Accounting for Speech Rate in Spoken Word Recognition
David Li and Elsi Kaiser
University of Southern California


 

Paper Identifier: Wed.P8a.06

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

The processes underlying two frequent casual speech phenomena in Dutch: A production experiment
Iris Hanique and Mirjam Ernestus
Radboud University Nijmegen, the Netherlands; Max Planck Institute for Psycholinguistics, the Netherlands


 

Paper Identifier: Wed.P8a.07

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Intrinsic velocity differences of lip and jaw movements: preliminary results
Peter Birkholz1 and Phil Hoole2
1Clinic of Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital Aachen, Germany, 2Institute of Phonetics and Speech Processing, Ludwig-Maximilians-University Munich, Germany


 

Paper Identifier: Wed.P8a.08

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Co-occurrence of reduced word forms in natural speech
Malte Viebahn1,  Mirjam Ernestus2,  James McQueen2
1Max Planck Institute for Psycholinguistics, The Netherlands, 2Radboud University Nijmegen, The Netherlands


 

Paper Identifier: Wed.P8a.09

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Voice Production Mechanisms of Vibrato in Noh
Ikuyo Yoshinaga1 and Jiangping Kong2
1Department of Chinese Language and Literature, Peking University, China, 2Department of Chinese Language and Literature, Peking University, China / Center for Chinese Linguistics, Peking University, China


 

Paper Identifier: Wed.P8a.10

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Automatic detection of hypernasal speech signals using nonlinear and entropy measurements
Juan Rafael Orozco-Arroyave1,  Julián David Arias-Londoño1,  Jesús Francisco Vargas-Bonilla2,  Elmar Nöth3
1Universidad de Antioquia, Colombia, 2Universidad de Antioquia, Colombia, 3Universität Erlangen-Nürnberg, Germany


 

Paper Identifier: Wed.P8a.11

Wednesday

16:00 - 18:00  Exhibition Hall

 

Perception and Production

 

 

Effects of the availability of visual information and presence of competing conversations on speech production
Vincent Aubanel1,  Martin Cooke1,  Emma Foster2,  Maria Luisa Garcia Lecumberri3,  Cassie Mayo4
1Language and Speech Laboratory, Ikerbasque and University of the Basque Country, Vitoria, Spain, 2Department of Human Communication Sciences, University of Sheffield, UK, 3Language and Speech Laboratory, University of the Basque Country, Vitoria, Spain, 4Centre for Speech Technology Research, University of Edinburgh, UK


 

Paper Identifier: Wed.P8b.01

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Constrained Maximum Mutual Information Dimensionality Reduction for Language Identification
Shuai Huang,  Glen Coppersmith,  Damianos Karakos
Johns Hopkins University, USA


 

Paper Identifier: Wed.P8b.02

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Phonotactic Language Recognition Using MLP Features
Mohamed Faouzi BenZeghiba,  Jean-Luc Gauvain,  Lori Lamel
LIMSI-CNRS


 

Paper Identifier: Wed.P8b.03

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

The EHU Systems for the NIST 2011 Language Recognition Evaluation
Mikel Penagarikano,  Amparo Varona,  Luis J. Rodriguez-Fuentes,  Mireia Diez,  German Bordel
University of the Basque Country UPV/EHU, Spain


 

Paper Identifier: Wed.P8b.04

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Study of Different Backends in a State-Of-the-Art Language Recognition System
Mikel Penagarikano,  Amparo Varona,  Mireia Diez,  Luis Javier Rodriguez-Fuentes,  German Bordel
University of the Basque Country UPV/EHU


 

Paper Identifier: Wed.P8b.05

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

On the Use of Non-Linear Polynomial Kernel SVMs in Language Recognition
Sibel Yaman1,  Jason Pelecanos2,  Mohamed K. Omar1
1IBM, US, 2IBM, US


 

Paper Identifier: Wed.P8b.06

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Exemplar-Based Sparse Representation for Language Recognition on I-Vectors
Bing Jiang,  Yan Song,  Wu Guo,  Lirong Dai
University of Science and Technology of China


 

Paper Identifier: Wed.P8b.07

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Subspace-Based Feature Representation and Learning for Language Recognition
Yu-Chin Shih1,  Hung-Shin Lee2,  Hsin-Min Wang2,  Shyh-Kang Jeng1
1Department of Electrical Engineering, National Taiwan University, 2Institute of Information Science, Academia Sinica


 

Paper Identifier: Wed.P8b.08

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition
Changhuai You,  Haizhou Li,  Bin Ma,  Kong Aik Lee
Institute for Infocomm Research, Singapore


 

Paper Identifier: Wed.P8b.09

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Using Time-Synchronous Phone Co-occurrences in a SVM-Phonotactic Dialect Recognition System
Amparo Varona,  Mikel Penagarikano,  Luis Javier Rodriguez-Fuentes,  German Bordel,  Mireia Diez
University of the Basque Country


 

Paper Identifier: Wed.P8b.10

Wednesday

16:00 - 18:00  Exhibition Hall

 

Language and Accent Recognition

 

 

Nativeness Classification with Suprasegmental Features on the Accent Group Level
Mahnoosh Mehrabani1,  Joseph Tepperman2,  Emily Nava2
1Center for Robust Speech Systems, University of Texas at Dallas, USA, 2Rosetta Stone Labs, Boulder, Colorado, USA


 

Paper Identifier: Wed.P8c.01

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity
Hung-yi Lee,  Po-wei Chou,  Lin-shan Lee
National Taiwan University, Taiwan


 

Paper Identifier: Wed.P8c.02

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Consumer-level multimedia event detection through unsupervised audio signal modeling
Byungki Byun1,  Ilseo Kim2,  Sabato Marco Siniscalchi3,  Chin-Hui Lee2
1Georgia Institute of Technology, USA, 2Georgia Institute of Technology, USA, 3Kore University of Enna, Italy


 

Paper Identifier: Wed.P8c.03

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Event-based Video Retrieval Using Audio
Qin Jin,  Peter Schulam,  Shourabh Rawat,  Susanne Burger,  Duo Ding,  Florian Metze
CMU, US


 

Paper Identifier: Wed.P8c.04

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Compact Audio Representation for Event Detection in Consumer Media
Xiaodan Zhuang,  Stavros Tsakalidis,  Shuang Wu,  Pradeep Natarajan,  Rohit Prasad,  Prem Natarajan
Raytheon BBN Technologies


 

Paper Identifier: Wed.P8c.05

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

N-gram FST Indexing for Spoken Term Detection
Chao Liu1,  Dong Wang1,  Javier Tejedor2
1Center for Speech and Language Technologies, Tsinghua University, China, 2UAM, Spain


 

Paper Identifier: Wed.P8c.06

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Spoken Inquiry Discrimination Using Bag-of-Words for Speech-Oriented Guidance System
Haruka Majima1,  Rafael Torres1,  Yoko Fujita1,  Hiromichi Kawanami1,  Tomoko Matsui2,  Hiroshi Saruwatari1,  Kiyohiro Shikano1
1Graduate School of Information Science, Nara Institute of Science and Technology, Japan, 2Department of Statistical Modeling, The Institute of Statistical Mathematics, Japan


 

Paper Identifier: Wed.P8c.07

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Robust Event Detection From Spoken Content In Consumer Domain Videos
Stavros Tsakalidis,  Xiaodan Zhuang,  Roger Hsiao,  Shuang Wu,  Pradeep Natarajan,  Rohit Prasad,  Prem Natarajan
BBN Technologies, USA


 

Paper Identifier: Wed.P8c.08

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Bag-of-Audio-Words Approach for Multimedia Event Classification
Stephanie Pancoast1 and Murat Akbacak2
1SRI International, Stanford University, United States, 2SRI International, United States


 

Paper Identifier: Wed.P8c.09

Wednesday

16:00 - 18:00  Exhibition Hall

 

Voice Search and Spoken Document Retrieval

 

 

Improvements in Japanese Voice Search
Ken-ichi Iso1,  Edward Whittaker2,  Tadashi Emori1,  Junpei Miyake1
1Yahoo Japan Corporation, Japan, 2Inferret Limited, England


 

Paper Identifier: Thu.O9a.01

Thursday

10:00 - 10:20  Grand Ballroom I

 

Sparse, Template-Based Representations

 

 

Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks
Tara Sainath,  David Nahamoo,  Dimitri Kanevsky,  Bhuvana Ramabhadran
IBM, USA


 

Paper Identifier: Thu.O9a.02

Thursday

10:20 - 10:40  Grand Ballroom I

 

Sparse, Template-Based Representations

 

 

Advances in noise robust digit recognition using hybrid exemplar-based techniques
Jort Gemmeke and Hugo Van hamme
KU Leuven, Belgium


 

Paper Identifier: Thu.O9a.03

Thursday

10:40 - 11:00  Grand Ballroom I

 

Sparse, Template-Based Representations

 

 

Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition
Antti Hurmalainen1,  Rahim Saeidi2,  Tuomas Virtanen1
1Tampere University of Technology, Finland, 2Radboud University Nijmegen, The Netherlands


 

Paper Identifier: Thu.O9a.04

Thursday

11:00 - 11:20  Grand Ballroom I

 

Sparse, Template-Based Representations

 

 

Using Sparse Classification Outputs as Feature Observations for Noise-robust ASR
Yang Sun1,  Bert Cranen1,  Jort F. Gemmeke2,  Louis ten Bosch1,  Lou Boves1,  Mathew M. Doss3
1Centre for Language and Speech Technology, 2Department ESAT, Katholieke Universiteit, 3Idiap Research Institute


 

Paper Identifier: Thu.O9a.05

Thursday

11:20 - 11:40  Grand Ballroom I

 

Sparse, Template-Based Representations

 

 

Synthetic References for Template-based ASR using posterior features
Serena Soldo1,  Mathew Magimai Doss2,  Hervé Bourlard2
1Idiap Research Institute, Switzerland, 2Idiap Research Institute, Switzerland


 

Paper Identifier: Thu.O9a.06

Thursday

11:40 - 12:00  Grand Ballroom I

 

Sparse, Template-Based Representations

 

 

Heterogeneous Convolutive Non-Negative Sparse Coding
Dong Wang1 and Javier Tejedor2
1Center for Speech and Language Technologies, Tsinghua University, 2Human Computer Technology Laboratory, Universidad Autonoma de Madrid


 

Paper Identifier: Thu.O9b.01

Thursday

10:00 - 10:20  Grand Ballroom II

 

Speaker Diarization

 

 

Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization
Jürgen Geiger1,  Ravichander Vipperla2,  Simon Bozonnet2,  Nicholas Evans2,  Björn Schuller1,  Gerhard Rigoll1
1TU München, Germany, 2EURECOM, France


 

Paper Identifier: Thu.O9b.02

Thursday

10:20 - 10:40  Grand Ballroom II

 

Speaker Diarization

 

 

Selection of TDOA Parameters for MDM Speaker Diarization
Beatriz Martínez-González1,  José M. Pardo1,  Julián D. Echeverry-Correa1,  José A. Vallejo-Pinto2,  Roberto Barra-Chicote1
1ETSIT-Universidad Politécnica de Madrid, Spain, 2University of Oviedo, Spain


 

Paper Identifier: Thu.O9b.03

Thursday

10:40 - 11:00  Grand Ballroom II

 

Speaker Diarization

 

 

Confidence for Speaker Diarization using PCA Spectral Ratio
Orith Toledo-Ronen and Hagai Aronowitz
IBM Research, Israel


 

Paper Identifier: Thu.O9b.04

Thursday

11:00 - 11:20  Grand Ballroom II

 

Speaker Diarization

 

 

Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model
Naohiro Tawara1,  Tetsuji Ogawa1,  Shinji Watanabe2,  Atsushi Nakamura3,  Tetsunori Kobayashi1
1Waseda University, Japan, 2MERL / NTT CS Lab., Japan, 3NTT CS Lab., Japan


 

Paper Identifier: Thu.O9b.05

Thursday

11:20 - 11:40  Grand Ballroom II

 

Speaker Diarization

 

 

DiarTk: An Open Source Toolkit for Research in Multistream Speaker Diarization and its Application to Meetings Recordings
Deepu Vijayasenan1 and Fabio Valente2
1Universität des Saarlandes, Germany, 2Idiap Research Institute, Switzerland


 

Paper Identifier: Thu.O9b.06

Thursday

11:40 - 12:00  Grand Ballroom II

 

Speaker Diarization

 

 

I-vectors and ILP clustering adapted to cross-show speaker diarization
Grégor Dupuy,  Mickael Rouvier,  Sylvain Meignier,  Yannick Estève
LIUM, France


 

Paper Identifier: Thu.O9c.01

Thursday

10:00 - 10:20  Pavilion East

 

Speech Production: Imaging and Models

 

 

Emphatic segments and emphasis spread in Lebanese Arabic: a Real-time Magnetic Resonance Imaging Study
Assaf Israel1,  Michael Proctor2,  Louis Goldstein1,  Khalil Iskarous1,  Shrikanth Narayanan2
1Department of Linguistics, University of Southern California, USA, 2Viterbi School of Engineering, University of Southern California, USA and Department of Linguistics, University of Southern California, USA


 

Paper Identifier: Thu.O9c.02

Thursday

10:20 - 10:40  Pavilion East

 

Speech Production: Imaging and Models

 

 

Using magnetic resonance to image the pharynx during Arabic speech: Static and dynamic aspects
Ryan Shosted,  Bradley Sutton,  Abbas Benmamoun
University of Illinois at Urbana-Champaign, USA


 

Paper Identifier: Thu.O9c.03

Thursday

10:40 - 11:00  Pavilion East

 

Speech Production: Imaging and Models

 

 

Articulatory speaker normalisation based on MRI-data using three-way linear decomposition methods
Julián Andrés Valdés Vargas1,  Pierre Badin1,  Laurent Lamalle2
1GIPSA-Lab, France, 2SFR1 RMN Biomédicale et Neurosciences (Unité IRM Recherche 3 Tesla), INSERM, CHU de Grenoble, France


 

Paper Identifier: Thu.O9c.04

Thursday

11:00 - 11:20  Pavilion East

 

Speech Production: Imaging and Models

 

 

Vowels Produced by Sliding Three-tube Model with Different Lengths
Takayuki Arai
Sophia University, Japan


 

Paper Identifier: Thu.O9c.05

Thursday

11:20 - 11:40  Pavilion East

 

Speech Production: Imaging and Models

 

 

Estimating the Vocal-Tract Area Function From Formants Using a Sensitivity Function and Least Square
Tokihiko Kaburagi,  Tetsuro Takano,  Yuki Sakamoto
Kyushu University, Japan


 

Paper Identifier: Thu.O9c.06

Thursday

11:40 - 12:00  Pavilion East

 

Speech Production: Imaging and Models

 

 

Modeling source-tract interaction in speech production: Voicing onset vs. vowel height after a voiceless obstruent
Jorge Lucero1,  Laura Koenig2,  Susanne Fuchs3
1University of Brasilia, Brazil, 2Haskins Laboratories, USA, 3Center for General Linguistics, Germany


 

Paper Identifier: Thu.O9d.01

Thursday

10:00 - 10:20  Pavilion West

 

Speech Synthesis

 

 

Modelling a Noisy-channel for Voice Conversion Using Articulatory Features
Bajibabu Bollepalli1,  Alan W Black2,  Kishore Prahallad1
1International Institute of Information Technology, Hyderabad, India, 2Carnegie Mellon University, Pittsburgh, USA


 

Paper Identifier: Thu.O9d.02

Thursday

10:20 - 10:40  Pavilion West

 

Speech Synthesis

 

 

Asymmetries in the perception of synthesized speech
Anna C. Janska1,  Erich Schröger2,  Thomas Jacobsen3,  Robert A.J. Clark4
1IMPRS NeuroCom, University of Leipzig, Germany, 2University of Leipzig, Germany, 3Helmut Schmidt University Hamburg, 4CSTR, The University of Edinburgh, UK


 

Paper Identifier: Thu.O9d.03

Thursday

10:40 - 11:00  Pavilion West

 

Speech Synthesis

 

 

Predicting Character-Appropriate Voices for a TTS-based Storyteller System
Erica Greene1,  Taniya Mishra2,  Patrick Haffner2,  Alistair Conkie2
1University of Southern California, USA, 2AT&T Labs-Research, USA


 

Paper Identifier: Thu.O9d.04

Thursday

11:00 - 11:20  Pavilion West

 

Speech Synthesis

 

 

Psychoacoustic Segment Scoring for Multi-Form Speech Synthesis
Alexander Sorin1,  Slava Shechtman1,  Vincent Pollet2
1IBM Haifa Research Lab, Israel, 2Nuance Communications, Belgium


 

Paper Identifier: Thu.O9d.05

Thursday

11:20 - 11:40  Pavilion West

 

Speech Synthesis

 

 

Pauses and respiratory markers of the structure of book reading
Gerard Bailly and Cécilia Gouvernayre
GIPSA-Lab, France


 

Paper Identifier: Thu.O9d.06

Thursday

11:40 - 12:00  Pavilion West

 

Speech Synthesis

 

 

Proper Name Splicing in Computer Games with TTS
Blaise Potard,  Matthew Aylett,  Christopher Pidcock
CereProc Ltd., UK


 

Paper Identifier: Thu.SS9.01

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Visualizing tool for evaluating inter-label similarity in prosodic labeling experiments
David Escudero-Mancebo1 and Eva Estebas-Vilaplana2
1Universidad de Valladolid, Spain, 2Universidad Nacional de Educacion a Distancia, Spain


 

Paper Identifier: Thu.SS9.02

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Objective, Subjective and Linguistic Roads to Perceptual Prominence - How are they compared and why?
Petra Wagner1,  Fabio Tamburini2,  Andreas Windmann1
1Universität Bielefeld, Germany, 2Università di Bologna, Italy


 

Paper Identifier: Thu.SS9.03

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Audio-visual Evaluation and Detection of Word Prominence in a Human-Machine Interaction Scenario
Martin Heckmann
Honda Research Institute Europe GmbH, Germany


 

Paper Identifier: Thu.SS9.04

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Obtaining prominence judgments from naïve listeners -- Influence of rating scales, linguistic levels and normalisation
Denis Arnold1,  Petra Wagner2,  Bernd Möbius3
1Quantitative Linguistics, University of Tübingen, Germany, 2Faculty of Linguistics and Literature, Bielefeld University, Germany, 3Department of Computational Linguistics and Phonetics, Saarland University, Germany


 

Paper Identifier: Thu.SS9.05

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Towards Hierarchical Prosodic Prominence Generation in TTS Synthesis
Leonardo Badino1 and Robert A.J. Clark2
1RBCS, Istituto Italiano di Tecnologia, Italy, 2CSTR, University of Edinburgh, UK


 

Paper Identifier: Thu.SS9.06

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Investigating syllabic prominence with Conditional Random Fields and Latent-Dynamic Conditional Random Fields
Francesco Cutugno1,  Enrico Leone1,  Bogdan Ludusan2,  Antonio Origlia1
1LUSI-Lab, Dept. of Physics, University of Naples “Federico II”, Italy, 2CNRS-IRISA, Rennes, France


 

Paper Identifier: Thu.SS9.07

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Disentangling lexical, morphological, syntactic and semantic influences on German prominence – Evidence from a production study
Barbara Samlowski1,  Petra Wagner1,  Bernd Möbius2
1Bielefeld University, Germany, 2Saarland University, Germany


 

Paper Identifier: Thu.SS9.08

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Using Prominence and Phrasing Predictions to Improve Weighted Dictionary Pronunciation Models
Andrew Rosenberg
Queens College / CUNY, USA


 

Paper Identifier: Thu.SS9.09

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

A Continuous Prominence Score Based On Acoustic Features
Jean-Philippe Goldman1,  Mathieu Avanzi2,  Anne-Catherine Simon3,  Antoine Auchlin1
1University of Geneva, Switzerland, 2University of Neuchatel, Switzerland, 3UCLouvain, Belgium


 

Paper Identifier: Thu.SS9.10

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

More on the Normalization of Syllable Prominence Ratings
Christopher Sappok1 and Denis Arnold2
1University of Duisburg-Essen, Germany, 2University of Tübingen, Germany


 

Paper Identifier: Thu.SS9.11

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

F0 and the Perception of Prominence
Tim Mahrt1,  Jennifer Cole1,  Margaret Fleck2,  Mark Hasegawa-Johnson3
1Department of Linguistics, University of Illinois Urbana-Champaign, USA, 2Department of Computer Science, University of Illinois Urbana-Champaign, USA, 3Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, USA


 

Paper Identifier: Thu.SS9.12

Thursday

10:00 - 12:00  Galleria

 

Prosodic Prominence: Annotation, Prediction, Applications

 

 

Language differences in the perceptual weight of prominence-lending properties
Bistra Andreeva,  William Barry,  Magdalena Wolska
Computational Linguistics & Phonetics, Saarland University, Germany


 

Paper Identifier: Thu.P9a.01

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

Confidence Measures in Speech Emotion Recognition Based on Semi-supervised Learning
Jun Deng and Bjoern Schuller
Institute for Human-Machine Communication, Technische Universität München, Germany


 

Paper Identifier: Thu.P9a.02

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

Using i-Vector Space Model for Emotion Recognition
Rui Xia and Yang Liu
UT Dallas, USA


 

Paper Identifier: Thu.P9a.03

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

Cries and Whispers - Classification of Vocal Effort in Expressive Speech
Nicolas Obin
IRCAM-CNRS UMR 9912-STMS


 

Paper Identifier: Thu.P9a.04

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

Emotional Speech: A Spectral Analysis
Pouria Fewzee and Fakhri Karray
Centre for Pattern Analysis and Machine Intelligence, University of Waterloo, Waterloo, ON, Canada


 

Paper Identifier: Thu.P9a.05

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

Classifying Skewed Data: Importance Weighting to Optimize Average Recall
Andrew Rosenberg
Queens College / CUNY, USA


 

Paper Identifier: Thu.P9a.06

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

Gaze Patterns in Turn-Taking
Catharine Oertel1,  Marcin Wlodarczak2,  Jens Edlund1,  Petra Wagner2,  Joakim Gustafson1
1KTH, Sweden, 2Bielefeld University, Germany


 

Paper Identifier: Thu.P9a.07

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

The 'Audio-Visual Face Cover Corpus': Investigations into audio-visual speech and speaker recognition when the speaker's face is occluded by facewear
Natalie Fecher
University of York, United Kingdom


 

Paper Identifier: Thu.P9a.08

Thursday

10:00 - 12:00  Exhibition Hall

 

Paralinguistics III

 

 

A Case Study: Detecting Counselor Reflections in Psychotherapy for Addictions using Linguistic Features
Dogan Can1,  Panayiotis G. Georgiou1,  David C. Atkins2,  Shrikanth S. Narayanan1
1USC, USA, 2UW, USA


 

Paper Identifier: Thu.P9b.01

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

Speaker Clustering for a Mixture of Singing and Reading
Mahnoosh Mehrabani and John H. L. Hansen
Center for Robust Speech Systems, University of Texas at Dallas, USA


 

Paper Identifier: Thu.P9b.02

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

Automatic Speech Segmentation Using Probabilistic Latent Component Modeling
Sayan Ghosh and T.V. Sreenivas
Dept. of ECE, Indian Institute of Science, India


 

Paper Identifier: Thu.P9b.03

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

Overlapping Sound Event Recognition using Local Spectrogram Features with the Generalised Hough Transform
Jonathan Dennis1,  Huy Dat Tran1,  Eng Siong Chng2
1Institute for Infocomm Research, A*STAR, Singapore, 2Nanyang Technological University, Singapore


 

Paper Identifier: Thu.P9b.04

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

Automatic Phoneme Segmentation Using Auditory Attention Features
Ozlem Kalinli
Sony Computer Entertainment America, US R&D, USA


 

Paper Identifier: Thu.P9b.05

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

A Non-Uniform Filterbank for Speaker Recognition
Jia Min Karen Kua,  Tharmarajah Thiruvaran,  Eliathamby Ambikairajah
The University of New South Wales, Australia


 

Paper Identifier: Thu.P9b.06

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization
Jaime Lorenzo1,  Beatriz Martínez1,  Roberto Barra-Chicote1,  Verónica López-Ludeña1,  Javier Ferreiros1,  Junichi Yamagishi2,  Juan M Montero1
1GTH-ETSIT-UPM, Spain, 2CSTR-Univ. Edinburgh, UK


 

Paper Identifier: Thu.P9b.07

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

KNNDIST: A Non-Parametric Distance Measure for Speaker Segmentation
Seyed Hamidreza Mohammadi1,  Hossein Sameti2,  Mahsa Sadat Elyasi Langarani2,  Amirhossein Tavanaei2
1OHSU, USA, 2Sharif Uni. Tech., Iran


 

Paper Identifier: Thu.P9b.08

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

Lexical Story Co-Segmentation of Chinese Broadcast News
Wei Feng1,  Xuecheng Nie1,  Liang Wan2,  Lei Xie3,  Jianmin Jiang1
1School of Computer Science and Technology, Tianjin University, Tianjin, China, 2School of Computer Software, Tianjin University, Tianjin, China, 3School of Computer Science, Northwestern Polytechnical University, Xi'an, China


 

Paper Identifier: Thu.P9b.09

Thursday

10:00 - 12:00  Exhibition Hall

 

Speech and Speaker Segmentation

 

 

Toward an Optimum Feature Set and HMM Model Parameters for Automatic Phonetic Alignment of Spontaneous Speech
Montri Karnjanadecha1 and Stephen Zahorian2
1Prince of Songkla University, Thailand, 2Binghamton University, USA


 

Paper Identifier: Thu.P9c.01 has been moved to Mon.P1d.10


 

Paper Identifier: Thu.P9c.02

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Automatic Error Recovery for Pronunciation Dictionaries
Tim Schlippe,  Sebastian Ochs,  Ngoc Thang Vu,  Tanja Schultz
Karlsruhe Institute of Technology (KIT), Germany


 

Paper Identifier: Thu.P9c.03

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Confidence measure for speech indexing based on Latent Dirichlet Allocation
Grégory Senay and Georges Linarès
LIA - University of Avignon


 

Paper Identifier: Thu.P9c.04

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Mixed probabilistic and deterministic dependency parsing
Christophe Cerisara and Alejandra Lorenzo
LORIA/CNRS, France


 

Paper Identifier: Thu.P9c.05

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Automatic Vocabulary Adaptation Based on Semantic Similarity and Speech Recognition Confidence Measure
Shoko Yamahata1,  Yoshikazu Yamaguchi1,  Atsunori Ogawa2,  Hirokazu Masataki1,  Osamu Yoshioka1,  Satoshi Takahashi1
1NTT Cyber Space Laboratories, Japan, 2NTT Communication Science Laboratories, Japan


 

Paper Identifier: Thu.P9c.06

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Towards Empirical Dialog-State Modeling and its Use in Language Modeling
Nigel Ward and Alejandro Vega
UTEP, USA


 

Paper Identifier: Thu.P9c.07

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Evaluation of Many-to-Many Alignment Algorithm by Automatic Pronunciation Annotation Using Web Text Mining
Keigo Kubo,  Hiromichi Kawanami,  Hiroshi Saruwatari,  Kiyohiro Shikano
Graduate School of Information Science, Nara Institute of Science and Technology, Japan


 

Paper Identifier: Thu.P9c.08

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Applying multiview learning algorithms to human-human conversation classification
Sokol Koço,  Cécile Capponi,  Frédéric Béchet
LIF Marseille, France


 

Paper Identifier: Thu.P9c.09

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts
Yuya Akita,  Makoto Watanabe,  Tatsuya Kawahara
Kyoto University, Japan


 

Paper Identifier: Thu.P9c.10

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Normalization of Text Messages Using Character- and Phone-based Machine Translation Approaches
Chen Li and Yang Liu
United States


 

Paper Identifier: Thu.P9c.11

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

A Weighted Combination of Speech with Text-based Models for Arabic Diacritization
Aisha S. Azim,  Xiaoxuan Wang,  Sim Khe Chai
Singapore


 

Paper Identifier: Thu.P9c.12

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Understanding

 

 

Using Sub-word-level Information for Confidence Estimation with Conditional Random Field Models
Matthew Stephen Seigel and Phillip Charles Woodland
Cambridge University Engineering Department, United Kingdom


 

Paper Identifier: Thu.P9d.01

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Supervised Spoken Document Summarization jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine
Hung-yi Lee,  Yu-yu Chou,  Yow-Bang Wang,  Lin-shan Lee
National Taiwan University, Taiwan


 

Paper Identifier: Thu.P9d.02

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Integrating Intra-Speaker Topic Modeling and Temporal-Based Inter-Speaker Topic Modeling in Random Walk for Improved Multi-Party Meeting Summarization
Yun-Nung Chen and Florian Metze
Carnegie Mellon University, U.S.A.


 

Paper Identifier: Thu.P9d.03

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Language Modeling for Voice-Enabled Social TV Using Tweets
Junlan Feng and Bernard Renger
AT&T Labs - Research


 

Paper Identifier: Thu.P9d.04

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Detecting OOV Named-Entities in Conversational Speech
Rohit Kumar,  Rohit Prasad,  Sankaranarayanan Ananthakrishnan,  Aravind Namandi Vembu,  Dave Stallard,  Stavros Tsakalidis,  Prem Natarajan
Raytheon BBN Technologies, USA


 

Paper Identifier: Thu.P9d.05

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Unsupervised Deep Belief Features for Speech Translation
Sameer Maskey and Bowen Zhou
IBM


 

Paper Identifier: Thu.P9d.06

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

EuskoParl: a speech and text Spanish-Basque parallel corpus
Alicia Pérez,  José M. Alcaide,  María Inés Torres
Universidad del País Vasco UPV/EHU - Spain


 

Paper Identifier: Thu.P9d.07

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Comparing transcription agreement on non-native English speech corpus between native and non-native annotators
Hyuksu Ryu,  Sunhee Kim,  Minhwa Chung
Seoul National University, South Korea


 

Paper Identifier: Thu.P9d.08

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

PodCastle: Collaborative Training of Language Models on the Basis of Wisdom of Crowds
Jun Ogata and Masataka Goto
National Institute of Advanced Industrial Science and Technology (AIST), Japan


 

Paper Identifier: Thu.P9d.09

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation Analysis
Lei Xie1,  Yinqing Xu1,  Lilei Zheng1,  Qiang Huang2,  Bingfeng Li1
1Northwestern Polytechnical University, China, 2University of East Anglia, UK


 

Paper Identifier: Thu.P9d.10

Thursday

10:00 - 12:00  Exhibition Hall

 

Spoken Language Applications

 

 

Power Mean Pyramid Scores for Summarization Evaluation
Sameer Maskey1 and Andrew Rosenberg2
1USA, 2CUNY


 

Paper Identifier: Thu.O10a.01

Thursday

13:30 - 13:50  Grand Ballroom I

 

Spoken Term and Unseen Word Detection

 

 

A Novel Confidence Measure Based on Context Consistency for Spoken Term Detection
Haiyang Li,  Jiqing Han,  Tieran Zheng,  Guibin Zheng
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China


 

Paper Identifier: Thu.O10a.02

Thursday

13:50 - 14:10  Grand Ballroom I

 

Spoken Term and Unseen Word Detection

 

 

Discriminatively trained phoneme confusion model for keyword spotting
Panagiota Karanasou1,  Lukas Burget2,  Dimitra Vergyri2,  Murat Akbacak2,  Arindam Mandal2
1LIMSI/CNRS, Universite Paris-Sud, BP133, 91403 Orsay Cedex, France, 2Speech Technology and Research Laboratory, SRI International, Menlo Park, CA, USA


 

Paper Identifier: Thu.O10a.03

Thursday

14:10 - 14:30  Grand Ballroom I

 

Spoken Term and Unseen Word Detection

 

 

Inverting the Point Process Model for Fast Phonetic Keyword Search
Keith Kintzley1,  Aren Jansen1,  Kenneth Church2,  Hynek Hermansky1
1Johns Hopkins University, 2IBM Research


 

Paper Identifier: Thu.O10a.04

Thursday

14:30 - 14:50  Grand Ballroom I

 

Spoken Term and Unseen Word Detection

 

 

Exploiting Discriminative Point Process Models for Spoken Term Detection
Atta Norouzian1,  Aren Jansen2,  Richard Rose1,  Samuel Thomas2
1McGill University, Canada, 2Johns Hopkins University, US


 

Paper Identifier: Thu.O10a.05

Thursday

14:50 - 15:10  Grand Ballroom I

 

Spoken Term and Unseen Word Detection

 

 

Subword speech recognition for detection of unseen words
Ivan Bulyko,  Jose Herrero,  Chris Mihelich,  Owen Kimball
Raytheon BBN Technologies


 

Paper Identifier: Thu.O10a.06

Thursday

15:10 - 15:30  Grand Ballroom I

 

Spoken Term and Unseen Word Detection

 

 

OOV Word Detection using Hybrid Models with Mixed Types of Fragments
Long Qin and Alexander Rudnicky
Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA


 

Paper Identifier: Thu.O10b.01

Thursday

13:30 - 13:50  Grand Ballroom II

 

Voice Search and Spoken Document Retrieval II

 

 

A Conversational Movie Search System Based on Conditional Random Fields
Jingjing Liu,  Scott Cyphers,  Panupong Pasupat,  Ian McGraw,  Jim Glass
MIT


 

Paper Identifier: Thu.O10b.02

Thursday

13:50 - 14:10  Grand Ballroom II

 

Voice Search and Spoken Document Retrieval II

 

 

Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision Process
Tsung Hsien Wen,  Hung Yi Lee,  Lin Shan Lee
National Taiwan University, Taiwan


 

Paper Identifier: Thu.O10b.03

Thursday

14:10 - 14:30  Grand Ballroom II

 

Voice Search and Spoken Document Retrieval II

 

 

Voice Query Refinement
Cyril Allauzen,  Edward Benson,  Ciprian Chelba,  Michael Riley,  Johan Schalkwyk
USA


 

Paper Identifier: Thu.O10b.04

Thursday

14:30 - 14:50  Grand Ballroom II

 

Voice Search and Spoken Document Retrieval II

 

 

Indexing Raw Acoustic Features for Scalable Zero Resource Search
Aren Jansen and Benjamin Van Durme
Johns Hopkins University, USA


 

Paper Identifier: Thu.O10b.05

Thursday

14:50 - 15:10  Grand Ballroom II

 

Voice Search and Spoken Document Retrieval II

 

 

Lexical-phonetic automata for spoken utterance indexing and retrieval
Julien Fayolle1,  Murat Saraclar2,  Fabienne Moreau1,  Christian Raymond1,  Guillaume Gravier1
1IRISA, Rennes, France, 2Bogazici University, Istanbul, Turkey


 

Paper Identifier: Thu.O10b.06

Thursday

15:10 - 15:30  Grand Ballroom II

 

Voice Search and Spoken Document Retrieval II

 

 

Automating Crowd-supervised Learning for Spoken Language Systems
Ian McGraw,  Scott Cyphers,  Panupong Pasupat,  Jingjing Liu,  Jim Glass
MIT, USA


 

Paper Identifier: Thu.O10c.01

Thursday

13:30 - 13:50  Pavilion East

 

Speech and Age Differences

 

 

An Automatic Child-Directed Speech Detector for the Study of Child Language Development
Soroush Vosoughi and Deb Roy
USA


 

Paper Identifier: Thu.O10c.02

Thursday

13:50 - 14:10  Pavilion East

 

Speech and Age Differences

 

 

Aligning manifolds to model the earliest phonological abstraction in infant-caretaker vocal imitation
Andrew Plummer
The Ohio State University, United States


 

Paper Identifier: Thu.O10c.03

Thursday

14:10 - 14:30  Pavilion East

 

Speech and Age Differences

 

 

The F0 fall delay of lexical pitch accent in Japanese Infant-directed speech
Yoko Saikachi1,  Mafuyu Kitahara2,  Ken’ya Nishikawa1,  Ai Kanato1,  Reiko Mazuka3
1Laboratory for Language Development, Brain Science Institute, RIKEN, Japan, 2School of Law, Waseda University, Japan, 3Department of Psychology and Neuroscience, Duke University, USA


 

Paper Identifier: Thu.O10c.04

Thursday

14:30 - 14:50  Pavilion East

 

Speech and Age Differences

 

 

Children’s Productions of Multi-Syllabic Lexical Stress Patterns in Different Prosodic Positions
Irina Shport
Linguistics Department, University of Oregon, USA


 

Paper Identifier: Thu.O10c.05

Thursday

14:50 - 15:10  Pavilion East

 

Speech and Age Differences

 

 

Prosodic Marking of Continuation versus Completion in Children’s Narratives
Melissa Redford1,  Laura Dilley2,  Jessica Gamache2,  Elizabeth Wieland2
1University of Oregon, USA, 2Michigan State University, USA


 

Paper Identifier: Thu.O10c.06

Thursday

15:10 - 15:30  Pavilion East

 

Speech and Age Differences

 

 

Judging temporal onset differences for concurrent vowels: Results for young, middle-aged, and older adults
Daniel Fogerty1,  Diane Kewley-Port2,  Larry Humes2
1University of South Carolina, USA, 2Indiana University, USA


 

Paper Identifier: Thu.O10d.01

Thursday

13:30 - 13:50  Pavilion West

 

Acoustic Classification

 

 

Combining frame and segment based models for environmental sound classification
Pengfei Hu,  Wenju Liu,  Wei Jiang
Institute of Automation, Chinese Academy of Sciences, China


 

Paper Identifier: Thu.O10d.02

Thursday

13:50 - 14:10  Pavilion West

 

Acoustic Classification

 

 

Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition
Yi Ren Leng and Huy Dat Tran
Institute for Infocomm Research, A*STAR, Singapore


 

Paper Identifier: Thu.O10d.03

Thursday

14:10 - 14:30  Pavilion West

 

Acoustic Classification

 

 

Goal-Oriented Auditory Scene Recognition
Kailash Patil and Mounya Elhilali
Johns Hopkins University, USA


 

Paper Identifier: Thu.O10d.04

Thursday

14:30 - 14:50  Pavilion West

 

Acoustic Classification

 

 

Prof-Life-Log: Audio Environment Detection for Naturalistic Audio Streams
Ali Ziaei,  Abhijeet Sangwan,  John Hansen
CRSS, UTD, USA


 

Paper Identifier: Thu.O10d.05

Thursday

14:50 - 15:10  Pavilion West

 

Acoustic Classification

 

 

Pooling Robust Shift-Invariant Sparse Representations of Acoustic Signals
Po-Sen Huang,  Jianchao Yang,  Mark Hasegawa-Johnson,  Feng Liang,  Thomas S. Huang
University of Illinois at Urbana-Champaign, USA


 

Paper Identifier: Thu.O10d.06

Thursday

15:10 - 15:30  Pavilion West

 

Acoustic Classification

 

 

Evaluation of a Sparse Representation-Based Classifier For Bird Phrase Classification Under Limited Data Conditions
Lee Ngee Tan,  Kantapon Kaewtip,  Martin Cody,  Charles Taylor,  Abeer Alwan
UCLA, USA


 

Paper Identifier: Thu.SS10.01

Thursday

13:30 - 13:45  Galleria

 

New Trends in Vowel Nasalization: The Articulation of Nasal Vowels

 

 

Nasality from Moroccan Arabic Nasal and Pharyngeal Consonants: Patterns of Airflow and Nasalance
Georgia Zellou
University of Colorado, Boulder, USA


 

Paper Identifier: Thu.SS10.02

Thursday

13:45 - 14:00  Galleria

 

New Trends in Vowel Nasalization: The Articulation of Nasal Vowels

 

 

Inter-gestural timing in French nasal vowels: A comparative study of (Liège, Tournai) Northern French vs. (Marseille, Toulouse) Southern French
Véronique Delvaux,  Kathy Huet,  Myriam Piccaluga,  Bernard Harmegnies
UMons, Belgium


 

Paper Identifier: Thu.SS10.03

Thursday

14:00 - 14:15  Galleria

 

New Trends in Vowel Nasalization: The Articulation of Nasal Vowels

 

 

Nasal Coarticulation and Contrastive Stress
Georgia Zellou and Rebecca Scarborough
University of Colorado, Boulder


 

Paper Identifier: Thu.SS10.04

Thursday

14:15 - 14:30  Galleria

 

New Trends in Vowel Nasalization: The Articulation of Nasal Vowels

 

 

An MRI study of the oral articulation of European Portuguese nasal vowels
Catarina Oliveira,  Paula Martins,  Samuel Silva,  António Teixeira
University of Aveiro, Portugal


 

Paper Identifier: Thu.SS10.05

Thursday

14:30 - 14:45  Galleria

 

New Trends in Vowel Nasalization: The Articulation of Nasal Vowels

 

 

Acoustic and Perceptual Similarity in Coarticulatorily Nasalized Vowels
Rebecca Scarborough and Georgia Zellou
University of Colorado at Boulder, USA


 

Paper Identifier: Thu.SS10.06

Thursday

14:45 - 15:00  Galleria

 

New Trends in Vowel Nasalization: The Articulation of Nasal Vowels

 

 

Articulatory differences between oral and nasal vowels based on the simulation of a speaker-adaptive articulatory model
Panying Rong,  Ryan Shosted,  David Kuehn
University of Illinois, Urbana-Champaign


 

Paper Identifier: Thu.P10a.01

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Improving WFST-based G2P Conversion with Alignment Constraints and RNNLM N-best Rescoring
Josef Novak1,  Nobuaki Minematsu1,  Keikichi Hirose1,  Chiori Hori2,  Hideki Kashioka2,  Paul Dixon2
1The University of Tokyo, Japan, 2National Institute of Communication Technology, Kyoto, Japan


 

Paper Identifier: Thu.P10a.02

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Expand CRF to Model Long Distance Dependencies in Prosodic Break Prediction
Jian Luan
Microsoft (China) Corp., China


 

Paper Identifier: Thu.P10a.03

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Perceptual Foundations for Naturalistic Variability in the Prosody of Synthetic Speech
Nanette Veilleux1,  Jonathan Barnes2,  Alejna Brugos2,  Stefanie Shattuck-Hufnagel3
1Simmons College, USA, 2Boston University, USA, 3Massachusetts Institute of Technology, USA


 

Paper Identifier: Thu.P10a.04

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Comparison of Grapheme-to-Phoneme Methods on Large Pronunciation Dictionaries and LVCSR Tasks
Stefan Hahn1,  Paul Vozila2,  Maximilian Bisani2
1RWTH Aachen University, Germany, 2Nuance Communications, Inc., USA


 

Paper Identifier: Thu.P10a.05

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

A Simple Hybrid Acoustic / Morphologically-Constrained Technique for the Synthesis of Stop Consonants in Various Vocalic Contexts
Frederic Berthommier,  Laurent Girin,  Louis-Jean Boe
GIPSA-lab, France


 

Paper Identifier: Thu.P10a.06

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

The IIIT-H Indic Speech Databases
Kishore Prahallad1,  Naresh Kumar1,  Venkatesh Keri1,  Rajendran S1,  Alan W Black2
1International Institute of Information Technology, Hyderabad, India, 2Carnegie Mellon University, Pittsburgh, USA


 

Paper Identifier: Thu.P10a.07

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Detecting Acronyms from Capital Letter Sequences in Spanish
Rubén San-Segundo1,  Juan M. Montero1,  Verónica López-Ludeña1,  Simon King2
1Speech Technology Group, E.T.S.I. Telecomunicación. UPM., 2Centre for Speech Technology Research, University of Edinburgh, UK.


 

Paper Identifier: Thu.P10a.08

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Hidden Conditional Random Fields with M-to-N Alignments for Grapheme-to-Phoneme Conversion
Patrick Lehnen,  Stefan Hahn,  Vlad-Andrei Guta,  Hermann Ney
RWTH Aachen University, Germany


 

Paper Identifier: Thu.P10a.09

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Phrase Boundary Assignment from Text in Multiple Domains
Andrew Rosenberg1,  Raul Fernandez2,  Bhuvana Ramabhadran2
1Queens College / CUNY, 2IBM Research


 

Paper Identifier: Thu.P10a.10

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Improved Prediction of Japanese Word Accent Sandhi Using CRF
Nobuaki Minematsu,  Shumpei Kobayashi,  Shinya Shimizu,  Keikichi Hirose
The University of Tokyo, Japan


 

Paper Identifier: Thu.P10a.11

Thursday

13:30 - 15:30  Exhibition Hall

 

Speech Synthesis: Selected Topics

 

 

Articulatory VCV Synthesis from EMA Data
Asterios Toutios and Shinji Maeda
CNRS LTCI; TELECOM ParisTech, Paris, France


 

Paper Identifier: Thu.P10b.01

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Are Sparse Representations Rich Enough for Acoustic Modeling?
Oriol Vinyals1 and Li Deng2
1UC Berkeley, US, 2Microsoft Research, US


 

Paper Identifier: Thu.P10b.02

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

An Initial Attempt on Task-Specific Adaptation for Deep Neural Network-based Large Vocabulary Continuous Speech Recognition
Yeming Xiao,  Zhen Zhang,  Shang Cai,  Jielin Pan,  Yonghong Yan
Institute of Acoustics, Chinese Academy of Sciences, China


 

Paper Identifier: Thu.P10b.03

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition
Navdeep Jaitly1,  Patrick Nguyen2,  Andrew Senior2,  Vincent Vanhoucke2
1University of Toronto, Canada, 2Google Inc, USA


 

Paper Identifier: Thu.P10b.04

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Cross-Lingual and Ensemble MLPs Strategies for Low-Resource Speech Recognition
Yanmin Qian and Jia Liu
Tsinghua University, China


 

Paper Identifier: Thu.P10b.05

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Initialization Schemes for Multilayer Perceptron Training and their Impact on ASR Performance using Multilingual Data
Ngoc Thang Vu1,  Wojtek Breiter1,  Florian Metze2,  Tanja Schultz1
1Karlsruhe Institute of Technology, Germany, 2Carnegie Mellon University, USA


 

Paper Identifier: Thu.P10b.06

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Hermitian based Hidden Activation Functions for Adaptation of Hybrid HMM/ANN Models
Sabato Marco Siniscalchi1,  Jinyu Li2,  Chin-Hui Lee3
1University of Enna Kore, Italy, 2Microsoft Corporation, USA, 3School of Electrical and Computer Engineering, Georgia Institute of Technology, USA


 

Paper Identifier: Thu.P10b.07

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Integrating Deep Neural Networks into Structural Classification Approach based on Weighted Finite-State Transducers
Yotaro Kubo,  Takaaki Hori,  Atsushi Nakamura
NTT Corporation


 

Paper Identifier: Thu.P10b.08

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Parallel Training for Deep Stacking Networks
Li Deng1,  Brian Hutchinson2,  Dong Yu1
1MSR, USA, 2U. Washington, USA


 

Paper Identifier: Thu.P10b.09

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Articulatory Feature based Multilingual MLPs for Low-Resource Speech Recognition
Yanmin Qian and Jia Liu
Tsinghua University, China


 

Paper Identifier: Thu.P10b.10

Thursday

13:30 - 15:30  Exhibition Hall

 

ASR: Deep Neural Networks II

 

 

Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR
Ramón Fernandez Astudillo1,  Alberto Abad1,  João Paulo Neto2
1INESC-ID Lisboa, Portugal, 2INESC-ID Lisboa/IST, Portugal


 

Paper Identifier: Thu.P10c.01

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Complementary Phone Error Training
Frank Diehl and P.C. Woodland
University of Cambridge, UK


 

Paper Identifier: Thu.P10c.02

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Posterior-Scaled MPE: Novel Discriminative Training Criteria
Markus Nussbaum-Thom1,  Zoltan Tüske2,  Georg Heigold3,  Ralf Schlüter1,  Hermann Ney1
1Computer Science Dept. 6, RWTH Aachen University, Aachen, Germany, 2Computer Science Dept. 6, RWTH Aachen University, Aachen, Germany, 3Google Research, Mountain View, CA, USA


 

Paper Identifier: Thu.P10c.03

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Improve the Implementation of Pitch Features for Mandarin Digit String Recognition Task
Pei Ding and Liqiang He
Toshiba (China) Research and Development Center, China


 

Paper Identifier: Thu.P10c.04

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Exploring Joint Equalization of Spatial-Temporal Contextual Statistics of Speech Features for Robust Speech Recognition
Hsin-Ju Hsieh1,  Jeih-weih Hung2,  Berlin Chen1
1National Taiwan Normal University, 2National Chi Nan University


 

Paper Identifier: Thu.P10c.05

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Speaker-Dependent Voice Activity Detection Robust to Background Speech Noise
Shigeki Matsuda1,  Naoya Ito2,  Kosuke Tsujino3,  Hideki Kashioka1,  Shigeki Sagayama2
1NICT, Japan, 2University of Tokyo, Japan, 3NTT DOCOMO, Japan


 

Paper Identifier: Thu.P10c.06

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition
Jose A. Gonzalez1,  Antonio M. Peinado1,  Angel M. Gomez1,  Ning Ma2
1University of Granada, Spain, 2The University of Sheffield, UK


 

Paper Identifier: Thu.P10c.07

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Decoding of Uncertain Features Using the Posterior Distribution of the Clean Data for Robust Speech Recognition
Ahmed Hussen Abdelaziz and Dorothea Kolossa
Germany


 

Paper Identifier: Thu.P10c.08

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition
Ning Ma and Jon Barker
University of Sheffield, United Kingdom


 

Paper Identifier: Thu.P10c.09

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Integrating Stress Information in Large Vocabulary Continuous Speech Recognition
Bogdan Ludusan,  Stefan Ziegler,  Guillaume Gravier
CNRS-IRISA, Rennes, France


 

Paper Identifier: Thu.P10c.10

Thursday

13:30 - 15:30  Exhibition Hall

 

Robust Speech Recognition II

 

 

Group Sparse Hidden Markov Models for Speech Recognition
Jen-Tzung Chien and Cheng-Chun Chiang
National Cheng Kung University, Taiwan


 

Paper Identifier: Thu.P10d.01

Thursday

13:30 - 15:30  Exhibition Hall

 

Speaker Recognition III

 

 

Unsupervised Speaker Identification using Overlaid Texts in TV Broadcast
Johann Poignant1,  Hervé Bredin2,  Viet-Bac Le3,  Laurent Besacier1,  Claude Barras2,  Georges Quénot1
1LIG, France, 2LIMSI, France, 3Vocapia, France


 

Paper Identifier: Thu.P10d.02

Thursday

13:30 - 15:30  Exhibition Hall

 

Speaker Recognition III

 

 

Mask Estimation and Refinement for MFT-based Robust Speaker Verification
Yali Zhao,  Lei Xie,  Zhonghua Fu
Shaanxi Provincial Key Laboratory of Speech and Image Information Processing, School of Computer Science, Northwestern Polytechnical University, China


 

Paper Identifier: Thu.P10d.03

Thursday

13:30 - 15:30  Exhibition Hall

 

Speaker Recognition III

 

 

Sparse Probabilistic Linear Discriminant Analysis for Speaker Verification
Hai Yang,  Chunyan Liang,  Yunfei Xu,  Lin Yang,  Yonghong Yan
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, China


 

Paper Identifier: Thu.P10d.04

Thursday

13:30 - 15:30  Exhibition Hall

 

Speaker Recognition III

 

 

Study of the Effect of I-vector Modeling on Short and Mismatch Utterance Duration for Speaker Verification
Achintya Kumar Sarkar,  D. Matrouf,  P. M. Bousquet,  J. F. Bonastre
LIA, Universite D'Avignon, France


 

Paper Identifier: Thu.P10d.05

Thursday

13:30 - 15:30  Exhibition Hall

 

Speaker Recognition III

 

 

Ensemble Classifiers Using Unsupervised Data Selection for Speaker Recognition
Chien-Lin Huang1,  Chiori Hori1,  Hideki Kashioka1,  Bin Ma2
1National Institute of Information and Communications Technology, Japan, 2Institute for Infocomm Research, Singapore


 

Paper Identifier: Thu.P10d.06

Thursday

13:30 - 15:30  Exhibition Hall

 

Speaker Recognition III

 

 

A method of speaker identification based on phoneme mean F-ratio contribution
Songgun Hyon,  Hongcui Wang,  Chen Zhao,  Jianguo Wei,  Jianwu Dang
School of Computer Science, Tianjin University, China


 

Paper Identifier: Thu.P10d.07

Thursday

13:30 - 15:30  Exhibition Hall

 

Speaker Recognition III

 

 

Mitigating Effects of Recording Condition Mismatch in Speaker Recognition Using Partial Least Squares
Jeremiah Remus,  Jenniffer Estrada,  Stephanie Schuckers
Clarkson University, USA
