Important Dates

  • April 1, 2012
    Full Paper Submission Deadline
  • June 8, 2012
    Notification of Paper Acceptance
  • June 16, 2012
    Grant Application Deadline
  • June 22, 2012
    Camera-ready Paper Due
  • June 30, 2012
    Early Registration Deadline
    Deadline for Presenters to Register
  • August 8, 2012
    Hotel and Standard Registration Deadline

Domain Adaptation in Machine Learning and Speech Recognition

Most learning algorithms for pattern recognition assume that the training data and test data come from the same distribution. While this assumption enables convenient theoretical analysis and controlled testing of algorithms, it rarely holds in practice. Speech processing systems, for example, must deal with significant variability in speakers, background noise, channel characteristics, and higher-level effects such as drift in genre and conversation topics over time. Inevitably, real-world data exhibits variability that has not been captured in training sets.

The problem of a mismatch between the training and test distributions has also received considerable research attention in other application areas, including text processing, computer vision, and bioinformatics. A plethora of techniques has been proposed; in the machine learning community they are known as domain adaptation, covariate shift, sample selection bias, and transfer learning. In speech processing, many classical techniques can be seen as instances of domain adaptation, e.g. robustness and speaker adaptation in speech recognition, compensation for intersession variability in speaker identification, and adaptation of language models to new domains. While some techniques from speech processing exploit specific characteristics of speech signals, many techniques from the two communities share similar intuitions about how to model and overcome shifts in data distributions.

This tutorial aims to bridge the gap between the speech and machine learning communities, share research results, and foster and inspire collaboration for addressing the challenge of domain adaptation. While the presenters will draw on specific domain adaptation techniques from concrete application areas such as speech, language processing, and computer vision, they will also focus on general frameworks and theoretical underpinnings, giving attendees essential tools and ideas for addressing domain adaptation in the broad sense and for making fundamental contributions to the field.

Outline

  • Introduction
    • Problem setting and motivating examples
    • Relationship between domain adaptation and other learning problems
    • Notation
  • Covariate shift: how to eliminate changes in data distributions by data selection and reweighting
    • Data selection for language model domain adaptation
  • Feature-based approaches: how to infer robust, domain-invariant or shared representations
    • Linear transformations: MLLR and FMLLR, nuisance attribute projection, structural correspondence learning, and canonical correlation analysis
    • Nonlinear projection methods: manifold alignment, maximum mean discrepancy
  • Model-based approaches: how to infer or modify models across domains
    • MAP adaptation, regularization with priors, and model interpolation
    • Bayesian approaches for modeling shared parameters
  • Summary and conclusion
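
The reweighting idea named in the covariate shift item of the outline can be illustrated with a minimal sketch. It is not taken from the tutorial materials. It assumes two one-dimensional Gaussians with known densities as the source (training) and target (test) distributions, and the names SOURCE, TARGET, importance_weights, and weighted_mean are hypothetical. In practice the density ratio is usually unknown and has to be estimated.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, source, target):
    """Weight each source sample by the density ratio p_target(x) / p_source(x)."""
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]


def weighted_mean(values, weights):
    """Self-normalized importance-weighted average."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Assumed toy setup: (mean, std) of the training and test input distributions.
SOURCE = (0.0, 1.0)
TARGET = (1.0, 1.0)

rng = random.Random(0)  # fixed seed so the run is repeatable
source_samples = [rng.gauss(*SOURCE) for _ in range(20000)]
weights = importance_weights(source_samples, SOURCE, TARGET)

# The plain average reflects the source distribution (mean near 0).
unweighted_estimate = sum(source_samples) / len(source_samples)
# The reweighted average approximates the target mean (near 1)
# even though every sample was drawn from the source.
reweighted_estimate = weighted_mean(source_samples, weights)

print("unweighted:", round(unweighted_estimate, 3))
print("reweighted:", round(reweighted_estimate, 3))
```

The same weights could be applied to per-example training losses instead of to a simple average; the sketch uses the mean only to keep the effect of reweighting easy to check.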

Biographies of the Speakers

Fei Sha is an assistant professor in the Department of Computer Science at the University of Southern California. His primary research interests are machine learning and its application to speech and language processing, computer vision, and other areas. He wrote his PhD thesis on large-margin parameter estimation techniques for hidden Markov models. He has also worked extensively on dimensionality reduction. He has won outstanding paper awards at NIPS and ICML.

Brian Kingsbury is a research staff member at the IBM T. J. Watson Research Center, Yorktown Heights, NY. His research interests include large-vocabulary speech transcription, audio indexing and analytics, and information retrieval from speech. He is one of the developers and maintainers of IBM's Attila speech recognition toolkit, and he has contributed to IBM's entries in numerous competitive evaluations of speech technology, including Switchboard, SPINE, EARS, Spoken Term Detection, and GALE.
