Important Dates

  • April 1, 2012
    Full Paper Submission Deadline
  • June 8, 2012
    Notification of Paper Acceptance
  • June 16, 2012
    Grant Application Deadline
  • June 22, 2012
    Camera-ready Paper Due
  • June 30, 2012
    Early Registration Deadline
    Deadline for Presenters to Register
  • August 8, 2012
    Hotel and Standard Registration Deadline

Privacy-Preserving Speech Processing

Abstract

Speech is one of the most private forms of personal communication, yet current speech processing techniques are not designed to preserve speaker privacy and require complete access to the speech data. In this tutorial we study privacy-preserving techniques for speech processing applications, focusing on speaker verification and identification. A speaker verification system uses the speech input to authenticate the user. We discuss privacy-preserving speaker verification, where the system is able to perform authentication without observing the speech input provided by the user, and the user does not observe the speech models used by the system. These privacy criteria are important in order to prevent an adversary with unauthorized access to the user's client device or the system data from impersonating the user in another system. We develop two privacy-preserving algorithms for speaker verification. First, we use Gaussian mixture models (GMMs) and create a homomorphic-encryption-based protocol to evaluate GMMs over private data. Second, we apply locality-sensitive hashing (LSH) and one-way cryptographic functions to reduce the speaker verification problem to private string comparison.
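To make the second approach more concrete, the following is a minimal sketch of how LSH followed by a salted one-way hash can turn verification into string comparison. It is not the tutorial's actual protocol: the random-hyperplane LSH family, the use of SHA-256, the salt handling, and every function name (make_hyperplanes, lsh_key, enroll, verify) are assumptions chosen for illustration, and the input vector stands in for whatever fixed-length representation of an utterance the system uses.

```python
import hashlib
import random


def make_hyperplanes(dim, n_bits, seed):
    """Draw n_bits random hyperplanes (assumed LSH family: sign of a random projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vec, planes):
    """Map a vector to a bit string; nearby vectors tend to share the same key."""
    bits = []
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def salted_hash(text, salt):
    """One-way hash of an LSH key; the salt (bytes) is assumed to be user-specific."""
    return hashlib.sha256(salt + text.encode("utf-8")).hexdigest()


def enroll(vec, tables, salt):
    """Hash the LSH key from every table; only these digests would be stored."""
    return {salted_hash(f"{i}:{lsh_key(vec, planes)}", salt)
            for i, planes in enumerate(tables)}


def verify(vec, tables, salt, enrolled, min_matches=1):
    """Accept if enough digests of the probe equal enrolled digests (string comparison)."""
    probe = enroll(vec, tables, salt)
    return len(probe & enrolled) >= min_matches


# Hypothetical usage: 4 LSH tables of 8 bits each over 5-dimensional vectors.
tables = [make_hyperplanes(dim=5, n_bits=8, seed=s) for s in range(4)]
salt = b"example-user-salt"
enrolled_vec = [0.9, -0.2, 0.4, 1.3, -0.7]
stored = enroll(enrolled_vec, tables, salt)
```

In this sketch the verifier only ever handles digests, so it compares strings rather than speech features. The number of tables, the number of bits per key, and the min_matches threshold trade off false accepts against false rejects and would need to be tuned on real data.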

 

Speaker identification is a related problem in which we are interested in identifying, among a given set of speakers, the one that best corresponds to a given speech sample. This task finds use in surveillance, where a security agency such as the police has access to speaker models for individuals, e.g., a set of criminals it is interested in monitoring, and an independent party such as a phone company has access to the phone conversations. The agency is interested in identifying the speaker participating in a given phone conversation among its set of speakers. The agency can demand the complete recording from the phone company if it has a warrant for that person. By using a privacy-preserving speaker identification system, the phone company can guarantee to its subscribers that the agency will not be able to obtain any phone conversation of speakers who are not under surveillance. Similarly, the agency does not need to send the list of speakers under surveillance to the phone company. Speaker identification can be considered an extension of speaker verification to the multiclass setting. We extend the GMM-based and LSH-based approaches to create analogous privacy-preserving speaker identification frameworks.
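The relationship between verification and identification can be illustrated with a small non-private sketch. It assumes diagonal-covariance GMMs, a likelihood-ratio test against a background model for verification, and an argmax over speaker models for identification; these modelling choices, the toy one-dimensional data, and all names are assumptions for illustration rather than details taken from the tutorial, and no privacy mechanism is applied here.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_loglik(frames, gmm):
    """Total log-likelihood of frames under a GMM given as [(weight, mean, var), ...]."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total


def verify(frames, speaker_gmm, background_gmm, threshold=0.0):
    """Two-class decision: claimed speaker versus a background model (assumed setup)."""
    llr = gmm_loglik(frames, speaker_gmm) - gmm_loglik(frames, background_gmm)
    return llr > threshold


def identify(frames, speaker_gmms):
    """Multiclass decision: the enrolled speaker whose model scores highest."""
    return max(speaker_gmms, key=lambda name: gmm_loglik(frames, speaker_gmms[name]))


# Hypothetical one-dimensional models and frames.
speakers = {
    "alice": [(1.0, [0.0], [1.0])],
    "bob": [(0.5, [5.0], [1.0]), (0.5, [6.0], [1.0])],
}
background = [(1.0, [2.5], [9.0])]
frames = [[5.2], [5.9], [4.8]]
```

Verification compares one speaker model against an alternative, while identification runs the same scoring once per enrolled speaker and picks the best, which is why a privacy-preserving scoring primitive can in principle serve both tasks.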

 

As this is an emerging field that has not yet been exhaustively studied, we will have the opportunity to cover its theory from first principles. As a result, the target audience of this tutorial can be very wide, and participants are not expected to have any prior experience in this area. The target participants are speech researchers without any background in privacy. We hope that this tutorial will enable these researchers to develop privacy-preserving variants of their speech processing algorithms, and help foster cross-pollination between these two research areas.

 

Outline

Introduction

  • Motivations
  • Overview of the tutorial

Speech Processing Background

  • Speech-Processing Basics
    • Gaussian Mixture Models & Hidden Markov Models
    • Supervector Framework
    • Algorithms for Speaker Verification
    • Algorithms for Speaker Identification

Privacy Background

  • What is Privacy?
  • Adversarial Models
  • Tools
    • Homomorphic Encryption
    • Cryptographic Hash Functions
  • Fundamental Protocols
    • Private Inner Product
    • Private Comparison
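As a rough illustration of how homomorphic encryption supports a private inner product, the sketch below uses a toy Paillier cryptosystem, which is additively homomorphic. The choice of Paillier, the tiny hard-coded primes, the restriction to non-negative integers, and all function names are assumptions made for readability; this is neither secure nor necessarily the construction covered in the tutorial.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy Paillier keys from two small primes (far too small for real use)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse, Python 3.8+
    return (n, n + 1), (phi, mu)  # public (n, g), private (phi, mu)


def encrypt(pub, m, rng):
    """Enc(m) = g^m * r^n mod n^2 for a random r coprime to n."""
    n, g = pub
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m from a ciphertext using L(u) = (u - 1) / n."""
    n, _ = pub
    phi, mu = priv
    u = pow(c, phi, n * n)
    return ((u - 1) // n * mu) % n


def encrypted_inner_product(pub, enc_x, w):
    """Computed by the party holding w: prod Enc(x_i)^w_i = Enc(sum w_i * x_i)."""
    n, _ = pub
    n2 = n * n
    acc = 1  # trivial encryption of 0; a real protocol would re-randomize
    for c, wi in zip(enc_x, w):
        acc = (acc * pow(c, wi, n2)) % n2
    return acc


def private_inner_product(x, w, seed=0):
    """One party encrypts x, the other combines it with w, the first decrypts."""
    rng = random.Random(seed)
    pub, priv = keygen()
    enc_x = [encrypt(pub, xi, rng) for xi in x]    # key holder's side
    enc_dot = encrypted_inner_product(pub, enc_x, w)  # other party's side
    return decrypt(pub, priv, enc_dot)             # key holder's side
```

For example, private_inner_product([3, 1, 4], [2, 7, 1]) yields 17, the plain inner product, even though the party holding the weights only ever sees ciphertexts of x. Results are only correct while the true inner product stays below n, and real features would first have to be scaled to integers.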

Privacy-Preserving Speaker Verification

  • Problem Overview
  • Privacy Issues
  • Privacy-Preserving Algorithms
    • using Gaussian Mixture Models
    • using Supervectors
  • Experiments

Privacy-Preserving Speaker Identification

  • Problem Overview
  • Privacy Issues
  • Privacy-Preserving Algorithms
    • using Gaussian Mixture Models
    • using Supervectors
  • Experiments

Privacy-Preserving Speech Recognition

  • Problem Overview
  • Privacy Issues
  • Basic Privacy-Preserving Algorithm
    • using Hidden Markov Models
  • Experiments

Conclusions

  • Brief recapitulation
  • Current trends and ideas
  • Privacy in other speech processing tasks

 

Short Bios

Manas Pathak

Manas A. Pathak, Carnegie Mellon University, USA

Manas A. Pathak is a Ph.D. candidate in the Language Technologies Institute at Carnegie Mellon University. He received his M.S. from Carnegie Mellon University in 2009 and his B.Tech in Computer Science from Visvesvaraya National Institute of Technology, Nagpur, India in 2006. He has done internships at PARC, MERL, and IBM Research, has published 10 papers in various conferences and journals, and has 4 patents pending. His research interests lie at the intersection of data privacy, machine learning, and speech processing.

manasp@cs.cmu.edu
http://www.cs.cmu.edu/~manasp/

 

Bhiksha Raj

Bhiksha Raj, Carnegie Mellon University, USA

 

Bhiksha Raj is an associate professor and non-tenured faculty chair at Carnegie Mellon University's Language Technologies Institute, and also holds the position of associate professor by courtesy in the Electrical and Computer Engineering department at CMU. Dr. Raj obtained his PhD from CMU in 2000 and was at Mitsubishi Electric Research Laboratories from 2001 to 2008. Dr. Raj's chief research interests lie in robust automatic speech recognition, machine learning, and associated topics. Since 2005 he has also investigated topic models for signal processing, particularly in the context of modelling, enhancing, and modifying speech signals, and has published several papers on the topic.

bhiksha@cs.cmu.edu
http://www.cs.cmu.edu/~bhiksha/

 

 


 

 
