Application of Structural Events Detected on ASR Outputs for Automated Speaking Assessment

Lei Chen and Su-Youn Yoon



We have investigated features that reflect utterance structure and disfluency profiles to improve the automated scoring of spontaneous speech responses from non-native speakers of English. Several features derived from structural events (SEs), e.g., clause structure and disfluencies, showed promisingly high correlations with human proficiency scores, both when the SEs were human-annotated and when they were automatically detected on speech transcriptions. However, the usefulness of these SE-derived features on automatic speech recognition (ASR) hypotheses remained unknown. In this paper, we report our studies on detecting SEs from noisy ASR outputs and applying the detected SEs to automated speech scoring. We found that, in the presence of ASR errors, clause boundary (CB) detection was affected much less than detection of interruption points (IPs) of speech disfluencies. We then evaluated several features derived from the detected SEs by considering their correlation with human scores and their relative importance in a linear regression model.
