Yang Peng, Xie Lei, Zhang Yanning. Survey on unsupervised spoken term detection for low-resource languages[J]. Journal of Image and Graphics, 2015, 20(2): 211-218. DOI: 10.11834/jig.20150207.
Query-by-example spoken term detection for low-resource languages has recently drawn considerable research interest. For low-resource languages that lack sufficient annotated data and related expert knowledge, spoken term detection techniques based on traditional large-vocabulary speech recognition cannot be applied directly, so researchers have recently sought unsupervised techniques for this task. In this survey, we first present the challenges confronting this task. We then introduce the algorithm framework based on dynamic time warping (DTW) that is commonly used for it. Next, we review recent research on feature representation, template matching, speed-up, and other related topics. Although research on this technique for low-resource languages has made considerable progress, there are as yet no real-life applications; unified feature representation and indexing methods still need to be proposed to attain both effectiveness and efficiency. We also present the commonly used performance evaluation standards. Finally, we summarize the conclusions of our investigation and discuss possible future research directions.
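
As an illustration of the DTW-based matching mentioned above, the following is a minimal sketch of subsequence DTW, in which a spoken query is aligned against every possible span of a longer utterance. It is not the framework of any particular system covered by the survey: the function names `euclidean` and `subsequence_dtw`, the Euclidean frame distance, the three-way step pattern, and the normalization by query length are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Frame-level distance between two feature vectors (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dtw(query, utterance, dist=euclidean):
    """Find the span of `utterance` that best matches `query`.

    Both arguments are sequences of feature vectors (one per frame).
    Returns (normalized_cost, start_frame, end_frame), where the cost is
    the accumulated distance divided by the query length.
    """
    n, m = len(query), len(utterance)
    inf = float("inf")
    # cost[i][j]: minimum accumulated distance of a path that aligns
    # query[0..i] and ends at utterance frame j.
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: utterance frame at which that best path began.
    start = [[0] * m for _ in range(n)]

    # The match may begin at any utterance frame, so the first row is
    # not accumulated along the utterance axis.
    for j in range(m):
        cost[0][j] = dist(query[0], utterance[j])
        start[0][j] = j

    for i in range(1, n):
        cost[i][0] = cost[i - 1][0] + dist(query[i], utterance[0])
        start[i][0] = start[i - 1][0]
        for j in range(1, m):
            # Vertical, horizontal, and diagonal predecessors.
            best_cost, best_start = min(
                (cost[i - 1][j], start[i - 1][j]),
                (cost[i][j - 1], start[i][j - 1]),
                (cost[i - 1][j - 1], start[i - 1][j - 1]),
            )
            cost[i][j] = best_cost + dist(query[i], utterance[j])
            start[i][j] = best_start

    # The match may also end at any utterance frame.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end] / n, start[n - 1][end], end


if __name__ == "__main__":
    # Toy one-dimensional "features"; real systems use multi-dimensional frames.
    query = [[0.0], [1.0], [2.0]]
    utterance = [[5.0], [5.0], [0.0], [1.0], [2.0], [5.0]]
    print(subsequence_dtw(query, utterance))
```

The full cost matrix makes the sketch quadratic in time and memory in the query and utterance lengths, which is one reason speed-up and indexing are treated as separate research topics for this task.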