In recent years, speech recognition systems have become widely used in mobile devices. As a prerequisite step in speech processing, voice activity detection (VAD) distinguishes speech intervals from non-speech intervals in a streaming digital signal. Based on the detection result, a speech system can discard the non-speech data, which benefits both performance and energy consumption. Embedded devices in particular require a VAD solution that has low complexity but remains effective.

First, a pre-processing step is provided to prepare for feature extraction, comprising framing, pre-emphasis, and speech enhancement. Then a low-power solution is proposed. Its feature is selected from classic time-domain features and more recent popular features for its ability to discriminate speech, and its decision strategy is optimized on the basis of threshold comparison. Experimental results show that it works well in high-SNR (≥10 dB) environments.

Next, to deal with low-SNR cases, an algorithm is proposed that combines an MFCC-GMM model classifier with a time-domain feature in a novel way. It performs robustly in common noise environments at low signal-to-noise ratio, and in the troublesome babble-noise case it has a lower false-alarm rate than the other VAD solutions.

Finally, a sample application named “wake on voice” is implemented on an Android device to demonstrate the practicality and adaptability of the proposed solutions; the APIs are packaged and reusable in any mobile speech system...
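
The pre-processing and threshold-comparison steps summarized above can be illustrated with a minimal sketch. It is not the feature or decision strategy proposed in this work. It assumes short-time energy as the time-domain feature, a pre-emphasis coefficient of 0.97, frames of 400 samples with a hop of 160 samples (25 ms and 10 ms at 16 kHz), and a noise floor estimated from the first few frames. All function names and parameter values are hypothetical, and speech enhancement is omitted.

```python
def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def energy_vad(signal, frame_len=400, hop=160, noise_frames=5, ratio=4.0):
    """Return one boolean per frame: True where energy exceeds ratio * noise floor."""
    frames = split_frames(pre_emphasis(signal), frame_len, hop)
    energies = [short_time_energy(f) for f in frames]
    if not energies:
        return []
    # Assumption: the first `noise_frames` frames contain background noise only.
    head = energies[:noise_frames]
    noise_floor = sum(head) / len(head)
    # Guard against an all-zero lead-in, which would make the threshold zero.
    threshold = ratio * max(noise_floor, 1e-12)
    return [e > threshold for e in energies]
```

Calling `energy_vad(samples)` on a list of floating-point samples returns a list of per-frame speech flags. A fixed energy threshold of this kind is only expected to be usable when the speech is well above the noise, which is consistent with the high-SNR scope stated for the low-power solution.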