Voice Keyword Extraction Based on the Improved Eigenvalue

Posted on: 2013-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: X S Ren
Full Text: PDF
GTID: 2248330395952897
Subject: Education Technology
Abstract/Summary:
With the rapid development of Internet technology and media applications, the proportion of audio-visual resources is increasing rapidly. Text-based retrieval cannot meet the demand, and audio-visual processing cannot be applied directly to such large amounts of data, so speech retrieval has become particularly important. The core problem of audio search is keyword extraction, which derives from speech recognition, also called Automatic Speech Recognition (ASR). Speech recognition converts speech into a form that a computer can process. Keyword extraction builds on the recognition output: further analysis of the result yields the words that reflect the content or subject of the audio signal. Because audio keyword extraction is one of the core technologies of audio retrieval and has significant research value, it has become a research hotspot.

In this dissertation, we implement a keyword extraction system based on large-vocabulary speech recognition under speaker-independent and task-independent conditions, using word confusion networks and text-based retrieval technology.

We first introduce the significance of audio keyword extraction, then the relevant technologies used in the system, such as time-domain and frequency-domain features, the acoustic model, and the language model. We then describe in detail speech segmentation, speech recognition, generation of the word confusion network, keyword extraction, and the confidence measure.

In the speech segmentation module, we improve traditional audio features to obtain features with better discrimination. For example, short-time energy is refined into the low short-time energy ratio.
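The low short-time energy ratio mentioned above can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulation, so the code below assumes the definition commonly used in speech/music discrimination: the fraction of frames in a window whose short-time energy falls below a fixed fraction of the window's mean energy. The function names, the non-overlapping framing, and the `factor=0.5` threshold are all illustrative assumptions, not the author's parameters.

```python
def short_time_energy(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (assumed framing)."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def low_short_time_energy_ratio(energies, factor=0.5):
    """Fraction of frames whose energy is below factor * mean window energy.

    factor=0.5 is an assumed, commonly used value, not taken from the thesis.
    """
    if not energies:
        return 0.0
    threshold = factor * sum(energies) / len(energies)
    low_frames = sum(1 for e in energies if e < threshold)
    return low_frames / len(energies)


# Toy signals: a steady tone-like signal and a signal with one burst plus silence.
frame_len = 4
steady = [1.0, -1.0] * 8                        # 4 frames, all with energy 1.0
bursty = [1.0, -1.0, 1.0, -1.0] + [0.0] * 12    # 1 loud frame, 3 silent frames

lster_steady = low_short_time_energy_ratio(short_time_energy(steady, frame_len))
lster_bursty = low_short_time_energy_ratio(short_time_energy(bursty, frame_len))
print(lster_steady, lster_bursty)  # 0.0 0.75
```

Under this assumed definition, signals with many pauses, such as conversational speech, tend to give a higher ratio than signals with steady energy, which is what makes the feature useful for separating conversational segments.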
The improved eigenvalue (feature value) is then used to segment the audio and isolate the conversational parts.

In the keyword extraction module, the word lattice produced by speech recognition is converted into a word confusion network. The keyword is then searched for in the word confusion network, and each search hit is assigned a confidence measure to produce the final keyword result. The experimental results show that speech recognition performs better on segmented audio than on unsegmented audio, and that the detection rate reaches 73%.
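The search step over a word confusion network can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation: the network is modelled as a list of slots, each slot a dictionary from candidate word to posterior probability, and the confidence of a hit is taken to be that posterior. The `search_keyword` function, the `min_confidence` threshold, and the toy network are all hypothetical; the abstract does not specify how the confidence measure is actually computed.

```python
def search_keyword(confusion_network, keyword, min_confidence=0.5):
    """Return (slot_index, posterior) for each slot where the keyword's
    posterior reaches min_confidence.

    Using the slot posterior as the confidence score is an assumption
    made for this sketch.
    """
    hits = []
    for slot_index, slot in enumerate(confusion_network):
        posterior = slot.get(keyword, 0.0)
        if posterior >= min_confidence:
            hits.append((slot_index, posterior))
    return hits


# Toy confusion network: each slot lists competing words with posteriors.
network = [
    {"the": 0.9, "a": 0.1},
    {"speech": 0.6, "beach": 0.3, "<eps>": 0.1},
    {"retrieval": 0.7, "retrieve": 0.3},
    {"speech": 0.2, "peach": 0.8},
]

hits = search_keyword(network, "speech")
print(hits)  # [(1, 0.6)]
```

Lowering `min_confidence` trades precision for recall: with a threshold of 0.1 the low-posterior candidate in the last slot is also returned, which is the kind of hit a confidence measure is meant to filter out.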
Keywords/Search Tags: speech keyword extraction, improved eigenvalue, speech segmentation, word confusion network