
Research On Confidence Measure For Chinese Spoken Term Detection

Posted on: 2015-06-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Y Li
Full Text: PDF
GTID: 1228330422492447
Subject: Computer Science and Technology
Abstract/Summary:
Speech is one of the most convenient and natural ways of exchanging information, and it is used throughout human society. The volume of speech data in telecommunications and on the Internet is growing rapidly, so speech analysis and processing are widely applied to extract useful information from it. As a key technique of speech information retrieval, spoken term detection (STD) aims to locate all occurrences of user-queried terms in large audio archives. STD has broad application prospects in speech analysis, information retrieval, data mining, information security, and related fields.

Recently, STD has become a hot research topic and has attracted extensive, in-depth study, with many achievements. However, errors in the detection results are inevitable, and they reduce the performance of an STD system. From the standpoint of result credibility, these errors arise because the system assigns low confidence measures to correct detections and high ones to false alarms. An effective confidence measure is therefore important for STD. Current methods, however, use training criteria that are inconsistent with the evaluation criteria, and they make little effective use of high-level linguistic information. Meanwhile, for the detection of out-of-vocabulary terms, the low recall rate remains a problem to be solved, and no effective confidence measure exists. To address these problems, we study confidence measures for STD and propose several novel methods covering two aspects: the detection of in-vocabulary (INV) terms and of out-of-vocabulary (OOV) terms. The main contributions are as follows:

(1) To address the inconsistency between the training and evaluation criteria, we propose a confidence measure based on the area under the receiver operating characteristic (ROC) curve (AUC).
This method uses AUC as the objective function, with the acoustic features of the hypotheses as input. First, we propose a weighted phone-level confidence measure for in-vocabulary term detection. Furthermore, we propose a syllable confidence feature vector based on the structural characteristics of Chinese syllables. Based on this feature vector, we also propose a weighted syllable-level confidence measure trained by AUC maximization. For STD, AUC maximization is more practical than minimum classification error for training the weighting factors, since AUC is closer to the evaluation criteria.

(2) To exploit high-level linguistic information, we propose a confidence measure based on context consistency. The method takes into account the uncertainty of the context and the effect of topic when computing the context consistency, which serves as the confidence measure. The uncertainty of the context is estimated from the occurrence probabilities of the context words, which are computed by combining the overlapping hypotheses in the lattice. To handle the effect of topic, we propose a topic adaptation method, which first classifies the spoken document by topic and then computes the context consistency of the hypothesized word using a topic-specific measure of semantic similarity. Because both the uncertainty of the context and topic adaptation are taken into account, the context consistency is more accurate. Experimental results show that both are effective for the confidence measure.

(3) To solve the problem of low recall rate, we propose a method for detecting OOV terms. In this method, syllable sequences with pronunciations similar to the original query term are also treated as query terms for searching.
During term expansion, we estimate the confusion between the original query term and each expanded query term with the Kullback-Leibler (KL) divergence. We propose a novel method that approximates the KL divergence from its upper and lower bounds, which makes the approximation more accurate. Thus, the confusion measure can handle syllable insertions and deletions, and it can compare syllable sequences of different lengths. We then construct a tree index of syllable n-grams to speed up the retrieval of the expanded terms. Experimental results show that term expansion raises the recall rate of OOV term detection. The proposed confidence measure can improve the performance of OOV term detection.

(4) To compute confidence measures for OOV terms, we propose a confidence measure based on the correlation between hypothesis segments, making full use of the relationships among them. The proposed method re-estimates the confidence measure after the hypotheses have been located. First, we propose a method based on forced alignment to obtain accurate hypothesis boundaries. Then, we propose a likelihood-ratio confidence measure that provides the initial confidence for re-estimation. Moreover, we propose two re-estimation methods, one based on feedback and one based on a random walk. The former collects pseudo-relevant and irrelevant hypotheses and re-estimates the confidence measure by feedback. The latter builds a random walk model from the correlations between hypotheses to re-estimate the confidence measure. Experimental results show that the proposed confidence measures help improve OOV term detection.
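As an illustration of contribution (1), the sketch below shows how AUC can serve as a selection criterion for weighting factors. It is not the dissertation's training procedure. It assumes two per-hypothesis confidence features combined as (w, 1 - w), uses the standard pairwise (Wilcoxon-Mann-Whitney) estimate of AUC, and replaces the actual optimization with a coarse grid search. All function names and the toy feature values are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_confidence(unit_scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-unit confidences (e.g. phone- or syllable-level) into one score."""
    return sum(w * s for w, s in zip(weights, unit_scores))


def auc(correct: Sequence[float], false_alarms: Sequence[float]) -> float:
    """Pairwise AUC estimate: the fraction of (correct, false-alarm) pairs in
    which the correct detection scores higher; ties count as half a win."""
    if not correct or not false_alarms:
        raise ValueError("need at least one correct detection and one false alarm")
    wins = 0.0
    for p in correct:
        for n in false_alarms:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(correct) * len(false_alarms))


def best_weight_by_auc(
    correct_feats: Sequence[Sequence[float]],
    false_alarm_feats: Sequence[Sequence[float]],
    steps: int = 10,
) -> Tuple[float, float]:
    """Grid-search w in [0, 1] for weights (w, 1 - w); return (best AUC, best w).

    Assumption: a grid search stands in for whatever optimizer is really used.
    """
    best_auc, best_w = -1.0, 0.0
    for i in range(steps + 1):
        w = i / steps
        weights = (w, 1.0 - w)
        pos = [weighted_confidence(f, weights) for f in correct_feats]
        neg = [weighted_confidence(f, weights) for f in false_alarm_feats]
        score = auc(pos, neg)
        if score > best_auc:
            best_auc, best_w = score, w
    return best_auc, best_w


# Toy, made-up feature vectors (two confidence features per hypothesis).
CORRECT = [(0.9, 0.2), (0.8, 0.4)]
FALSE_ALARMS = [(0.3, 0.9), (0.2, 0.6)]

best_auc, best_w = best_weight_by_auc(CORRECT, FALSE_ALARMS)
print(f"best AUC = {best_auc}, first weight = {best_w}")
```

Because the pairwise statistic depends only on how correct detections rank against false alarms, it tracks ranking-style evaluation more closely than a per-hypothesis classification error does, which is the motivation the abstract gives for preferring it.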
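The random-walk re-estimation in contribution (4) can be pictured with a generic score-propagation sketch. The abstract does not give the model's details, so everything here is an assumption: the pairwise similarity matrix between hypotheses, the interpolation weight `alpha`, the fixed iteration count, and the function name `random_walk_rescore` are all hypothetical.

```python
from typing import List, Sequence


def random_walk_rescore(
    initial: Sequence[float],
    similarity: Sequence[Sequence[float]],
    alpha: float = 0.5,
    iterations: int = 50,
) -> List[float]:
    """Re-estimate confidence scores by propagating them over a similarity graph.

    Each step mixes a hypothesis's initial score with the similarity-weighted
    average of its neighbours' current scores:
        s_i <- (1 - alpha) * initial_i + alpha * sum_j P[i][j] * s_j
    where P is the row-normalised similarity matrix (assumed, not from the source).
    """
    n = len(initial)
    transitions: List[List[float]] = []
    for i in range(n):
        # Ignore self-similarity so a hypothesis is influenced only by others.
        row = [similarity[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        if total > 0:
            transitions.append([v / total for v in row])
        else:
            # Isolated hypothesis: a self-loop keeps its score at the initial value.
            transitions.append([1.0 if j == i else 0.0 for j in range(n)])

    scores = list(initial)
    for _ in range(iterations):
        scores = [
            (1 - alpha) * initial[i]
            + alpha * sum(transitions[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores


# Toy example with made-up values: hypothesis 2 starts with a low score but is
# similar to two high-scoring hypotheses, so its score is pulled upwards.
initial_scores = [0.9, 0.8, 0.2]
similarity_matrix = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.8],
    [0.8, 0.8, 1.0],
]
rescored = random_walk_rescore(initial_scores, similarity_matrix)
print(rescored)
```

With `alpha` below 1 the update is a contraction, so the scores settle towards a fixed point. Every rescored value stays between the smallest and largest initial score because each update is a convex combination.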
Keywords/Search Tags: Spoken term detection, confidence measure, out-of-vocabulary term, semantic similarity, term expansion, random walk