Research On Topic Detection And Tracking Model Based On Belief Network

Posted on: 2016-04-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S F Wu
Full Text: PDF
GTID: 1228330479478357
Subject: Management Science and Engineering
Abstract/Summary:
The successful application of the vector space model (VSM) in Topic Detection and Tracking (TDT) indicates, in theory, that Bayesian network retrieval models can also be applied to TDT. The belief network model is one such retrieval model. This study uses the belief network model to model topics, offering a new research method for TDT.
Feature selection is the basis of topic modeling, and mutual information is an effective feature selection method in text processing. Starting from basic mutual information, we cluster stories that share high-frequency terms and compute each term's mutual information after clustering. Moreover, the gap between the occurrence time of a newly tracked related story and that of the topic is quantified as a time distance, which inversely influences the dynamic updating of mutual information. The result is a clustering-based dynamic mutual information method, which is used to compute the term weights of news stories. To determine the feature subset size for the original topics of TDT4, we give an objective function based on minimizing the within-class distance and maximizing the between-class distance, and solve it by coordinate descent.
Based on the belief network model and the characteristics of news stories, four topic models are given: BSTM-I, BSTM-II, BDTM-I and BDTM-II. BSTM-I contains three categories of nodes: news story nodes, term nodes, and topic nodes; arcs reflect the affiliation between nodes. Building on BSTM-I, BSTM-II adds event nodes and adjusts term weights twice to reflect the importance of terms in seminal stories and seminal events. Its nodes and arcs have the same meaning as in BSTM-I. BDTM-I is a dynamic topic model whose node types and arcs are the same as those of BSTM-I. Unlike BSTM-I, the term level of BDTM-I is updated dynamically during topic tracking: the weight of a re-appearing term is updated by sum-average, and newly appearing terms are inserted into the term level.
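The BDTM-I term-level update described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not define "sum-average", so the sketch takes it to mean the arithmetic mean of the stored weight and the weight observed in the newly tracked story, and the function name `update_term_level` and the dictionary representation are hypothetical.

```python
def update_term_level(term_weights, story_weights):
    """Merge the term weights of one newly tracked story into a topic's term level.

    Assumption (not stated in the abstract): "sum-average" is read as the
    arithmetic mean of the existing weight and the new story's weight.
    """
    updated = dict(term_weights)  # copy so the original term level is untouched
    for term, weight in story_weights.items():
        if term in updated:
            # Re-appearing term: average the old and the new weight.
            updated[term] = (updated[term] + weight) / 2.0
        else:
            # Newly appearing term: insert it into the term level.
            updated[term] = weight
    return updated


if __name__ == "__main__":
    topic = {"earthquake": 0.8, "rescue": 0.4}
    story = {"rescue": 0.6, "aftershock": 0.5}
    print(update_term_level(topic, story))
```

Returning a new dictionary rather than mutating the input keeps the pre-update term level available, which may be convenient if an update later has to be undone.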
All three of the above models use the traditional modeling method and share the advantages and disadvantages of previous models. BDTM-II departs from the traditional modeling approach: exploiting the flexible framework that the belief network model provides, it contains two types of term nodes, original seminal term nodes and updating term nodes, and merges these two kinds of evidence nodes with a disjunctive method. According to the topology of each model, the probability inferences of BSTM-I, BSTM-II, BDTM-I and BDTM-II are given to decide whether a new story is related to a topic.
By analyzing the causes of false alarms in dynamic topic tracking, we propose error detection for dynamic topic tracking. We first analyze how time distance, the difference relationship, the distribution relationship, and the similarity between a newly tracked story and the seminal contents of a topic influence error detection, and then derive a computation method that decides whether a newly tracked related story is a false alarm. If the story is a false alarm, the weights of some features decay, and we decide whether the structure of the model needs to be adjusted. Experiments on the TDT4 corpus, evaluated with DET curves, verify the soundness and effectiveness of the above work.
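As a rough illustration of merging two evidence nodes disjunctively, the sketch below combines two probabilities with the standard probabilistic OR, 1 - (1 - p1)(1 - p2). This formula is an assumption: the abstract only names a "disjunctive method" and does not give BDTM-II's actual inference equations. The function names and the decision threshold of 0.5 are likewise hypothetical.

```python
def disjunctive_merge(p_seminal, p_updating):
    """Combine the evidence from two term-node groups with a probabilistic OR.

    Assumption: the two inputs are probabilities in [0, 1] obtained separately
    from the original seminal term nodes and from the updating term nodes.
    """
    for p in (p_seminal, p_updating):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # The story is unsupported only if both evidence sources fail to support it.
    return 1.0 - (1.0 - p_seminal) * (1.0 - p_updating)


def is_related(p_seminal, p_updating, threshold=0.5):
    """Decide relatedness by comparing the merged probability to a threshold."""
    return disjunctive_merge(p_seminal, p_updating) >= threshold


if __name__ == "__main__":
    print(disjunctive_merge(0.5, 0.5))
    print(is_related(0.2, 0.2))
```

With this form, strong evidence from either node group is enough to mark a story as related, which is one plausible reading of why a disjunctive merge suits a model that keeps seminal and updated terms apart.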
Keywords: Topic Detection and Tracking, Belief Network, Topic Model, Error Detection, Feature Selection, Mutual Information