
Hidden Markov Model Based Multi-truth Discovery

Posted on: 2020-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: W W Huang
Full Text: PDF
GTID: 2428330596981786
Subject: Management Science and Engineering
Abstract/Summary:
The Internet is an important source of data, so obtaining accurate and usable data from the massive volume available online has become a central research issue. Information integration, question answering systems, and knowledge discovery are all closely tied to Internet information acquisition technology. As society has entered the Internet era, people have left many kinds of data online, covering social activity, shopping, reading, entertainment, and more. Acquiring data appears simple and convenient, but the explosive growth of data has made it harder to obtain information: screening valuable information out of a large amount of data is time-consuming and laborious. At the same time, a large number of data sources exist on the Internet, and the information some of them provide may be wrong, missing, or out of date, so descriptions of the same object are not guaranteed to be consistent, which makes the data harder to use. On the Deep Web, many data providers describe the same object, and these providers may have processed the data to some degree. How to mine the true values from massive data is therefore a problem worth studying. In the era of big data, data is extremely valuable and how to use it is a perennial topic; ensuring the consistency and accuracy of data is a crucial step in using it.

Graph models and truth discovery have both been studied for a long time. Applying a graph model to truth discovery helps to optimize model results and to predict true values more accurately. This paper studies truth discovery based on a graph model. Book information from websites such as Douban and youlu.net is collected with a web crawler, a graph-based truth discovery model is constructed, and experiments verify the validity of the model. The innovations of this paper are summarized as follows:

1. A method for finding the initial truth values in multi-truth discovery is proposed. It is based on the traditional voting method, overcomes the limitations of voting in multi-truth discovery, and effectively improves the accuracy of the subsequent truth discovery process.
2. A graph-based truth discovery model is constructed, a method for calculating the support value between facts is introduced, and a truth discovery algorithm based on the graph model is proposed.
3. The theoretical model is implemented and tested on the book-author dataset, which demonstrates the effectiveness of the proposed method. The comparison experiment also shows that the choice of initial truth has some influence on the calculation of the final true values.

This paper constructs a graph-based truth discovery model on the book-author dataset. Borrowing the idea of transition probabilities from the hidden Markov model, it obtains a trust probability matrix for the data descriptions in conflicting data sources; the convergence value of the reliability of each data description can then be calculated from the transition matrix. A method named CVote for determining the initial truth in the multi-truth discovery algorithm is also proposed. The proposed model is implemented on the book-author dataset and compared with existing models, and the results show that the proposed method is effective. Finally, this paper offers an alternative approach to finding the truth among conflicting data sources.
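The two ideas the abstract names, a voting-based initial truth and reliability scores that converge under a transition matrix, can be illustrated with a minimal sketch. The abstract does not give the CVote rule or the support formula, so everything here is an assumption for illustration only: `threshold_vote` is a plain threshold vote (not CVote), `stationary_scores` is ordinary power iteration on a row-normalised matrix, and the function names, the `ratio` parameter, and the toy inputs are all hypothetical.

```python
from collections import Counter


def threshold_vote(claims, ratio=0.5):
    """Pick initial truths for one object by threshold voting.

    claims: dict mapping a source name to the set of values it claims.
    A value is kept when at least `ratio` of the sources claim it, so
    several values can be accepted at once (the multi-truth setting).
    This is a generic stand-in, not the thesis's CVote method.
    """
    counts = Counter(v for values in claims.values() for v in values)
    needed = ratio * len(claims)
    return {v for v, c in counts.items() if c >= needed}


def stationary_scores(support, tol=1e-10, max_iter=1000):
    """Iterate fact scores under a row-normalised support matrix.

    support: square list of lists of non-negative weights, where
    support[i][j] is an assumed support value of fact i for fact j.
    Rows are normalised into transition probabilities and the score
    vector is multiplied by the matrix until it stops changing.
    """
    n = len(support)
    transition = []
    for row in support:
        total = sum(row)
        # A row with no support at all falls back to a uniform row.
        transition.append(
            [x / total for x in row] if total > 0 else [1.0 / n] * n
        )
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        updated = [
            sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores
```

In this sketch the voting step only supplies a starting set of accepted values; how the thesis feeds that set into the graph model is not stated in the abstract and is not modelled here.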
Keywords/Search Tags: truth discovery, graph model, support degree, multiple truth, credibility