Research On Information Credibility Oriented Web Text

Posted on: 2012-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Li
Full Text: PDF
GTID: 2218330362951567
Subject: Computer Science and Technology
Abstract/Summary:
When people are unsure about a statement and turn to a search engine for help, the internet offers an excess of candidate answers. Untruthful information and rumors on the web distract people, disturb their judgment, or even mislead them. For this reason, research on evaluating the credibility of information and helping people find the correct answer is significant.

People evaluate the credibility of information by how frequently it appears on the internet and by their confidence in the information source, which this paper calls the credibility of the information source. In this paper, many experiments are carried out to explore this relationship and to build a model for evaluating information credibility. The model evaluates credibility over a feature space with two layers. The first layer consists of features of the information source, which may be a web site or a web page. These features are chosen because they indicate the quality of the information source. The second layer is based on statistics over the large number of results returned by a search engine. The system uses the first feature layer to classify each information source as credible or doubtful, and uses the second feature layer to evaluate the credibility of the information and to grade several alternative answers.

The main research contents and innovations include the following:

(1) At the stage of extracting the candidate set of credible information, the paper proposes a method that removes noise and extracts terms with a structure type filter. The structure type filter uses POS tagging and named entity recognition to find the terms most similar in structure type. The precision of this stage is 91.25%.

(2) At the stage of computing the credibility of information, a two-layer feature space is used, consisting of the features of the information source and the features of the alternative terms. The classification results from the first layer are added to the second layer, and the most credible answer is computed by the information credibility formula.

(3) Based on the algorithms proposed in this paper, we designed and implemented the Information Credibility Evaluation System. The precision of the system is 89.2%.
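The two-layer idea described in point (2) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual features, classifier, or credibility formula, so everything below is an assumption made for illustration: the `SearchHit` record, the single `source_score` standing in for the first-layer features, the 0.5 threshold, and the vote weights are all hypothetical. The sketch only shows how a first-layer source label can be fed into a second-layer, frequency-based score for each alternative answer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SearchHit:
    """One search result (hypothetical record, not from the thesis)."""
    source: str          # web site or web page the snippet came from
    source_score: float  # assumed stand-in for the first-layer features, in [0, 1]
    answer: str          # alternative answer extracted from the snippet


def classify_source(score, threshold=0.5):
    """Layer 1: label a source as credible or doubtful (threshold is an assumption)."""
    return "credible" if score >= threshold else "doubtful"


def rank_answers(hits, credible_weight=1.0, doubtful_weight=0.25):
    """Layer 2: combine answer frequency with the layer-1 labels.

    Each hit votes for its answer; a vote from a credible source counts
    more than a vote from a doubtful one. Scores are normalized to sum to 1.
    This weighting is an illustrative choice, not the thesis's formula.
    """
    totals = defaultdict(float)
    for hit in hits:
        label = classify_source(hit.source_score)
        totals[hit.answer] += credible_weight if label == "credible" else doubtful_weight
    norm = sum(totals.values())
    if norm == 0:
        return []
    scored = [(answer, weight / norm) for answer, weight in totals.items()]
    # Highest credibility first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    hits = [
        SearchHit("a.example.org", 0.9, "1969"),
        SearchHit("b.example.com", 0.8, "1969"),
        SearchHit("c.example.net", 0.2, "1968"),
        SearchHit("d.example.net", 0.1, "1968"),
        SearchHit("e.example.net", 0.3, "1968"),
    ]
    for answer, score in rank_answers(hits):
        print(answer, round(score, 3))
```

In this toy input, "1968" appears more often than "1969", but all of its occurrences come from sources labeled doubtful, so "1969" is ranked first. This is the kind of interaction between frequency and source credibility that the abstract describes, under the assumed weights.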
Keywords/Search Tags: credibility of information, network text, term extraction, credibility computing