Research On Web Information Selection Based On Credibility And Semantic Similarity

Posted on:2017-01-01

Degree:Master

Type:Thesis

Country:China

Candidate:X C Zheng

Full Text:PDF

GTID:2308330488961134

Subject:Library and Information Science

Abstract/Summary:

With the rapid development of Internet technology, the Internet has become a huge, global information service center and the primary source from which people obtain information and knowledge. However, because the Internet is open and unbounded, the quality of the information on it is uneven, and much of it is false, incorrect, or useless. Faced with this vast amount of low-quality information, people usually rely on the major search engines to find what they need. However, mainstream search engines are commercial tools, and their results do not fully satisfy users: on the one hand, they cannot guarantee that the pages at the top of the results are of reliable quality; on the other hand, the results may contain a large number of duplicate and reposted pages. This greatly reduces the efficiency with which users obtain information and wastes the time and effort they spend filtering it. Therefore, this paper proposes a web information selection method based on credibility and semantic similarity, which aims to reduce the burden on people seeking high-quality, highly reliable information on the Internet and to improve the efficiency of web page information selection. First, on the basis of a comprehensive investigation and systematic analysis of existing related research at home and abroad, the paper summarizes the relevant theoretical results and technical methods. Second, it focuses on constructing a Web information credibility evaluation system, which is divided into three levels: authority of sources, significance of content, and web page relevance. Each level is given more specific evaluation indexes, the weight of each index is determined through the expert scoring method and the analytic hierarchy process, and a formula for calculating web page information credibility is given.
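As a rough illustration of how index weights can be derived with the analytic hierarchy process and then combined into a credibility score, the following is a minimal sketch, not the thesis's actual formula. The pairwise comparison values, the per-level page scores, and the function names `ahp_weights` and `credibility` are all hypothetical. The weights are approximated with the row geometric-mean method, and the credibility score is assumed to be a simple weighted sum.

```python
import math

# Saaty's random consistency index for matrix orders 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    # The geometric mean of each row approximates the principal eigenvector.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    weights = [g / total for g in geo_means]
    # Estimate lambda_max by averaging (A w)_i / w_i over all rows.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


def credibility(scores, weights):
    """Assumed form: weighted sum of per-level scores in [0, 1]."""
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical expert judgments comparing the three levels named in the
# abstract: source authority, content significance, page relevance.
PAIRWISE = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights, cr = ahp_weights(PAIRWISE)
page_scores = [0.9, 0.6, 0.8]  # hypothetical scores for one web page
score = credibility(page_scores, weights)
print(weights, cr, score)
```

A consistency ratio below 0.1 is the usual threshold for accepting a set of expert judgments; above that, the pairwise comparisons would normally be revised before the weights are used.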
Third, on the basis of a detailed analysis of the content and structure of web pages, the paper analyzes a method for extracting web page text based on the DOM tree structure and the process of implementing it. It then applies the LDA topic model to web page semantic similarity calculation, proposes a method for calculating web page semantic similarity based on the LDA topic model, and analyzes its implementation process in detail. Finally, the paper designs and implements a web information selection system based on credibility and semantic similarity. The function of each module is analyzed in detail, and the validity and practicability of the proposed method are verified through experiments and analysis of the results.
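The abstract does not state which measure is used to compare the LDA topic distributions of two pages. As one plausible sketch, the snippet below assumes each page has already been reduced to a topic distribution by an LDA model and compares two distributions with Jensen-Shannon divergence. The function names, the example distributions, and the 0.95 duplicate threshold are all hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_similarity(p, q):
    """1 - Jensen-Shannon divergence (base 2), so the result lies in [0, 1]."""
    # The midpoint m is positive wherever p or q is, so no division by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return 1.0 - jsd


def is_near_duplicate(p, q, threshold=0.95):
    """Flag two pages as near-duplicates; the threshold is an assumed value."""
    return js_similarity(p, q) >= threshold


# Hypothetical topic distributions over three topics for three pages.
page_a = [0.70, 0.20, 0.10]
page_b = [0.68, 0.22, 0.10]  # close to page_a, e.g. a reposted page
page_c = [0.05, 0.15, 0.80]  # a page on a different topic

print(js_similarity(page_a, page_b), js_similarity(page_a, page_c))
```

With a measure like this, pages whose similarity exceeds the threshold could be grouped, and only the most credible page in each group kept.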

Keywords/Search Tags:

Web information credibility, information selection, semantic similarity, DOM, LDA topic model

Related items

1	Mongolian Short Text Semantic Similarity Calculation Based On Deep VAE Integrated With Topic Information
2	Research On Short Text Topic Information Mining Technology
3	Research On Topic Modeling Method Based On Semantic Distribution Similarity
4	Book Topic Selection Information Collection Model Research In The Era Of Big Data
5	Research On The QoS Service Selection Based On The Historical Choice Information
6	Research On Semantic Textual Similarity Model Based On Conceptual Information Content
7	Web Information Credibility Research Based On Information Fusion
8	Application And Research Of Topic Model In Gene Semantic Similarity Calculation
9	Research On Semantic Representation Of Text Based On Topic Model
10	Research On BBS Topic Detection And Tracking