A Study On Web Information Extraction Algorithm And Agricultural Application

Posted on: 2014-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: W P Li
GTID: 2268330425991327
Subject: Agricultural information technology
Abstract/Summary:
To produce the agricultural products they want, people need accurate agricultural information. As a crucial part of production and daily life, agricultural information plays an increasingly important role, and the rapid development of the Internet provides a convenient platform for delivering it. With the fast development of computer technology and the growing popularity of network applications, the amount of information available has increased rapidly. Whether people can extract useful information from massive amounts of data and make use of it has become key to agricultural decision-making, and searching for related information on the Internet has become a vital means of obtaining it.

This paper studies Web information extraction from the perspective of agricultural information. It conducts in-depth research on an improved information extraction algorithm based on visual characteristics, built on an analysis of the structural characteristics of Web pages, and then applies this improved algorithm to an actual agricultural information Website.

Firstly, this paper briefly describes Web information extraction technology, starting from its historical development at home and abroad. Secondly, it explains the emergence and development of Web information extraction technology, confirms the concept of Web information extraction, and classifies the technology according to different criteria. Thirdly, it analyzes and summarizes the characteristics of Web pages, starting from a description of their structure.
Moreover, this paper discusses the VWIEA algorithm based on visual features in detail, focusing on the extraction of visual blocks, simplification of the DOM tree, removal of Web page noise, and adjustment of the page structure with some relevant parameters. Finally, this paper establishes the algorithm template and designs an experiment on the basis of the algorithm. Using a test set obtained from Bing search results and comparing the improved system with Bing, this paper shows that the improved VWIEA algorithm based on visual characteristics improves both the retrieval accuracy and the F1 value.
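
The F1 value used as an evaluation measure is conventionally the harmonic mean of precision and recall. The following is a minimal sketch under that standard definition; the function name `f1_score` and the counts in the usage note are illustrative assumptions and are not taken from the thesis, which does not state its exact formula in this abstract.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 value: the harmonic mean of precision and recall.

    Assumes the standard definition; the thesis may weight or
    compute the measure differently.
    """
    retrieved = true_positives + false_positives   # items the system returned
    relevant = true_positives + false_negatives    # items that should be returned
    # Guard against division by zero when nothing is retrieved or relevant,
    # and return 0.0 when no returned item is correct.
    if retrieved == 0 or relevant == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / retrieved
    recall = true_positives / relevant
    return 2 * precision * recall / (precision + recall)
```

For example, with hypothetical counts of 8 correctly extracted items, 2 spurious ones, and 2 missed ones, precision and recall are both 0.8, so the F1 value is also 0.8.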
Keywords/Search Tags:agricultural information, visual characteristics, Web page noise, DOM tree, structuring