Design and Realization of a Web Subject Information Acquisition System

Posted on: 2010-01-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Z A Zhang | Full Text: PDF
GTID: 2208360275983405 | Subject: Software engineering
Abstract/Summary:
The Internet is developing rapidly and has increasingly become a center of information and media. Users everywhere can obtain all kinds of information from it, covering nature, society, politics, history, science and technology, education, health, entertainment, policy decisions, finance, business, and weather forecasts. But how can a user obtain personalized information from the Internet? There are undeniably many tools and methods for searching for information, but they cannot automatically and accurately retrieve the information we want, which causes considerable inconvenience. The system proposed in this thesis addresses this problem. Users can customize the Web resources and information they want, receive regular updates of online news, and have information from the network integrated for them. Together, these functions make access to the appropriate resources simple and fast.

This thesis analyzes the characteristics of web pages and, based on those characteristics, proposes a method for obtaining the required content. A regular expression uses defined writing rules to match text strings and extract the content we need. We use regular expressions to filter Web content and extract the required content for further processing. For information collection, the system can find information on the pages of fixed sites, such as the title and the content, so that users do not have to query page by page to obtain all the information. The system has three parts: customization of web pages, information fetching, and content management. In the first part, we customize the Internet addresses and the regular expression matching rules we need, and store them in the information database in preparation for extracting useful information. In the second part, the system regularly updates to match the latest data; that is, our algorithm applies the rules and stores the valid information in the database.
In the third part, we design an information management system that manages the stored data through add, delete, update, and search operations. We also set up a dedicated page where users can look up their own customized, integrated information.

In this thesis, we use a particular Web site to obtain news and information, demonstrating the advantages and convenience of this method. The method shows good prospects for further development and wide application.
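
The three parts described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the rule format, the table layout, the sample page, and every name here (RULES, extract, open_store, save, search, the example.com address) are assumptions made for illustration. A hard-coded HTML string stands in for a fetched page so that the example runs without network access.

```python
import re
import sqlite3

# Part 1 (customization): a hypothetical rule record pairing a source
# address with regular expressions for the fields to extract.
RULES = {
    "url": "http://news.example.com/index.html",  # placeholder address
    "title": r"<h1[^>]*>(.*?)</h1>",
    "content": r'<div class="content">(.*?)</div>',
}


def extract(html, rules):
    """Part 2 (fetching): apply each field's regex and keep the first match."""
    record = {}
    for field in ("title", "content"):
        match = re.search(rules[field], html, re.DOTALL | re.IGNORECASE)
        record[field] = match.group(1).strip() if match else None
    return record


def open_store():
    """Part 3 (management): an in-memory SQLite table for extracted items."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY, url TEXT, title TEXT, content TEXT)"
    )
    return conn


def save(conn, url, record):
    """Insert one extracted record and return its row id."""
    cur = conn.execute(
        "INSERT INTO items (url, title, content) VALUES (?, ?, ?)",
        (url, record["title"], record["content"]),
    )
    conn.commit()
    return cur.lastrowid


def search(conn, keyword):
    """Return (id, title) pairs whose title contains the keyword."""
    return conn.execute(
        "SELECT id, title FROM items WHERE title LIKE ?",
        (f"%{keyword}%",),
    ).fetchall()


# Stand-in for a downloaded page (assumed markup, not from the thesis).
SAMPLE_HTML = """<html><body><h1>Campus news</h1>
<div class="content">Library hours change next week.</div></body></html>"""

record = extract(SAMPLE_HTML, RULES)
conn = open_store()
row_id = save(conn, RULES["url"], record)
print(record)
print(search(conn, "Campus"))
```

In a real deployment the sample string would be replaced by a periodic download of each customized address, and update and delete operations would be added alongside save and search.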
Keywords/Search Tags: regular matching, information collection, Internet page analysis