
Medical Advertisements Monitoring System Based on Web Content Mining

Posted on: 2012-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: R P Dou
GTID: 2218330368478118
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the boom of the online economy, the large number of Internet users has drawn more and more advertisers into the online advertising market. The number of online advertisements has grown rapidly, and many illegal advertisements have emerged with it. Illegal medical advertisements are the most harmful to society. Collecting and handling this volume of online information manually is impossible, so the relevant information technologies must be strengthened, and an online medical advertisement monitoring system is needed to address the problem.

This dissertation studies Web crawlers, Web information extraction, and Web page classification, and uses these techniques to build an online medical advertisement monitoring system. The resulting system monitors online medical advertisements well. The work covers the following aspects:

1. The principles of Web crawlers are explained in detail. Based on the structure of Web pages, a Web information extraction method built on open source tools is proposed. Experiments show that the method is effective and accurate.

2. The development of Web page classification is introduced, and each step of its workflow is discussed. On this basis, a method is proposed for distinguishing medical data from the rest of the data that the Web crawler acquires. The method makes use of open source tools and improves on a defect of the χ2 algorithm in text categorization. Experiments show that the method is effective.

3. A distributed processing system for monitoring online medical advertisements is designed and implemented. It tracks and handles online medical advertisements automatically, and it offers a friendly interface and strong operability.
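For readers unfamiliar with the χ2 algorithm mentioned in the abstract, the following is a minimal sketch of the standard χ2 term-scoring statistic commonly used for feature selection in text categorization. It is not the improved variant developed in the thesis, and the abstract does not say which defect that variant addresses. The function name `chi_square` and the parameter names `a`, `b`, `c`, `d` are illustrative assumptions.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Standard chi-square score of a term for a category (illustrative sketch).

    The four counts form a 2x2 contingency table over the training documents:
      a: documents in the category that contain the term
      b: documents outside the category that contain the term
      c: documents in the category that do not contain the term
      d: documents outside the category that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A row or column of the table is empty, so the score is undefined;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# A term that appears in every in-category document and in no other document.
print(chi_square(10, 0, 0, 10))
# A term that appears only outside the category receives the same score,
# because squaring (a*d - b*c) discards the direction of the association.
print(chi_square(0, 10, 10, 0))
```

Terms with higher scores are more strongly associated (positively or negatively) with the category, so a typical use is to keep the top-scoring terms as features for the classifier.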
Keywords/Search Tags: Web content mining, Web crawler, Web information extraction, Web page classification