In recent years, drug-safety emergencies have occurred from time to time, and these events have exposed a number of problems in drug supervision. With rapid economic growth, the number of pharmaceutical enterprises keeps increasing. Traditional supervision methods can no longer meet the requirements of dynamic monitoring and active supervision, so we designed and developed the Drug Distribution Monitor System to provide strong support for the work of the Drug Administration. This report covers the following aspects. First, it studies the characteristics of the sources of problem-drug information and the structure of their content. Second, it studies the Web crawler crawling strategy best suited to this system, link selection based on pattern matching, and introduces the multi-pattern matching algorithm Wu_Manber94. Third, it discusses a strategy for extracting information from Web tables based on DOM parsing, covering table location, data table discovery, and data extraction. Finally, building on the overall and detailed design of the system, the link selection strategy based on pattern matching and the DOM-based Web table information extraction strategy were successfully applied in the implementation of the system.
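As a rough illustration of the two strategies named above, the following minimal Python sketch selects links whose URL or anchor text matches a keyword pattern and then collects the rows of HTML tables. It is not the system's actual implementation. The keyword list and the names `matches_any`, `LinkSelector`, `TableExtractor`, `select_links`, and `extract_rows` are hypothetical, and a naive substring check stands in for the Wu_Manber94 algorithm, which would do the multi-pattern matching more efficiently.

```python
from html.parser import HTMLParser

# Hypothetical keyword patterns; the report does not list the real ones.
PATTERNS = ["recall", "substandard", "counterfeit"]


def matches_any(text, patterns=PATTERNS):
    """Naive multi-pattern check; a stand-in for Wu_Manber94, not an implementation of it."""
    lowered = text.lower()
    return any(p in lowered for p in patterns)


class LinkSelector(HTMLParser):
    """Collects hrefs whose URL or anchor text matches a pattern."""

    def __init__(self):
        super().__init__()
        self.selected = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if matches_any(self._href + " " + "".join(self._text)):
                self.selected.append(self._href)
            self._href = None


class TableExtractor(HTMLParser):
    """Collects the cell text of every table row as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def select_links(html):
    """Return the hrefs of links worth crawling next."""
    parser = LinkSelector()
    parser.feed(html)
    parser.close()
    return parser.selected


def extract_rows(html):
    """Return every table row found in the page as a list of cell strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows
```

In this sketch a crawler would call `select_links` on each fetched page to decide which links to follow, then call `extract_rows` on the selected pages. Telling real data tables apart from layout tables, which the report treats as a separate step, is not attempted here.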