Research and Application of Internet Drug Information Extraction Based on Multidimensional Semantics

Posted on: 2012-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Gu
GTID: 2218330335498580
Subject: Computer software and theory
Abstract/Summary:
In recent years, the Internet medicine market has expanded rapidly with the development of e-commerce, but the hidden dangers that come with it have also become increasingly serious. With the Internet flooded by non-standard or even fake medicine information, regulatory authorities need an advanced method for extracting and monitoring Web medicine information to strengthen supervision of this market. For this purpose, Fudan University and Tsinghua University established a joint program, "System Research and Development on Web Medicine Information Management and Intelligent Monitoring Method", which investigated related methods in depth and achieved outstanding results.

Traditional Web information extraction methods usually require heavy manual intervention and lack adaptability and flexibility, which makes it hard to recognize new information sources. After reviewing related work, this thesis proposes a Web medicine information extraction method based on multi-dimensional semantics. The method describes the semantics relevant to Web medicine information extraction along multiple dimensions, so it can overcome the heterogeneity of different websites while retaining the characteristics that their content and structure have in common. It also uses structural semantic entropy to precisely recognize and locate medicine information in web pages. The design and implementation of the multi-dimensional semantic dictionary and of the whole information extraction system are discussed in detail, and the method is verified by experiments.

Results show that the method greatly reduces manual intervention, yields high precision and recall, and automatically identifies medicine information on unknown websites with high flexibility and adaptability. Applying the proposed method can give regulatory authorities the information support they need for accurate, comprehensive, real-time, and automatic medicine information monitoring and for an intelligent monitoring mechanism, which is significant for market regulation and medicine safety.
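
The abstract does not define structural semantic entropy or the layout of the semantic dictionary, so the following is only a rough sketch of the general idea, not the thesis's method. It assumes a hypothetical dictionary that maps terms to semantic dimensions, treats each candidate page region as a list of text strings keyed by an XPath-like path, and uses plain Shannon entropy over the matched dimensions as a stand-in score. All names, terms, and sample data here are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical dictionary: term -> semantic dimension (not taken from the thesis).
SEMANTIC_DICTIONARY = {
    "name": "product",
    "brand": "product",
    "dosage": "usage",
    "indications": "usage",
    "manufacturer": "producer",
    "approval number": "regulatory",
    "price": "commerce",
}


def dimension_counts(texts):
    """Count how often each semantic dimension is matched in a block's texts."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for term, dimension in SEMANTIC_DICTIONARY.items():
            if term in lowered:
                counts[dimension] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the dimension distribution; 0.0 if nothing matched."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_block(blocks):
    """Return the path of the block whose matches spread over the most dimensions.

    Assumption: a region that describes a medicine touches many dimensions
    evenly (high entropy), while navigation or advertising regions touch few.
    """
    return max(blocks, key=lambda path: shannon_entropy(dimension_counts(blocks[path])))


# Made-up sample page regions, keyed by XPath-like strings.
blocks = {
    "/html/body/div[1]": ["Home", "Contact", "Price list"],
    "/html/body/div[2]/table": [
        "Name: Aspirin",
        "Dosage: 100 mg",
        "Manufacturer: Example Pharma",
        "Approval number: (omitted)",
    ],
}

selected = best_block(blocks)
```

In this toy input the navigation region matches a single dimension, while the table matches four different ones, so the table's path is selected. A real system would presumably also weigh the DOM structure itself, which this sketch ignores.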
Keywords/Search Tags: Multi-dimensional Semantic Dictionary, Structural Semantic Entropy, Web Information Extraction, XPath